[AI] android: fix infinite generation in shift_context()

When shift_context() discards tokens to free KV cache space, it decrements
current_position but not stop_generation_position. This causes the
termination check (current_position >= stop_generation_position) to never
trigger, resulting in infinite text generation.

Fix by also decrementing stop_generation_position by n_discard tokens.
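The counter interaction can be reproduced outside the app. Below is a minimal, self-contained sketch (assumed constants and an illustrative n_discard heuristic, not the actual JNI code): with the stop_generation_position adjustment in place the loop terminates after exactly the requested number of tokens, and commenting that line out reproduces the hang.

```cpp
// Minimal sketch of the counter interaction, not the actual Android JNI code.
// The constants and the n_discard heuristic are illustrative assumptions.
#include <cstdio>

static int current_position         = 0;   // next write position in the KV cache
static int stop_generation_position = 0;   // absolute position at which to stop
static int system_prompt_position   = 32;  // tokens that are never discarded
static int context_size             = 256; // simulated KV cache capacity

static void shift_context() {
    // Discard half of the non-system tokens to make room.
    int n_discard = (current_position - system_prompt_position) / 2;
    // Every surviving token moves back by n_discard positions ...
    current_position -= n_discard;
    // ... so the stop target must move back by the same amount; without this
    // line current_position can never reach it and generation runs forever.
    stop_generation_position -= n_discard;
}

int main() {
    stop_generation_position = current_position + 400; // request 400 new tokens

    while (current_position < stop_generation_position) {
        if (current_position >= context_size) {
            shift_context();
        }
        ++current_position; // stand-in for decoding one generated token
    }
    std::printf("stopped after reaching position %d\n", current_position);
    return 0;
}
```

Without the adjustment, current_position keeps being pulled back below context_size while the stop target stays fixed, so the loop condition never becomes false.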

Fixes #18409
Samaresh Kumar Singh 2025-12-28 12:14:46 -06:00
parent 9c675c7140
commit 7e9bea7f1c
1 changed file with 1 addition and 0 deletions


@@ -283,6 +283,7 @@ static void shift_context() {
 llama_memory_seq_rm(llama_get_memory(g_context), 0, system_prompt_position, system_prompt_position + n_discard);
 llama_memory_seq_add(llama_get_memory(g_context), 0, system_prompt_position + n_discard, current_position, -n_discard);
 current_position -= n_discard;
+stop_generation_position -= n_discard;
 LOGi("%s: Context shifting done! Current position: %d", __func__, current_position);
 }