[AI] android: fix infinite generation in shift_context()
When shift_context() discards tokens to free KV cache space, it decrements current_position but not stop_generation_position. This causes the termination check (current_position >= stop_generation_position) to never trigger, resulting in infinite text generation. Fix by also decrementing stop_generation_position by n_discard tokens.

Fixes #18409
This commit is contained in:
parent 9c675c7140
commit 7e9bea7f1c
@@ -283,6 +283,7 @@ static void shift_context() {
     llama_memory_seq_rm(llama_get_memory(g_context), 0, system_prompt_position, system_prompt_position + n_discard);
     llama_memory_seq_add(llama_get_memory(g_context), 0, system_prompt_position + n_discard, current_position, -n_discard);
     current_position -= n_discard;
+    stop_generation_position -= n_discard;
     LOGi("%s: Context shifting done! Current position: %d", __func__, current_position);
 }