When shift_context() discards tokens to free KV cache space, it decrements current_position but not stop_generation_position. This causes the termination check (current_position >= stop_generation_position) to never trigger, resulting in infinite text generation.

Fix by also decrementing stop_generation_position by n_discard tokens.

Fixes #18409
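
The effect described in the commit message can be reproduced with a small stand-alone model. This is a sketch under stated assumptions, not the project's actual code: `GenerationState`, `simulate_generation`, the `fix_stop_position` flag, and the shift trigger (`current_position >= context_size`) are hypothetical and exist only for illustration. Only `shift_context`, `current_position`, `stop_generation_position`, and `n_discard` are names taken from the commit message, and the real `shift_context()` also edits the KV cache, which is omitted here.

```cpp
#include <cassert>

// Hypothetical stand-in for the generation state. Only the two counters
// named in the commit message are modelled.
struct GenerationState {
    int current_position;
    int stop_generation_position;
};

// Simplified model of the shift: discards n_discard tokens.
// With fix_stop_position == false it reproduces the reported bug, because
// only current_position moves back. With true, both counters move together.
void shift_context(GenerationState &s, int n_discard, bool fix_stop_position) {
    assert(n_discard >= 0 && n_discard <= s.current_position);
    s.current_position -= n_discard;
    if (fix_stop_position) {
        s.stop_generation_position -= n_discard;
    }
}

// Simulates token-by-token generation (assumed loop shape).
// Returns the number of tokens generated before the termination check
// fires, or -1 if the check has not fired within step_limit iterations.
int simulate_generation(int context_size, int n_discard, int start_position,
                        int max_new_tokens, bool fix_stop_position,
                        int step_limit) {
    GenerationState s{start_position, start_position + max_new_tokens};
    int generated = 0;
    for (int step = 0; step < step_limit; ++step) {
        // Termination check quoted in the commit message.
        if (s.current_position >= s.stop_generation_position) {
            return generated;
        }
        // Assumed trigger: shift once the context window is full.
        if (s.current_position >= context_size) {
            shift_context(s, n_discard, fix_stop_position);
        }
        ++s.current_position;  // one token generated
        ++generated;
    }
    return -1;  // termination check never fired
}
```

With a context of 8 tokens, `n_discard` of 4, a start position of 6, and a budget of 10 new tokens, `simulate_generation(8, 4, 6, 10, true, 1000)` stops after exactly 10 tokens. The same call with `false` never reaches the stop position, because `current_position` keeps being pulled back below a target that stays at 16.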
| Name |
|---|
| app |
| gradle |
| lib |
| .gitignore |
| build.gradle.kts |
| gradle.properties |
| gradlew |
| settings.gradle.kts |