Maintain stop_generation_position correctly when the context is shifted. Also add an attention_sink variable to maintain attention, and move the function so that it comes after the definition of stop_generation_position. Fixes #18409.

| Name |
|---|
| androidTest/java/android/llama/cpp |
| main |
| test/java/android/llama/cpp |