Add a new flag LLAMA_STATE_SEQ_FLAGS_APPEND (value 2) that skips the seq_rm() call in state_read_meta(), allowing incremental chunk-by-chunk restore to the same sequence via repeated state_seq_set_data_ext calls. This enables external KV cache systems to restore opaque state blobs one chunk at a time without each chunk clearing the previous one.

- Add #define LLAMA_STATE_SEQ_FLAGS_APPEND 2 in llama.h
- Thread the flags parameter through state_read() to state_read_meta()
- Gate seq_rm() on !(flags & LLAMA_STATE_SEQ_FLAGS_APPEND)
- Default behavior (flags=0) is unchanged
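The effect of gating seq_rm() on the flag can be illustrated with a small toy model. This is a sketch under stated assumptions, not the llama.cpp implementation: ToyKvCache, restore_chunks, self_check and TOY_STATE_SEQ_FLAGS_APPEND are hypothetical names invented for illustration, and a chunk is modeled as a plain list of cell positions. Only the flag value (2) and the gating condition are taken from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-ins, NOT llama.cpp code. Only the value 2 and the gating
// condition mirror the commit description; all names are hypothetical.
constexpr uint32_t TOY_STATE_SEQ_FLAGS_APPEND = 2;

struct ToyKvCache {
    std::vector<int32_t> cells; // positions currently held for one sequence

    // Stand-in for seq_rm(): drop everything stored for the sequence.
    void seq_rm() { cells.clear(); }

    // Stand-in for state_read_meta(): clear the sequence first unless the
    // APPEND bit is set, then add the cells carried by this chunk.
    void state_read_meta(const std::vector<int32_t> & chunk, uint32_t flags) {
        if (!(flags & TOY_STATE_SEQ_FLAGS_APPEND)) {
            seq_rm();
        }
        cells.insert(cells.end(), chunk.begin(), chunk.end());
    }
};

// Restore a sequence chunk by chunk, passing the same flags for every chunk,
// and report how many cells remain afterwards.
size_t restore_chunks(const std::vector<std::vector<int32_t>> & chunks, uint32_t flags) {
    ToyKvCache cache;
    for (const auto & chunk : chunks) {
        cache.state_read_meta(chunk, flags);
    }
    return cache.cells.size();
}

// Small self-check contrasting the default and APPEND behaviors.
void self_check() {
    const std::vector<std::vector<int32_t>> chunks = {{0, 1}, {2, 3}, {4}};
    // flags = 0: every chunk clears the previous one, only the last survives.
    assert(restore_chunks(chunks, 0) == 1);
    // APPEND: no chunk clears, so all five cells accumulate.
    assert(restore_chunks(chunks, TOY_STATE_SEQ_FLAGS_APPEND) == 5);
}
```

In this model, a caller that wants a clean start followed by incremental restore could pass flags=0 for the first chunk and the append flag for the rest; whether that matches the intended calling pattern in llama.cpp is an assumption.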