* kv-cache : pad the size of the small SWA cache for performance
* context : pad the total context to 256
* cont : future-proof the swa pad
* server : adjust test params to new logic

| Name |
|---|
| llama-cpp.h |
| llama.h |