This includes equal-sequence-length batch splits which are useful to simplify recurrent model operators.

* llama : always make recurrent state slots contiguous
* ggml : simplify mamba operators

| Name |
|---|
| cmake |
| include |
| src |
| CMakeLists.txt |
| ggml_vk_generate_shaders.py |