llama.cpp/.github

Latest commit b1f3a6e5db by Johannes Gäßler:
llama: automatically set parameters not set by the user in such a way that maximizes GPU utilization (#16653)
* llama: automatically fit args to free memory
  llama-fit-params tool
* fix CI
* hints for bug reports, ensure no reallocation
* fix segfault with Vulkan
* add llama-fit-params to CI
* fix CI
* fix CI
* fix CI
* minor adjustments
* fix assignment of 1 dense layer
* fix logger not being reset on model load failure
* remove --n-gpu-layer hint on model load failure
* fix llama-fit-params verbosity
* fix edge case
* fix typo [no ci]

Committed 2025-12-15 09:24:59 +01:00
Name                      Last commit message                                                                                              Last commit date
ISSUE_TEMPLATE            llama: automatically set parameters not set by the user in such a way that maximizes GPU utilization (#16653)    2025-12-15 09:24:59 +01:00
actions                   ci : add windows-cuda 13.1 release (#17839)                                                                      2025-12-07 14:02:04 +01:00
workflows                 vulkan: faster q6_k matmul (#17813)                                                                              2025-12-14 08:29:37 +01:00
copilot-instructions.md   readme : add RVV,ZVFH,ZFH,ZICBOP support for RISC-V (#17259)                                                     2025-11-14 09:12:56 +02:00
labeler.yml               ci : apply model label to models (#16994)                                                                        2025-11-04 12:29:39 +01:00
pull_request_template.md  repo : update links to new url (#11886)                                                                          2025-02-15 16:40:57 +02:00