| Name | Last commit message | Last commit date |
| --- | --- | --- |
| .devops | docker : add CUDA 13.1 image build (#18441) | 2025-12-30 22:28:53 +01:00 |
| .gemini | contributing: tighten AI usage policy (#18388) | 2025-12-29 16:01:32 +01:00 |
| .github | docker : add CUDA 13.1 image build (#18441) | 2025-12-30 22:28:53 +01:00 |
| benches/dgx-spark | benches : add eval results (#17139) | 2025-11-10 10:44:10 +02:00 |
| ci | llama: automatically set parameters not set by the user in such a way that maximizes GPU utilization (#16653) | 2025-12-15 09:24:59 +01:00 |
| cmake | cmake : simplify build info detection using standard variables (#17423) | 2025-12-04 12:42:13 +02:00 |
| common | common : default content to an empty string (#18485) | 2025-12-30 12:00:57 -06:00 |
| docs | metal : add count_equal op (#18314) | 2025-12-31 10:39:48 +02:00 |
| examples | model-conversion : use CONVERTED_MODEL for compare-embeddings (#18461) | 2025-12-30 10:13:12 +01:00 |
| ggml | metal : add count_equal op (#18314) | 2025-12-31 10:39:48 +02:00 |
| gguf-py | model : add Qwen3-Omni multimodal architecture support | 2025-12-31 20:25:55 +10:00 |
| grammars | docs : document that JSON Schema is not available to model when using response_format (#18492) | 2025-12-30 15:13:49 -06:00 |
| include | lora: count lora nodes in graph_max_nodes (#18469) | 2025-12-30 15:53:12 +01:00 |
| licenses | cmake : enable curl by default (#12761) | 2025-04-07 13:35:19 +02:00 |
| media | media : add transparent icon svg and png [no ci] (#15891) | 2025-09-10 14:51:28 +03:00 |
| models | common : default content to an empty string (#18485) | 2025-12-30 12:00:57 -06:00 |
| pocs | ggml : move AMX to the CPU backend (#10570) | 2024-11-29 21:54:58 +01:00 |
| requirements | convert : update transformers requirements (#16866) | 2025-10-30 23:15:03 +01:00 |
| scripts | ggml-hexagon: create generalized functions for cpu side op (#17500) | 2025-12-22 23:13:24 -08:00 |
| src | model : add Qwen3-Omni multimodal architecture support | 2025-12-31 20:25:55 +10:00 |
| tests | common : default content to an empty string (#18485) | 2025-12-30 12:00:57 -06:00 |
| tools | model : add Qwen3-Omni multimodal architecture support | 2025-12-31 20:25:55 +10:00 |
| vendor | cmake: correct scope - link ws2_32 for MinGW/w64devkit builds in cpp-httplib (#17972) | 2025-12-13 12:46:36 +01:00 |
| .clang-format | fix: apply clang-format to CUDA macros (#16017) | 2025-09-16 08:59:19 +02:00 |
| .clang-tidy | clang-tidy : disable warning about performance enum size (#16127) | 2025-09-22 19:57:46 +02:00 |
| .dockerignore | ci : fix docker build number and tag name (#9638) | 2024-09-25 17:26:01 +02:00 |
| .ecrc | common : Update stb_image.h to latest version (#9161) | 2024-08-27 08:58:50 +03:00 |
| .editorconfig | editorconfig : ignore benches/ (#17140) | 2025-11-10 12:17:19 +02:00 |
| .flake8 | llama : move end-user examples to tools directory (#13249) | 2025-05-02 20:27:13 +02:00 |
| .gitignore | vulkan: faster q6_k matmul (#17813) | 2025-12-14 08:29:37 +01:00 |
| .gitmodules | ggml : remove kompute backend (#14501) | 2025-07-03 07:48:32 +03:00 |
| .pre-commit-config.yaml | convert.py : add python logging instead of print() (#6511) | 2024-05-03 22:36:41 +03:00 |
| AGENTS.md | contributing: tighten AI usage policy (#18388) | 2025-12-29 16:01:32 +01:00 |
| AUTHORS | authors : update (#12271) | 2025-03-08 18:26:00 +02:00 |
| CLAUDE.md | contributing: tighten AI usage policy (#18388) | 2025-12-29 16:01:32 +01:00 |
| CMakeLists.txt | build : move _WIN32_WINNT definition to headers (#17736) | 2025-12-04 07:04:02 +01:00 |
| CMakePresets.json | cmake : Add CMake presets for Linux and GCC (#14656) | 2025-07-13 08:12:36 +03:00 |
| CODEOWNERS | llama.android : Rewrite Android binding (w/o cpu_features dep) (#17413) | 2025-12-17 10:14:47 +02:00 |
| CONTRIBUTING.md | contributing: tighten AI usage policy (#18388) | 2025-12-29 16:01:32 +01:00 |
| LICENSE | license : update copyright notice + add AUTHORS (#6405) | 2024-04-09 09:23:19 +03:00 |
| Makefile | make : remove make in favor of CMake (#15449) | 2025-08-20 13:31:16 +03:00 |
| README.md | docs: simplify - server-first approach | 2025-12-31 20:41:56 +10:00 |
| SECURITY.md | security : add collaborator guidance (#18081) | 2025-12-16 11:17:11 +02:00 |
| build-xcframework.sh | cmake : move OpenSSL linking to vendor/cpp-httplib (#17177) | 2025-11-12 12:32:50 +01:00 |
| convert_hf_to_gguf.py | model : add Qwen3-Omni multimodal architecture support | 2025-12-31 20:25:55 +10:00 |
| convert_hf_to_gguf_update.py | model : Granite Embedding support (#15641) | 2025-12-23 00:28:19 +01:00 |
| convert_llama_ggml_to_gguf.py | py : fix wrong input type for raw_dtype in ggml to gguf scripts (#8928) | 2024-08-16 13:36:30 +03:00 |
| convert_lora_to_gguf.py | convert : allow quantizing lora again (#17453) | 2025-11-24 15:50:55 +01:00 |
| flake.lock | flake.lock: Update (#10470) | 2024-11-24 08:03:25 -08:00 |
| flake.nix | fix(nix): remove non-functional llama-cpp cachix cache from flake.nix (#15295) | 2025-08-13 11:21:31 -07:00 |
| mypy.ini | convert : partially revert PR #4818 (#5041) | 2024-01-20 18:14:18 -05:00 |
| poetry.lock | build(python): Package scripts with pip-0517 compliance | 2024-07-04 15:39:13 +00:00 |
| pyproject.toml | gguf-py : avoid requiring pyside6 for other scripts (#13036) | 2025-05-05 22:27:31 -04:00 |
| pyrightconfig.json | model-conversion : use CONVERTED_MODEL value for converted model [no ci] (#17984) | 2025-12-13 08:34:26 +01:00 |
| requirements.txt | `tool-call`: fix Qwen 2.5 Coder support, add micro benchmarks, support trigger patterns for lazy grammars (#12034) | 2025-03-05 13:05:13 +00:00 |