llama.cpp/src
Daniel Bevenius 86587da03b
llama : check returned fn ptrs from ggml_backend_reg_get_proc_address (#15893)
This commit adds checks for the two function pointers returned from
ggml_backend_reg_get_proc_address.

The motivation is that either function pointer could be nullptr if the
get-proc-address implementation changes in the future. Checking the
returned pointers is also consistent with all the other calls to
ggml_backend_reg_get_proc_address in the codebase.
2025-09-10 05:33:58 +02:00
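As an illustration of the pattern this commit enforces, here is a minimal
sketch of a guarded proc-address lookup. It uses "ggml_backend_set_n_threads",
a proc address the CPU backend is known to register; the two pointers actually
checked in #15893 may be different ones, and set_backend_threads is a
hypothetical helper name.

#include "ggml-backend.h"

// Sketch of the guarded lookup pattern described above. Assumption: the
// proc addresses checked in #15893 may differ from the one used here.
static void set_backend_threads(ggml_backend_t backend, int n_threads) {
    ggml_backend_dev_t dev = ggml_backend_get_device(backend);
    ggml_backend_reg_t reg = dev ? ggml_backend_dev_backend_reg(dev) : nullptr;
    if (reg == nullptr) {
        return;
    }

    auto set_n_threads_fn = (ggml_backend_set_n_threads_t)
        ggml_backend_reg_get_proc_address(reg, "ggml_backend_set_n_threads");

    // The returned pointer can be nullptr, e.g. when the backend does not
    // expose this function, so check it before calling.
    if (set_n_threads_fn != nullptr) {
        set_n_threads_fn(backend, n_threads);
    }
}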
CMakeLists.txt kv-cache : drop the "unified" prefix (#15467) 2025-08-21 17:00:33 +03:00
llama-adapter.cpp aLoRA Support (#15327) 2025-09-05 17:32:39 -06:00
llama-adapter.h aLoRA Support (#15327) 2025-09-05 17:32:39 -06:00
llama-arch.cpp aLoRA Support (#15327) 2025-09-05 17:32:39 -06:00
llama-arch.h aLoRA Support (#15327) 2025-09-05 17:32:39 -06:00
llama-batch.cpp perplexity : provide a helpful hint for has_cpl case in split_equal error. (#15304) 2025-08-14 14:03:30 +03:00
llama-batch.h llama : reuse compute graphs (#14482) 2025-07-17 19:08:33 +03:00
llama-chat.cpp model : add support for Seed-OSS (#15490) 2025-08-23 15:21:52 +02:00
llama-chat.h model : add support for Seed-OSS (#15490) 2025-08-23 15:21:52 +02:00
llama-context.cpp llama : check returned fn ptrs from ggml_backend_reg_get_proc_address (#15893) 2025-09-10 05:33:58 +02:00
llama-context.h llama : separate compute buffer reserve from fattn check (#15696) 2025-08-31 15:49:03 +02:00
llama-cparams.cpp cparams : rename LLAMA_MAX_PARALLEL_SEQUENCES to LLAMA_MAX_SEQ (#14188) 2025-06-15 10:08:58 +03:00
llama-cparams.h llama : remove KV cache defragmentation logic (#15473) 2025-08-22 12:22:13 +03:00
llama-grammar.cpp `server`: streaming of tool calls and thoughts when `--jinja` is on (#12379) 2025-05-25 01:48:08 +01:00
llama-grammar.h `tool-call`: fix Qwen 2.5 Coder support, add micro benchmarks, support trigger patterns for lazy grammars (#12034) 2025-03-05 13:05:13 +00:00
llama-graph.cpp context : fix n_outputs during reserve (#15858) 2025-09-08 10:26:36 +03:00
llama-graph.h llama : add support for EmbeddingGemma 300m (#15798) 2025-09-04 18:10:29 +02:00
llama-hparams.cpp kv-cache : fix SWA checks + disable cacheless iSWA (#15811) 2025-09-05 10:39:22 +03:00
llama-hparams.h kv-cache : fix SWA checks + disable cacheless iSWA (#15811) 2025-09-05 10:39:22 +03:00
llama-impl.cpp GGUF: C++ refactor, backend support, misc fixes (#11030) 2025-01-07 18:01:58 +01:00
llama-impl.h llama: use FA + max. GPU layers by default (#15434) 2025-08-30 16:32:10 +02:00
llama-io.cpp llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181) 2025-03-13 12:35:44 +02:00
llama-io.h llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181) 2025-03-13 12:35:44 +02:00
llama-kv-cache-iswa.cpp kv-cache : fix SWA checks + disable cacheless iSWA (#15811) 2025-09-05 10:39:22 +03:00
llama-kv-cache-iswa.h kv-cache : support layer reuse (#15504) 2025-08-24 13:07:07 +03:00
llama-kv-cache.cpp model : avoid ggml_cont_3d for fused QKV weights (#15662) 2025-09-08 10:25:33 +03:00
llama-kv-cache.h model : avoid ggml_cont_3d for fused QKV weights (#15662) 2025-09-08 10:25:33 +03:00
llama-kv-cells.h llama : remove KV cache defragmentation logic (#15473) 2025-08-22 12:22:13 +03:00
llama-memory-hybrid.cpp kv-cache : fix SWA checks + disable cacheless iSWA (#15811) 2025-09-05 10:39:22 +03:00
llama-memory-hybrid.h kv-cache : fix SWA checks + disable cacheless iSWA (#15811) 2025-09-05 10:39:22 +03:00
llama-memory-recurrent.cpp kv-cache : support layer reuse (#15504) 2025-08-24 13:07:07 +03:00
llama-memory-recurrent.h kv-cache : support layer reuse (#15504) 2025-08-24 13:07:07 +03:00
llama-memory.cpp memory : correctly handle failure in apply() (#14438) 2025-06-30 18:03:03 +03:00
llama-memory.h kv-cache : support layer reuse (#15504) 2025-08-24 13:07:07 +03:00
llama-mmap.cpp llama : allow using mmap without PrefetchVirtualMemory, apply GGML_WIN_VER to llama.cpp sources (#14013) 2025-06-05 11:57:42 +02:00
llama-mmap.h llama-mmap: fix missing include (#11796) 2025-02-10 20:58:18 +02:00
llama-model-loader.cpp nvidia nemotron nano v2 (nemotronh) (#15507) 2025-08-28 18:39:31 -06:00
llama-model-loader.h model: support GLM 4.5 family of models (#14939) 2025-08-04 20:29:25 +02:00
llama-model-saver.cpp llama : improve sep token handling (#14272) 2025-06-20 14:04:09 +02:00
llama-model-saver.h llama/ggml: add LLM training support (#10544) 2025-05-12 14:44:49 +02:00
llama-model.cpp model : avoid ggml_cont_3d for fused QKV weights (#15662) 2025-09-08 10:25:33 +03:00
llama-model.h llama : fix incorrect model type for Gemma 270M (#15764) 2025-09-03 13:35:49 +02:00
llama-quant.cpp convert : support non-mxfp4 HF model (#15153) 2025-08-07 23:26:03 +02:00
llama-quant.h llama : refactor `src/llama.cpp` (#10902) 2025-01-03 10:18:53 +02:00
llama-sampling.cpp sampling : optimize dist sampler (#15704) 2025-09-03 18:16:26 +03:00
llama-sampling.h llama : add `llama_vocab`, functions -> methods, naming (#11110) 2025-01-12 11:32:42 +02:00
llama-vocab.cpp model : jina-embeddings-v3 support (#13693) 2025-08-28 15:49:50 +02:00
llama-vocab.h model : add hunyuan dense (#14878) 2025-08-01 15:31:12 +02:00
llama.cpp llama : check returned fn ptrs from ggml_backend_reg_get_proc_address (#15893) 2025-09-10 05:33:58 +02:00
unicode-data.cpp server : better security control for public deployments (#9776) 2024-10-08 13:27:04 +02:00
unicode-data.h llama : reduce compile time and binary size (#9712) 2024-10-02 15:49:55 +02:00
unicode.cpp model : add Kimi-K2 support (#14654) 2025-07-15 21:54:22 +02:00
unicode.h model : add Kimi-K2 support (#14654) 2025-07-15 21:54:22 +02:00