llama.cpp/src
Daniel Bevenius ca0ef2dddb
llama : clarify comment about pp and tg graphs [no ci] (#14895)
* llama : clarify comment about pp and tg graphs [no ci]

This commit clarifies the comment in `llama-context.cpp` regarding the
prompt processing (pp) and token generation (tg) graphs.

The motivation for this is that I've struggled to remember these
abbreviations and had to look them up more than once, so I thought it
would be helpful to add a comment that makes it clear what they stand for.

* squash! llama : clarify comment about pp and tg graphs [no ci]

Change "pp" to "prompt processing".
2025-07-27 12:10:51 +02:00
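For context, "pp" and "tg" name the two graph shapes a context has to handle: evaluating a whole prompt in large batches versus generating one token at a time per sequence. The sketch below is only an illustration of that distinction; `reserve_graph`, `n_ubatch`, and `n_seqs` are assumed placeholder names, not the actual `llama-context.cpp` code.

```cpp
#include <cstdio>

// Illustration only: reserve_graph and the constants below are hypothetical,
// not the llama.cpp API. They just show what the pp/tg batch shapes mean.
//
// pp: prompt processing - the prompt is evaluated in large batches, so the
//                         worst case is a full micro-batch of tokens.
// tg: token generation  - tokens are produced one at a time per sequence, so
//                         the worst case is a single token per active sequence.
static void reserve_graph(const char * name, int n_tokens, int n_seqs) {
    std::printf("reserving %s graph: n_tokens=%d, n_seqs=%d\n", name, n_tokens, n_seqs);
}

int main() {
    const int n_ubatch = 512; // hypothetical micro-batch size
    const int n_seqs   = 4;   // hypothetical number of parallel sequences

    reserve_graph("pp", n_ubatch, n_seqs); // prompt processing: many tokens per step
    reserve_graph("tg", n_seqs,   n_seqs); // token generation: one token per sequence
    return 0;
}
```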
CMakeLists.txt memory : Hybrid recurrent cache (#13979) 2025-06-19 08:08:14 +03:00
llama-adapter.cpp llama : do not crash if there is no CPU backend (#13395) 2025-05-09 13:02:07 +02:00
llama-adapter.h llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181) 2025-03-13 12:35:44 +02:00
llama-arch.cpp chat : fix kimi-k2 chat template (#14852) 2025-07-24 13:59:56 +02:00
llama-arch.h model : add EXAONE 4.0 support (#14630) 2025-07-18 10:45:49 +02:00
llama-batch.cpp llama : reuse compute graphs (#14482) 2025-07-17 19:08:33 +03:00
llama-batch.h llama : reuse compute graphs (#14482) 2025-07-17 19:08:33 +03:00
llama-chat.cpp chat : fix kimi-k2 chat template (#14852) 2025-07-24 13:59:56 +02:00
llama-chat.h model : add EXAONE 4.0 support (#14630) 2025-07-18 10:45:49 +02:00
llama-context.cpp llama : clarify comment about pp and tg graphs [no ci] (#14895) 2025-07-27 12:10:51 +02:00
llama-context.h context : restore preemptive sched reset when LLAMA_SET_ROWS=0 (#14870) 2025-07-25 14:28:06 +03:00
llama-cparams.cpp cparams : rename LLAMA_MAX_PARALLEL_SEQUENCES to LLAMA_MAX_SEQ (#14188) 2025-06-15 10:08:58 +03:00
llama-cparams.h llama : add high-throughput mode (#14363) 2025-07-16 16:35:42 +03:00
llama-grammar.cpp `server`: streaming of tool calls and thoughts when `--jinja` is on (#12379) 2025-05-25 01:48:08 +01:00
llama-grammar.h `tool-call`: fix Qwen 2.5 Coder support, add micro benchmarks, support trigger patterns for lazy grammars (#12034) 2025-03-05 13:05:13 +00:00
llama-graph.cpp metal : fuse add, mul + add tests (#14596) 2025-07-18 20:37:26 +03:00
llama-graph.h graph : refactor context to not pass gf explicitly (#14629) 2025-07-18 08:29:28 +03:00
llama-hparams.cpp llama : add high-throughput mode (#14363) 2025-07-16 16:35:42 +03:00
llama-hparams.h model : make rope_yarn_log_mul optional for deepseek2 (#14896) 2025-07-27 11:18:37 +03:00
llama-impl.cpp GGUF: C++ refactor, backend support, misc fixes (#11030) 2025-01-07 18:01:58 +01:00
llama-impl.h cleanup: fix compile warnings associated with gnu_printf (#11811) 2025-02-12 10:06:53 -04:00
llama-io.cpp llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181) 2025-03-13 12:35:44 +02:00
llama-io.h llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181) 2025-03-13 12:35:44 +02:00
llama-kv-cache-unified-iswa.cpp llama : add high-throughput mode (#14363) 2025-07-16 16:35:42 +03:00
llama-kv-cache-unified-iswa.h llama : add high-throughput mode (#14363) 2025-07-16 16:35:42 +03:00
llama-kv-cache-unified.cpp kv-cache : fix k-shift for multiple streams (#14742) 2025-07-17 20:52:33 +03:00
llama-kv-cache-unified.h llama : reuse compute graphs (#14482) 2025-07-17 19:08:33 +03:00
llama-kv-cells.h kv-cache : use ggml_set_rows (#14285) 2025-07-03 10:53:35 +03:00
llama-memory-hybrid.cpp llama : fix parameter order for hybrid memory initialization (#14725) 2025-07-16 21:17:25 +02:00
llama-memory-hybrid.h kv-cache : use ggml_set_rows (#14285) 2025-07-03 10:53:35 +03:00
llama-memory-recurrent.cpp memory : handle saving/loading null layers in recurrent memory (#14675) 2025-07-23 11:16:41 +03:00
llama-memory-recurrent.h memory : rename interface to llama_memory_context_i (#14296) 2025-06-21 08:03:46 +03:00
llama-memory.cpp memory : correctly handle failure in apply() (#14438) 2025-06-30 18:03:03 +03:00
llama-memory.h memory : correctly handle failure in apply() (#14438) 2025-06-30 18:03:03 +03:00
llama-mmap.cpp llama : allow using mmap without PrefetchVirtualMemory, apply GGML_WIN_VER to llama.cpp sources (#14013) 2025-06-05 11:57:42 +02:00
llama-mmap.h llama-mmap: fix missing include (#11796) 2025-02-10 20:58:18 +02:00
llama-model-loader.cpp llama : support multiple classifier outputs and labels (#13940) 2025-06-06 09:03:25 +02:00
llama-model-loader.h llama : add option to override model tensor buffers (#11397) 2025-04-02 14:52:01 +02:00
llama-model-saver.cpp llama : improve sep token handling (#14272) 2025-06-20 14:04:09 +02:00
llama-model-saver.h llama/ggml: add LLM training support (#10544) 2025-05-12 14:44:49 +02:00
llama-model.cpp model : make rope_yarn_log_mul optional for deepseek2 (#14896) 2025-07-27 11:18:37 +03:00
llama-model.h model: add Ernie 4.5 MoE support (#14658) 2025-07-17 23:15:32 +02:00
llama-quant.cpp quantize : fix minor logic flaw in --tensor-type (#14572) 2025-07-13 18:02:17 +02:00
llama-quant.h llama : refactor `src/llama.cpp` (#10902) 2025-01-03 10:18:53 +02:00
llama-sampling.cpp sampling : make sure samplers return at least 1 token (#13822) 2025-05-27 12:07:52 +03:00
llama-sampling.h llama : add `llama_vocab`, functions -> methods, naming (#11110) 2025-01-12 11:32:42 +02:00
llama-vocab.cpp model : add EXAONE 4.0 support (#14630) 2025-07-18 10:45:49 +02:00
llama-vocab.h Support diffusion models: Add Dream 7B (#14644) 2025-07-16 20:03:51 +08:00
llama.cpp llama : add thread safety test (#14035) 2025-06-16 08:11:43 -07:00
unicode-data.cpp server : better security control for public deployments (#9776) 2024-10-08 13:27:04 +02:00
unicode-data.h llama : reduce compile time and binary size (#9712) 2024-10-02 15:49:55 +02:00
unicode.cpp model : add Kimi-K2 support (#14654) 2025-07-15 21:54:22 +02:00
unicode.h model : add Kimi-K2 support (#14654) 2025-07-15 21:54:22 +02:00