llama.cpp/tools
ddh0 13f1e4a9ca
llama : add adaptive-p sampler (#17927)
* initial commit for branch

* simplify constants

* add params to `struct common_params_sampling`, add reference to PR

* explicitly clamp `min_target` and `max_target` to `[0.0, 1.0]`

* add args, rename `queue_size` -> `window_size`

* improved comments

* minor

* remove old unused code from algorithm

* minor

* add power law case to `common_sampler_init`, add sampler name mappings

* clarify behaviour when `window_size = 0`

* add missing enums

* remove `target_range` param, make `target == 1` no-op, cleanup code

* oops, straggler

* add missing parameters in `server-task.cpp`

* copy from author

ref:
https://gist.github.com/MrJackSpade/9be99c7efbba7b95a41377e123b7b069

* remove old debug log, style nit

* fix compiler warning, add commented-out logging per token

* re-write + change parameters + simplify

* oops forgot args.cpp

* fix leftover `window_size`

* add missing values to `common_params_sampling::print()`

* with logging

* does this fix it?

* no, but does this?

* update default decay

* optimize

* fix bad merge

my git skills are lacking

* silence `missing initializer for member`

* update default decay to 0.9

* fix logging

* format (double)

* add power law to the new `samplers` vector

* log sampler init values

* improve logging messages in llama_sampler_power_law

* remove extraneous logging

* simplify target computation

last commit with debug logging!

* remove debug logging, explicitly clamp params at init

* add `use_power_law` flag + logic, minor cleanup

* update `power-law` -> `adaptive-p`

* fix cold start EMA

- `ctx->weighted_sum` is now initialized and reset to `target / (1.0f -
clamped_decay)`
- `ctx->total_weight` is now initialized and reset to `1.0f / (1.0f -
clamped_decay)`

this fixes a "cold start" problem with the moving average

* update `SHARPNESS` constant to `10.0f`

* minor style fixes

no functional changes

* minor style fixes cont.

* update `llama_sampler_adaptive_p_i` for backend sampling (ref: #17004)

* separate into `apply` + `accept` functions

* `pending_token_idx`: switch from `llama_token` to `int32`

functionally identical (`llama.h` has `typedef int32_t llama_token;`),
but its more correct now

* don't transform logits <= -1e9f

* fix masking in backend top-p, min-p

* address review comments

* typo in comments `RND` -> `RNG`

* add docs

* add recommended values in completion docs

* address PR feedback

* remove trailing whitespace (for CI `editorconfig`)

* add to adaptive-p to `common_sampler_types_from_chars`
2026-01-15 19:16:29 +02:00
..
batched-bench tool/ex/tests: consistently free ctx, then model (#18168) 2025-12-22 11:00:37 +01:00
cli llama : add adaptive-p sampler (#17927) 2026-01-15 19:16:29 +02:00
completion llama : add adaptive-p sampler (#17927) 2026-01-15 19:16:29 +02:00
cvector-generator common : refactor common_sampler + grammar logic changes (#17937) 2025-12-14 10:11:13 +02:00
export-lora cmake : Do not install tools on iOS targets (#15903) 2025-09-16 09:54:44 +07:00
fit-params llama-fit-params: free memory target per device (#18679) 2026-01-08 10:07:58 +01:00
gguf-split cli: new CLI experience (#17824) 2025-12-10 15:28:59 +01:00
imatrix common : refactor common_sampler + grammar logic changes (#17937) 2025-12-14 10:11:13 +02:00
llama-bench llama-bench: add direct_io parameter (#18778) 2026-01-13 08:49:10 +01:00
mtmd Restore clip's cb() to its rightful glory - extract common debugging elements in llama (#17914) 2026-01-14 20:29:35 +01:00
perplexity common : refactor common_sampler + grammar logic changes (#17937) 2025-12-14 10:11:13 +02:00
quantize quantize: prevent input/output file collision (#18451) 2025-12-31 23:29:03 +08:00
rpc Install rpc-server when GGML_RPC is ON. (#17149) 2025-11-11 10:53:59 +00:00
server llama : add adaptive-p sampler (#17927) 2026-01-15 19:16:29 +02:00
tokenize cmake : Do not install tools on iOS targets (#15903) 2025-09-16 09:54:44 +07:00
tts refactor : remove libcurl, use OpenSSL when available (#18828) 2026-01-14 18:02:47 +01:00
CMakeLists.txt cmake: only build cli when server is enabled (#18670) 2026-01-09 16:43:26 +01:00