ddh0
0344068cf1
remove extraneous logging
2025-12-15 09:35:44 -06:00
ddh0
9c50b573f5
improve logging messages in llama_sampler_power_law
2025-12-15 09:25:05 -06:00
ddh0
6e66095e1f
Merge branch 'ggml-org:master' into power-law-sampler
2025-12-15 09:07:13 -06:00
Jeremy Demeule
165caaf5fb
metal: use shared buffers on eGPU ( #17866 )
...
* metal: use shared buffers on eGPU
With #15906, I noticed an important regression when using the Metal backend on an eGPU.
This commit restores the previous behavior and adds an option to force its activation.
* metal: use shared buffers on eGPU
* metal: use shared buffers on eGPU
2025-12-15 16:14:49 +02:00
Xuan-Son Nguyen
96a181a933
mtmd: refactor audio preprocessing ( #17978 )
...
* mtmd: refactor audio preprocessing
* refactor
Co-authored-by: Tarek <tdakhran@users.noreply.github.com>
* wip
* wip (2)
* improve constructor
* fix use_natural_log
* fix padding for short input
* clean up
* remove need_chunking
---------
Co-authored-by: Tarek <tdakhran@users.noreply.github.com>
2025-12-15 14:16:52 +01:00
Andrew Aladjev
4a4f7e6550
cli: fixed dead links to tools/main for cli and completion, fixed code owners ( #17993 )
...
Co-authored-by: Andrew Aladjev <andrew.aladjev@gmail.com>
2025-12-15 11:47:04 +01:00
Thomas Jarosch
e73d548659
webui: add "delete all conversations" button to import/export tab ( #17444 )
...
* webui: add "delete all conversations" button to import/export tab
- Add 'Delete all conversations' functionality with confirmation dialog
- Add Trash icon and destructive styling for clear visual indication
- Redirects to "?new_chat=true#/" by using conversationsStore.deleteAll()
* chore: update webui build output
2025-12-15 11:29:29 +01:00
Johannes Gäßler
b1f3a6e5db
llama: automatically set parameters not set by the user in such a way that maximizes GPU utilization ( #16653 )
...
* llama: automatically fit args to free memory
llama-fit-params tool
* fix CI
* hints for bug reports, ensure no reallocation
* fix segfault with Vulkan
* add llama-fit-params to CI
* fix CI
* fix CI
* fix CI
* minor adjustments
* fix assignment of 1 dense layer
* fix logger not being reset on model load failure
* remove --n-gpu-layer hint on model load failure
* fix llama-fit-params verbosity
* fix edge case
* fix typo [no ci]
2025-12-15 09:24:59 +01:00
ddh0
4e04bd1ce2
log sampler init values
2025-12-14 23:14:51 -06:00
ddh0
1c58e9a96a
add power law to the new `samplers` vector
2025-12-14 22:32:27 -06:00
ddh0
4e28eb2ffe
format (double)
2025-12-14 22:11:34 -06:00
ddh0
b5ed673ce9
fix logging
2025-12-14 22:08:36 -06:00
ddh0
68543257e9
update default decay to 0.9
2025-12-14 22:03:17 -06:00
ddh0
493bf301ff
silence `missing initializer for member`
2025-12-14 21:55:45 -06:00
ddh0
f5d08724e7
fix bad merge
...
my git skills are lacking
2025-12-14 21:51:59 -06:00
Neo Zhang Jianyu
4aced7a631
[SYCL] Support gpt-oss by OPs add-id, mul_mat for mxfp4, swiglu_oai ( #17826 )
...
* support gpt-oss GPU by OP add-id, mul_mat for mxfp4, swiglu_oai, fix warning
* fix faulty UT case, update ops.md
* rebase, fix format issue
2025-12-15 10:35:15 +08:00
piDack
745fa0e78b
model : add glm-asr support ( #17901 )
...
* [model] add glm-asr support
* fix format for ci
* fix convert format for ci
* update glm_asr convert script & use build_ffn for glm_asr clip & use build_stack for padding and review
* check root architecture for convert hf script
* fix conflict with upstream
* fix convert script for glm asr & format clip-impl
* format
* restore hparams text
* improved conversion
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-12-15 03:18:46 +01:00
ddh0
6934780669
optimize
2025-12-14 16:26:15 -06:00
ddh0
36b526d768
Merge branch 'master' into power-law-sampler
2025-12-14 15:43:49 -06:00
Xuan-Son Nguyen
52392291b2
preset: handle negated arg, reverse the meaning if needed ( #18041 )
2025-12-14 22:08:10 +01:00
Sigbjørn Skjæret
5c8a717128
convert : refactor rope scaling handling ( #18013 )
...
* refactor rope scaling handling
* ws--
* missed a couple
* use find_hparam
2025-12-14 16:04:37 +01:00
Haowei Wu
37f5a1093b
mtmd: enhance image resizing in llava_uhd ( #18014 )
2025-12-14 15:57:52 +01:00
Ruben Ortlam
9e6649ecf2
vulkan: fix mul_mat_vec_iq1_s formatting ( #18026 )
2025-12-14 14:52:46 +01:00
Xuan-Son Nguyen
0759b09c90
graph: add f_attn_temp_offset ( #18025 )
2025-12-14 13:05:59 +01:00
ddh0
667b70fdac
update default decay
2025-12-14 03:41:28 -06:00
ddh0
ec54fe5f14
no, but does this?
2025-12-14 02:54:14 -06:00
Georgi Gerganov
254098a279
common : refactor common_sampler + grammar logic changes ( #17937 )
...
* common : refactor common_sampler + grammar logic changes
* tests : increase max_tokens to get needed response
* batched : fix uninitialized samplers
2025-12-14 10:11:13 +02:00
Jeff Bolz
3238b1400c
vulkan: Fix data race/hang in scalar/cm1 flash attention ( #17887 )
2025-12-14 09:00:00 +01:00
ddh0
2a3f579d1f
does this fix it?
2025-12-14 01:55:02 -06:00
lovedheart
4722671641
vulkan: improve mul_mat_vec_iq1_s speed ( #17874 )
2025-12-14 08:47:49 +01:00
Eve
d15d177f43
vulkan: faster q6_k matmul ( #17813 )
...
* q6_k faster mul mat
* 8 values
* fix comment
* switch to two at a time
* start ci for .glsl files
2025-12-14 08:29:37 +01:00
Georgi Gerganov
77ad8542bd
model-conversion : cast logits to float32 ( #18009 )
2025-12-14 08:58:13 +02:00
ddh0
9613c48172
with logging
2025-12-14 00:36:59 -06:00
Georgi Gerganov
609a2d0268
models : fix YaRN regression + consolidate logic ( #18006 )
...
* models : fix YaRN regression + consolidate logic
* cont : fix the fix
* cont : remove header
* cont : add header
2025-12-14 08:34:56 +02:00
Georgi Gerganov
a63cbafbbc
ggml : arm repack fix build
2025-12-14 08:33:51 +02:00
Georgi Gerganov
0e59224990
sync : ggml
2025-12-14 08:33:51 +02:00
Georgi Gerganov
71fdcf0616
ggml : arm repack fix build (whisper/0)
2025-12-14 08:33:51 +02:00
Congcong Cai
615655aafe
cmake : set `CMAKE_RUNTIME_OUTPUT_DIRECTORY` for non standalone build (ggml/1394)
...
Some backends, such as the Metal backend, depend on CMAKE_RUNTIME_OUTPUT_DIRECTORY to create temporary files.
A missing CMAKE_RUNTIME_OUTPUT_DIRECTORY causes CMake errors such as "permission denied" (it tries to copy a file to the root directory).
This PR sets a default path for CMAKE_RUNTIME_OUTPUT_DIRECTORY when it is not already set.
2025-12-14 08:33:51 +02:00
ddh0
d1e5c60442
add missing values to `common_params_sampling::print()`
2025-12-13 23:26:03 -06:00
ddh0
965bcc9dc4
fix leftover `window_size`
2025-12-13 22:19:15 -06:00
ddh0
b8a9626a73
oops forgot args.cpp
2025-12-13 22:17:08 -06:00
ddh0
a96ddd743a
re-write + change parameters + simplify
2025-12-13 22:15:03 -06:00
ddh0
67a733670e
Merge branch 'ggml-org:master' into power-law-sampler
2025-12-13 17:27:35 -06:00
Xuan-Son Nguyen
c00ff929dc
scripts: add script to compare logprobs of llama.cpp against other frameworks ( #17947 )
...
* scripts: add script to compare logits of llama.cpp against other frameworks
* accept custom prompt file
* fix code style
* clarify endpoint
* fix displaying
* use abs for diff
* fix vllm case
* rm output file
* rename to compare-logprobs
* add "pattern"
2025-12-13 22:33:29 +01:00
Sergey Fedorov
4ed2bae50d
server-models.cpp: add missing <filesystem> ( #18000 )
...
Fixes: https://github.com/ggml-org/llama.cpp/issues/17999
2025-12-13 22:02:43 +01:00
Jeff Bolz
5266379bca
llama_context: synchronize before reallocating output buffer ( #17974 )
2025-12-13 09:19:51 -06:00
Xuan-Son Nguyen
4d5ae24c0a
arg: fix common_params_parse not accepting negated arg ( #17991 )
2025-12-13 12:53:37 +01:00
Gustavo Rocha Dias
66ba51252e
cmake: correct scope - link ws2_32 for MinGW/w64devkit builds in cpp-httplib ( #17972 )
...
* fix - w64devkit build
* fix - w64devkit build private scope
2025-12-13 12:46:36 +01:00
Jeff Bolz
36255a2268
vulkan: support get_rows for i32 ( #17941 )
2025-12-13 10:12:53 +01:00
Jeff Bolz
3229a23fa6
vulkan: support GGML_OP_DIAG ( #17893 )
2025-12-13 10:07:49 +01:00