llama.cpp

Commit Graph

Author	SHA1	Message	Date
ddh0	85b6e52e39	Merge branch 'ggml-org:master' into power-law-sampler	2025-12-15 21:23:25 -06:00
ddh0	1c2d2e900d	simplify target computation last commit with debug logging!	2025-12-15 21:02:11 -06:00
Sigbjørn Skjæret	d6a1e18c65	convert : move rope_parameters to TextModel class (#18061 ) * make sure to search text_config for rope parameters * move rope_parameters to TextModel class	2025-12-15 22:03:16 +01:00
Shouyu	c45f89d551	ggml-hexagon: mm for mtmd (#17894 ) * feat: add run_mtmd script for hexagon * fix: fix issue in fp16xfp32 mm * fix: remove opt_experiment for fp16xfp32 mm * fix: ggml-hexagon: matmul fp16xfp32 support non-contigious src0 * fix: fix syntax check for run-mtmd.sh for cli	2025-12-15 10:53:56 -08:00
HelloKS	9d52f17ae3	model : add KORMo model (#18032 ) * vocab: add KORMo Tokenizer * model: add KORMoForCausalLM * vocab: change pretokenizer to qwen2 * lint: fix unintended line removal * model: make qwen2 bias tensor optional * model: use qwen2 architecture for KORMo	2025-12-15 18:51:43 +01:00
ssweens	4529c660c8	kv-cache: Fix state restore fragmented cache (#17982 ) * kv-cache : fix state restore with fragmented cache (#17527) Change find_slot to allow non-contiguous allocation during state restore. Fixes 'failed to find available cells in kv cache' error when restoring state to fragmented cache. * tests : update logic * cleanup: tightened state_read_meta sig, added is_contiguous case * fix: state_read_meta arg reorder loose ends --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2025-12-15 19:28:35 +02:00
ddh0	0344068cf1	remove extraneous logging	2025-12-15 09:35:44 -06:00
Pascal	0f4f35e7be	Fix unreadable user markdown colors and truncate long texts in deletion dialogs (#17555 ) * webui: limit conversation name length in dialogs * webui: fix unreadable colors on links and table cell hover in user markdown * webui: keep table borders visible in user markdown * webui: updating unified exports * Update tools/server/webui/src/lib/components/app/chat/ChatAttachments/ChatAttachmentThumbnailFile.svelte Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * chore: update webui build output * chore: update webui build output * chore: update webui build output --------- Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>	2025-12-15 16:34:53 +01:00
ddh0	9c50b573f5	improve logging messages in llama_sampler_power_law	2025-12-15 09:25:05 -06:00
ddh0	6e66095e1f	Merge branch 'ggml-org:master' into power-law-sampler	2025-12-15 09:07:13 -06:00
Jeremy Demeule	165caaf5fb	metal: use shared buffers on eGPU (#17866 ) * metal: use shared buffers on eGPU With #15906, I noticed on important regression when using metal backend on eGPU. This commit restore the previous behavior and add an option to force its activation. * metal: use shared buffers on eGPU * metal: use shared buffers on eGPU	2025-12-15 16:14:49 +02:00
Xuan-Son Nguyen	96a181a933	mtmd: refactor audio preprocessing (#17978 ) * mtmd: refactor audio preprocessing * refactor Co-authored-by: Tarek <tdakhran@users.noreply.github.com> * wip * wip (2) * improve constructor * fix use_natural_log * fix padding for short input * clean up * remove need_chunking --------- Co-authored-by: Tarek <tdakhran@users.noreply.github.com>	2025-12-15 14:16:52 +01:00
Andrew Aladjev	4a4f7e6550	cli: fixed dead links to tools/main for cli and completion, fixed code owners (#17993 ) Co-authored-by: Andrew Aladjev <andrew.aladjev@gmail.com>	2025-12-15 11:47:04 +01:00
Thomas Jarosch	e73d548659	webui: add "delete all conversations" button to import/export tab (#17444 ) * webui: add "delete all conversations" button to import/export tab - Add 'Delete all conversations' functionality with confirmation dialog - Add Trash icon and destructive styling for clear visual indication - Redirects to "?new_chat=true#/" by using conversationsStore.deleteAll() * chore: update webui build output	2025-12-15 11:29:29 +01:00
Johannes Gäßler	b1f3a6e5db	llama: automatically set parameters not set by the user in such a way that maximizes GPU utilization (#16653 ) * llama: automatically fit args to free memory llama-fit-params tool * fix CI * hints for bug reports, ensure no reallocation * fix segfault with Vulkan * add llama-fit-params to CI * fix CI * fix CI * fix CI * minor adjustments * fix assignment of 1 dense layer * fix logger not being reset on model load failure * remove --n-gpu-layer hint on model load failure * fix llama-fit-params verbosity * fix edge case * fix typo [no ci]	2025-12-15 09:24:59 +01:00
ddh0	4e04bd1ce2	log sampler init values	2025-12-14 23:14:51 -06:00
ddh0	1c58e9a96a	add power law to the new `samplers` vector	2025-12-14 22:32:27 -06:00
ddh0	4e28eb2ffe	format (double)	2025-12-14 22:11:34 -06:00
ddh0	b5ed673ce9	fix logging	2025-12-14 22:08:36 -06:00
ddh0	68543257e9	update default decay to 0.9	2025-12-14 22:03:17 -06:00
ddh0	493bf301ff	silence `missing initializer for member`	2025-12-14 21:55:45 -06:00
ddh0	f5d08724e7	fix bad merge my git skills are lacking	2025-12-14 21:51:59 -06:00
Neo Zhang Jianyu	4aced7a631	[SYCL] Support gpt-oss by OPs add-id, mul_mat for mxfp4, swiglu_oai (#17826 ) * support gpt-oss GPU by OP add-id, mul_mat for mxfp4, swiglu_oai, fix warning * fix fault ut case, update ops.md * rebase, fix format issue	2025-12-15 10:35:15 +08:00
piDack	745fa0e78b	model : add glm-asr support (#17901 ) * [model] add glm-asr support * fix format for ci * fix convert format for ci * update glm_asr convert script & use build_ffn for glm_asr clip & use build_stack for padding and review * check root architecture for convert hf script * fix conficlt with upstream * fix convert script for glm asr & format clip-impl * format * restore hparams text * improved conversion --------- Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>	2025-12-15 03:18:46 +01:00
ddh0	6934780669	optimize	2025-12-14 16:26:15 -06:00
ddh0	36b526d768	Merge branch 'master' into power-law-sampler	2025-12-14 15:43:49 -06:00
Xuan-Son Nguyen	52392291b2	preset: handle negated arg, reverse the meaning if needed (#18041 )	2025-12-14 22:08:10 +01:00
Sigbjørn Skjæret	5c8a717128	convert : refactor rope scaling handling (#18013 ) * refactor rope scaling handling * ws-- * missed a couple * use find_hparam	2025-12-14 16:04:37 +01:00
Haowei Wu	37f5a1093b	mtmd: enhance image resizing in llava_uhd (#18014 )	2025-12-14 15:57:52 +01:00
Ruben Ortlam	9e6649ecf2	vulkan: fix mul_mat_vec_iq1_s formatting (#18026 )	2025-12-14 14:52:46 +01:00
Xuan-Son Nguyen	0759b09c90	graph: add f_attn_temp_offset (#18025 )	2025-12-14 13:05:59 +01:00
ddh0	667b70fdac	update default decay	2025-12-14 03:41:28 -06:00
ddh0	ec54fe5f14	no, but does this?	2025-12-14 02:54:14 -06:00
Georgi Gerganov	254098a279	common : refactor common_sampler + grammar logic changes (#17937 ) * common : refactor common_sampler + grammar logic changes * tests : increase max_tokens to get needed response * batched : fix uninitialized samplers	2025-12-14 10:11:13 +02:00
Jeff Bolz	3238b1400c	vulkan: Fix data race/hang in scalar/cm1 flash attention (#17887 )	2025-12-14 09:00:00 +01:00
ddh0	2a3f579d1f	does this fix it?	2025-12-14 01:55:02 -06:00
lovedheart	4722671641	vulkan: improve mul_mat_vec_iq1_s speed (#17874 )	2025-12-14 08:47:49 +01:00
Eve	d15d177f43	vulkan: faster q6_k matmul (#17813 ) * q6_k faster mul mat * 8 values * fix comment * switch to two at a time * start ci for .glsl files	2025-12-14 08:29:37 +01:00
Georgi Gerganov	77ad8542bd	model-conversion : cast logits to float32 (#18009 )	2025-12-14 08:58:13 +02:00
ddh0	9613c48172	with logging	2025-12-14 00:36:59 -06:00
Georgi Gerganov	609a2d0268	models : fix YaRN regression + consolidate logic (#18006 ) * models : fix YaRN regression + consolidate logic * cont : fix the fix * cont : remove header * cont : add header	2025-12-14 08:34:56 +02:00
Georgi Gerganov	a63cbafbbc	ggml : arm repack fix build	2025-12-14 08:33:51 +02:00
Georgi Gerganov	0e59224990	sync : ggml	2025-12-14 08:33:51 +02:00
Georgi Gerganov	71fdcf0616	ggml : arm repack fix build (whisper/0)	2025-12-14 08:33:51 +02:00
Congcong Cai	615655aafe	cmake : set `CMAKE_RUNTIME_OUTPUT_DIRECTORY` for non standalone build (ggml/1394) Some backend depends on CMAKE_RUNTIME_OUTPUT_DIRECTORY to create temporary file like metal backened. Missing CMAKE_RUNTIME_OUTPUT_DIRECTORY will cause some cmake error like permission denied (try to copy file to root). This PR wants to setup a default path for CMAKE_RUNTIME_OUTPUT_DIRECTORY when it does not exist.	2025-12-14 08:33:51 +02:00
ddh0	d1e5c60442	add missing values to `common_params_sampling::print()`	2025-12-13 23:26:03 -06:00
ddh0	965bcc9dc4	fix leftover `window_size`	2025-12-13 22:19:15 -06:00
ddh0	b8a9626a73	oops forgot args.cpp	2025-12-13 22:17:08 -06:00
ddh0	a96ddd743a	re-write + change parameters + simplify	2025-12-13 22:15:03 -06:00
ddh0	67a733670e	Merge branch 'ggml-org:master' into power-law-sampler	2025-12-13 17:27:35 -06:00

7459 Commits
