llama.cpp

Commit Graph

Author	SHA1	Message	Date
Saba Fallah	4d91711e5c	fixed merge build issue	2025-12-19 11:14:36 +01:00
Saba Fallah	9a05e1d116	Merge branch 'master' into sf/deepseek-ocr	2025-12-19 11:08:29 +01:00
Xuan-Son Nguyen	8ea958d4d9	model : add ASR support for LFM2-Audio-1.5B (conformer) (#18106 ) * ASR with LFM2-Audio-1.5B * Set rope_theta * Fix comment * Remove rope_theta setting * Address PR feedback * rename functions to conformer * remove some redundant ggml_cont * fix missing tensor * add prefix "a." for conv tensors * remove redundant reshape * clean up * add test model --------- Co-authored-by: Tarek Dakhran <tarek@liquid.ai>	2025-12-19 00:18:01 +01:00
Pascal	f9ec8858ed	webui: display prompt processing stats (#18146 ) * webui: display prompt processing stats * feat: Improve UI of Chat Message Statistics * chore: update webui build output * refactor: Post-review improvements * chore: update webui build output --------- Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>	2025-12-18 17:55:03 +01:00
Aleksander Grygier	9ce64aed7d	webui: Fix selecting generated output issues during active streaming (#18091 ) * draft: incremental markdown rendering with stable blocks * refactor: Logic improvements * refactor: DRY Markdown post-processing logic * refactor: ID generation improvements * fix: Remove runes * refactor: Clean up & add JSDocs * chore: update webui static output * fix: Add tick to prevent race conditions for rendering Markdown blocks Suggestion from @ServeurpersoCom Co-authored-by: Pascal <admin@serveurperso.com> * chore: Run `npm audit fix` * chore: update webui static output * feat: Improve performance using global counter & id instead of UUID * refactor: Enhance Markdown rendering with link and code features * chore: update webui static output * fix: Code block content extraction * chore: update webui static output * chore: update webui static output --------- Co-authored-by: Pascal <admin@serveurperso.com>	2025-12-18 11:13:52 +01:00
Kim S.	900316da4e	webui: fix chat screen shadow width (#18010 ) * webui: fix chat screen shadow width * chore: add index.html.gz	2025-12-18 11:08:42 +01:00
Pascal	6ce3d85796	server: (webui) add --webui-config (#18028 ) * server/webui: add server-side WebUI config support Add CLI arguments --webui-config (inline JSON) and --webui-config-file (file path) to configure WebUI default settings from server side. Backend changes: - Parse JSON once in server_context::load_model() for performance - Cache parsed config in webui_settings member (zero overhead on /props) - Add proper error handling in router mode with try/catch - Expose webui_settings in /props endpoint for both router and child modes Frontend changes: - Add 14 configurable WebUI settings via parameter sync - Add tests for webui settings extraction - Fix subpath support with base path in API calls Addresses feedback from @ngxson and @ggerganov * server: address review feedback from ngxson * server: regenerate README with llama-gen-docs	2025-12-17 21:45:45 +01:00
Xuan-Son Nguyen	e85e9d7637	server: (router) disable SSL on child process (#18141 )	2025-12-17 21:39:08 +01:00
Kim S.	d37fc93505	webui: fix chat header width when sidebar is closed (#17981 ) * webui: fix chat header width when sidebar is closed * chore: add index.html.gz	2025-12-17 20:05:45 +01:00
HonestQiao	15dd67d869	model: fix GLM-ASR-Nano-2512 load error (#18130 ) (#18142 )	2025-12-17 16:34:35 +01:00
Xuan-Son Nguyen	bde461de8c	server: (router) allow child process to report status via stdout (#18110 ) * server: (router) allow child process to report status via stdout * apply suggestions	2025-12-17 14:54:11 +01:00
bluebread	5a741fda55	mtmd: format code	2025-12-17 03:26:38 +00:00
Johannes Gäßler	4164596c76	llama-fit-params: QoL impr. for prints/errors (#18089 )	2025-12-17 00:03:19 +01:00
Saba Fallah	87e4a00c4c	minor - added GLM-4.6V to big tests - added missing deps for python test	2025-12-16 17:28:46 +01:00
Saba Fallah	00d235700d	Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr # Conflicts: # src/llama-arch.cpp	2025-12-16 16:45:43 +01:00
Saba Fallah	512b2c8fe4	merge with changes from https://github.com/ggml-org/llama.cpp/pull/18042	2025-12-16 14:07:04 +01:00
yifant-code	59977eba7b	server: fix crash when batch > ubatch with embeddings (#17912 ) * server: fix crash when batch > ubatch with embeddings (#12836) Fixes #12836 where the server crashes with GGML_ASSERT failure when running with embeddings enabled and n_batch > n_ubatch. Root cause: Embeddings use non-causal attention which requires all tokens to be processed within a single ubatch. When n_batch > n_ubatch, the server attempts to split processing, causing assertion failure. Solution: - Add parameter validation in main() after common_params_parse() - When embeddings enabled and n_batch > n_ubatch: * Log warnings explaining the issue * Automatically set n_batch = n_ubatch * Prevent server crash This follows the approach suggested by @ggerganov in issue #12836. Note: This supersedes stalled PR #12940 which attempted a runtime fix in the old examples/server/server.cpp location. This implementation validates at startup in tools/server/server.cpp (current location). Testing: - Build: Compiles successfully - Validation triggers: Warns when -b > -ub with --embedding - Auto-correction works: Adjusts n_batch = n_ubatch - No false positives: Valid params don't trigger warnings - Verified on macOS M3 Pro with embedding model * Update tools/server/server.cpp --------- Co-authored-by: ytian218 <ytian218@bloomberg.net> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2025-12-16 14:27:36 +02:00
Saba Fallah	51c3de6887	Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr # Conflicts: # gguf-py/gguf/constants.py # gguf-py/gguf/tensor_mapping.py # tools/mtmd/clip-impl.h # tools/mtmd/clip.cpp # tools/mtmd/models/models.h	2025-12-16 12:16:25 +01:00
Xuan-Son Nguyen	7b1db3d3b7	arg: clarify auto kvu/np being set on server (#17997 ) * arg: clarify auto kvu/np being set on server * improve docs * use invalid_argument	2025-12-16 12:01:27 +01:00
2114L3	5f5f9b4637	server: Update README.md incorrect argument (#18073 ) n-gpu-layer is incorrect argument is n-gpu-layers with the 's'	2025-12-16 11:50:43 +01:00
Xuan-Son Nguyen	3d86c6c2b5	model: support GLM4V vision encoder (#18042 ) * convert ok * no deepstack * less new tensors * cgraph ok * add mrope for text model * faster patch merger * add GGML_ROPE_TYPE_MRNORM * add support for metal * move glm4v do dedicated graph * convert: add norm_embd * clip: add debugging fn * working correctly * fix style * use bicubic * fix mrope metal * improve cpu * convert to neox ordering on conversion * revert backend changes * force stop if using old weight * support moe variant * fix conversion * fix convert (2) * Update tools/mtmd/clip-graph.h Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * process mrope_section on TextModel base class * resolve conflict merge --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2025-12-16 11:25:26 +01:00
Saba Fallah	4a4f82968c	Merge branch 'ggml-org:master' into sf/deepseek-ocr	2025-12-16 09:09:52 +01:00
Aleksander Grygier	3034836d36	webui: Improve copy to clipboard with text attachments (#17969 ) * feat: Create copy/paste user message including "pasted text" attachments * chore: update webui build output * chore: update webui static output * fix: UI issues * chore: update webui static output * fix: Decode HTML entities using `DOMParser` * chore: update webui build output * chore: update webui static output	2025-12-16 07:38:46 +01:00
Aleksander Grygier	a20979d433	webui: Add setting to always show sidebar on Desktop (#17809 ) * feat: Add setting to always show Sidebar on Desktop * chore: update webui build output * feat: Add auto-show sidebar setting * fix: Mobile settings dialog UI * chore: update webui build output * feat: UI label update * chore: update webui build output * chore: update webui build output * chore: update webui build output * refactor: Cleanup * chore: update webui build output	2025-12-16 07:31:37 +01:00
Darius Lukas	40d9c394f4	Webui: Disable attachment button and model selector button when prompt textbox is disabled. (#17925 ) * Pass disabled state to the file attachments button and the model selector button. * Update index.html.gz * Fix model info card in non-router mode. * Update index.html.gz	2025-12-16 07:15:49 +01:00
Pascal	0f4f35e7be	Fix unreadable user markdown colors and truncate long texts in deletion dialogs (#17555 ) * webui: limit conversation name length in dialogs * webui: fix unreadable colors on links and table cell hover in user markdown * webui: keep table borders visible in user markdown * webui: updating unified exports * Update tools/server/webui/src/lib/components/app/chat/ChatAttachments/ChatAttachmentThumbnailFile.svelte Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * chore: update webui build output * chore: update webui build output * chore: update webui build output --------- Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>	2025-12-15 16:34:53 +01:00
Xuan-Son Nguyen	96a181a933	mtmd: refactor audio preprocessing (#17978 ) * mtmd: refactor audio preprocessing * refactor Co-authored-by: Tarek <tdakhran@users.noreply.github.com> * wip * wip (2) * improve constructor * fix use_natural_log * fix padding for short input * clean up * remove need_chunking --------- Co-authored-by: Tarek <tdakhran@users.noreply.github.com>	2025-12-15 14:16:52 +01:00
Andrew Aladjev	4a4f7e6550	cli: fixed dead links to tools/main for cli and completion, fixed code owners (#17993 ) Co-authored-by: Andrew Aladjev <andrew.aladjev@gmail.com>	2025-12-15 11:47:04 +01:00
Thomas Jarosch	e73d548659	webui: add "delete all conversations" button to import/export tab (#17444 ) * webui: add "delete all conversations" button to import/export tab - Add 'Delete all conversations' functionality with confirmation dialog - Add Trash icon and destructive styling for clear visual indication - Redirects to "?new_chat=true#/" by using conversationsStore.deleteAll() * chore: update webui build output	2025-12-15 11:29:29 +01:00
Saba Fallah	8ad98ee6f5	editorconfig-check fix	2025-12-15 10:40:09 +01:00
Saba Fallah	b3bf8cba05	Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr # Conflicts: # convert_hf_to_gguf.py	2025-12-15 10:19:50 +01:00
Johannes Gäßler	b1f3a6e5db	llama: automatically set parameters not set by the user in such a way that maximizes GPU utilization (#16653 ) * llama: automatically fit args to free memory llama-fit-params tool * fix CI * hints for bug reports, ensure no reallocation * fix segfault with Vulkan * add llama-fit-params to CI * fix CI * fix CI * fix CI * minor adjustments * fix assignment of 1 dense layer * fix logger not being reset on model load failure * remove --n-gpu-layer hint on model load failure * fix llama-fit-params verbosity * fix edge case * fix typo [no ci]	2025-12-15 09:24:59 +01:00
piDack	745fa0e78b	model : add glm-asr support (#17901 ) * [model] add glm-asr support * fix format for ci * fix convert format for ci * update glm_asr convert script & use build_ffn for glm_asr clip & use build_stack for padding and review * check root architecture for convert hf script * fix conficlt with upstream * fix convert script for glm asr & format clip-impl * format * restore hparams text * improved conversion --------- Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>	2025-12-15 03:18:46 +01:00
Saba Fallah	7f8621c5fb	minor formatting	2025-12-14 16:44:23 +01:00
Saba Fallah	3fc61d4814	Merge pull request #10 from sfallah/sf/deepseek-ocr-test-script python test script for deepseek-ocr testing OCR on text-1.jpeg newspaper image checking against expected reference model output for Free-OCR and Markdown	2025-12-14 16:42:27 +01:00
Saba Fallah	dc2066e535	check with fixed expected resutls	2025-12-14 16:32:36 +01:00
Haowei Wu	37f5a1093b	mtmd: enhance image resizing in llava_uhd (#18014 )	2025-12-14 15:57:52 +01:00
Saba Fallah	6c36c03815	minor formatting fixes	2025-12-14 15:14:32 +01:00
Georgi Gerganov	254098a279	common : refactor common_sampler + grammar logic changes (#17937 ) * common : refactor common_sampler + grammar logic changes * tests : increase max_tokens to get needed response * batched : fix uninitialized samplers	2025-12-14 10:11:13 +02:00
Sergey Fedorov	4ed2bae50d	server-models.cpp: add missing <filesystem> (#18000 ) Fixes: https://github.com/ggml-org/llama.cpp/issues/17999	2025-12-13 22:02:43 +01:00
Saba Fallah	fb3bb6aabb	added deepseek-ocr test to tests.sh	2025-12-13 17:37:58 +01:00
Saba Fallah	f7736f23ef	refactoring, one single builder function and static helpers	2025-12-13 17:13:32 +01:00
Saba Fallah	f95a6fe9f3	quick and (potential) dirty merge with https://github.com/ggml-org/llama.cpp/pull/17909	2025-12-13 13:52:46 +01:00
Xuan-Son Nguyen	4d5ae24c0a	arg: fix common_params_parse not accepting negated arg (#17991 )	2025-12-13 12:53:37 +01:00
Saba Fallah	e0e69fd3fb	Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr-merge_#17965 # Conflicts: # src/llama-kv-cache.cpp # tools/mtmd/clip.cpp	2025-12-13 10:59:46 +01:00
Xuan-Son Nguyen	380b4c984e	common: support negated args (#17919 ) * args: support negated args * update docs * fix typo * add more neg options * Apply suggestions from code review Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> * rm duplicated arg * fix LLAMA_ARG_NO_HOST * add test --------- Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>	2025-12-12 23:58:53 +01:00
Xuan-Son Nguyen	e39a2ce66d	clip: move model cgraphs into their own files (#17965 ) * clip: move model cgraphs into their own files * more explicit enums * fix linux build * fix naming * missing headers * nits: add comments for contributors	2025-12-12 21:14:48 +01:00
Xuan-Son Nguyen	17158965ac	mtmd: explicitly forbidden inclusion of private header and libcommon (#17946 )	2025-12-12 15:16:06 +01:00
Aleksander Grygier	12280ae905	webui: Fix parsing non-LaTeX occurrencies of `$` or `$` (#17810 ) * fix: Improve latex protection logic to prevent turning non-latex `\(` into `$` * chore: update webui build output	2025-12-12 15:13:36 +01:00
Xuan-Son Nguyen	54a0fee4b7	arg: add -mm and -mmu as short form of --mmproj and --mmproj-url (#17958 ) * arg: add -mm and -mmu as short form of --mmproj and --mmproj-url * correct order * update docs	2025-12-12 14:06:06 +01:00

511 Commits
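
Note on commit 59977eba7b (#17912): its message says that when embeddings are enabled and n_batch > n_ubatch, the server now logs a warning at startup and sets n_batch = n_ubatch instead of crashing. The following is a minimal, language-neutral sketch of that check in Python, not the actual code in tools/server/server.cpp. The names `ServerParams` and `validate_embedding_batch` are hypothetical and exist only for illustration; only the field names n_batch, n_ubatch and the embedding flag come from the commit message.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("server-sketch")


@dataclass
class ServerParams:
    """Hypothetical stand-in for the parsed server parameters."""
    n_batch: int      # logical batch size (-b)
    n_ubatch: int     # physical micro-batch size (-ub)
    embedding: bool   # True when the server runs with --embedding


def validate_embedding_batch(params: ServerParams) -> bool:
    """Clamp n_batch to n_ubatch when embeddings are enabled.

    Per the commit message, embeddings use non-causal attention, which
    needs all tokens processed within a single ubatch. Returns True if
    the parameters were adjusted, False if they were already valid.
    """
    if params.embedding and params.n_batch > params.n_ubatch:
        logger.warning(
            "embeddings enabled with n_batch (%d) > n_ubatch (%d); "
            "setting n_batch = n_ubatch",
            params.n_batch,
            params.n_ubatch,
        )
        params.n_batch = params.n_ubatch
        return True
    return False


if __name__ == "__main__":
    # Example values are arbitrary and chosen only to trigger the check.
    p = ServerParams(n_batch=2048, n_ubatch=512, embedding=True)
    adjusted = validate_embedding_batch(p)
    print(adjusted, p.n_batch)
```

Running the check once after argument parsing, as the commit describes, keeps the fix out of the request path: invalid combinations are corrected before any batch is processed, and valid parameters pass through unchanged.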