llama.cpp

Commit Graph

Author	SHA1	Message	Date
Saba Fallah	6978c37fe6	Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr	2026-02-02 12:09:28 +01:00
Saba Fallah	a94c241751	merge resolved - fixed issues in convert - tested several deepseek models	2026-02-02 12:07:35 +01:00
tc-mb	ec6c7421e4	mtmd: support MiniCPM-o 4.5(vision only) (#19211) Signed-off-by: tc-mb <caitianchi@modelbest.cn>	2026-01-30 23:19:30 +01:00
Saba Fallah	ded92076a8	Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr # Conflicts: # convert_hf_to_gguf.py # gguf-py/gguf/gguf_writer.py # gguf-py/gguf/tensor_mapping.py # src/llama-model.cpp # src/models/deepseek2.cpp # tools/mtmd/CMakeLists.txt # tools/mtmd/clip-impl.h # tools/mtmd/clip.cpp # tools/mtmd/clip.h	2026-01-28 13:39:39 +01:00
Piotr Wilkin (ilintar)	d98b548120	Restore clip's cb() to its rightful glory - extract common debugging elements in llama (#17914) * Extract common debugging functions; plug eval-callback and mtmd's MTMD_DEBUG_GRAPH with same functionality * Move to common * Remove unneeded header * Unlink from common * chore: update webui build output * Cleanup; properly pass params to mtmd without depending on common; factorize debug.cpp to use common debug code. * Revert change to webapp * Post-merge adjust * Apply suggestions from code review Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com> * Apply code review changes * Remove changes to server-context * Remove mtmd.h include * Remove utility functions from header * Apply suggestions from code review Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com> * Rename functions * Update tools/mtmd/clip.cpp Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com> * Update tools/mtmd/clip.cpp Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com> * Update tools/mtmd/clip.cpp Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com> --------- Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>	2026-01-14 20:29:35 +01:00
Xuan-Son Nguyen	e047f9ee9d	mtmd: fix use_non_causal being reported incorrectly (#18793) * mtmd: fix use_non_causal being reported incorrectly * move clip_is_mrope to mtmd_decode_use_mrope * fix sloppy code ggml_cpy	2026-01-13 12:19:38 +01:00
Simranjeet Singh	a61c8bc3bf	mtmd: Add Gemma3n multimodal support with MobileNetV5 vision encoder (#18256) * Add Gemma3nVisionModel - MobileNetV5 vision encoder convertor to convert_hf_to_gguf.py. Add gemma3n to vision projectors in gguf-py/gguf/constants.py. * Add mobilenetv5 impl * Fix comments, remove unused vars * Fix permute and remove transpose of projection weights * Fix comments, remove debugging prints from hf_to_gguf * 1. Hard-code image_mean = 0 and image_std = 1 2. Use available tensor mapping logic 3. Remove redundant chat template replacement of soft tokens placeholder with media placeholder * 1. Move mobilenetv5 helpers declarations to `clip_graph_mobilenetv5` struct and definitions to mobilenetv5.cpp 2.Remove unused `clip_is_gemma3n` func declarations and definitions 3. Remove redundant `rescale_image_u8_to_f32` func and use `normalize_image_u8_to_f32` with zero mean and unit std 4. Calculate n_patches using image_size / patch_size * Remove obsolete comments * - convert_hf_to_gguf.py & constants.py & tensor_mapping.py: Use explicit mapping: Custom map for double indexed blocks and tensor_mapping.py for rest - convert_hf_to_gguf.py: Unsqueeze Stem Bias and Layer scale tensors to correct shape while converting to gguf - mobilenetv5.cpp: Remove explicit reshaping of Stem Bias and Layer scale which are now handled while converting to gguf, replace fprintf with LOG_* - clip.cpp: Remove unused embedding and hard_emb_norm tensor loading * - Rename tensors to v.conv..., v.blk..., v.msfa... to better align with already existing terminology * Fix stem conv bias name * Remove explicit handling of bias term for stem conv * - Change order of addition in "project_per_layer_inputs" to support broadcasting of vision inp_per_layer - Simplify the vision embeddings path of "get_per_layer_inputs" to output [n_embd_altup, n_layer, 1], broadcastable * clean up conversion script * fix code style * also preserve audio tensors * trailing space * split arch A and V * rm unused gemma3 func * fix alignment --------- Co-authored-by: Xuan Son Nguyen <son@huggingface.co>	2026-01-09 23:42:38 +01:00
Tarek Dakhran	4974bf53cf	model : mtmd : make input norm optional in LFM2-VL (#18594) Upcoming LFM2-VL releases will have configurable input norm. See https://github.com/huggingface/transformers/pull/43087 for details.	2026-01-04 18:50:02 +01:00
tt	ced765be44	model: support youtu-vl model (#18479) * Support Youtu-VL Model * merge code * fix bug * revert qwen2 code & support rsplit in minja.hpp * update warm info * fix annotation * u * revert minja.hpp * fix * Do not write routed_scaling_factor to gguf when routed_scaling_factor is None * fix expert_weights_scale * LGTM after whitespace fixes * fix * fix * fix * layers to layer_index * enum fix --------- Co-authored-by: Xuan-Son Nguyen <son@huggingface.co> Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>	2026-01-01 19:25:54 +01:00
Henry147147	9b8329de7a	mtmd : Adding support for Nvidia Music Flamingo Model (#18470) * Inital commit, debugging q5_k_s quant * Made hf_to_gguf extend whisper to reduce code duplication * addressed convert_hf_to_gguf pull request issue --------- Co-authored-by: Henry D <henrydorsey147@gmail.com>	2025-12-31 12:13:23 +01:00
Saba Fallah	4d91711e5c	fixed merge build issue	2025-12-19 11:14:36 +01:00
Saba Fallah	9a05e1d116	Merge branch 'master' into sf/deepseek-ocr	2025-12-19 11:08:29 +01:00
Xuan-Son Nguyen	8ea958d4d9	model : add ASR support for LFM2-Audio-1.5B (conformer) (#18106) * ASR with LFM2-Audio-1.5B * Set rope_theta * Fix comment * Remove rope_theta setting * Address PR feedback * rename functions to conformer * remove some redundant ggml_cont * fix missing tensor * add prefix "a." for conv tensors * remove redundant reshape * clean up * add test model --------- Co-authored-by: Tarek Dakhran <tarek@liquid.ai>	2025-12-19 00:18:01 +01:00
bluebread	5a741fda55	mtmd: format code	2025-12-17 03:26:38 +00:00
Saba Fallah	512b2c8fe4	merge with changes from https://github.com/ggml-org/llama.cpp/pull/18042	2025-12-16 14:07:04 +01:00
Saba Fallah	51c3de6887	Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr # Conflicts: # gguf-py/gguf/constants.py # gguf-py/gguf/tensor_mapping.py # tools/mtmd/clip-impl.h # tools/mtmd/clip.cpp # tools/mtmd/models/models.h	2025-12-16 12:16:25 +01:00
Xuan-Son Nguyen	3d86c6c2b5	model: support GLM4V vision encoder (#18042) * convert ok * no deepstack * less new tensors * cgraph ok * add mrope for text model * faster patch merger * add GGML_ROPE_TYPE_MRNORM * add support for metal * move glm4v do dedicated graph * convert: add norm_embd * clip: add debugging fn * working correctly * fix style * use bicubic * fix mrope metal * improve cpu * convert to neox ordering on conversion * revert backend changes * force stop if using old weight * support moe variant * fix conversion * fix convert (2) * Update tools/mtmd/clip-graph.h Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * process mrope_section on TextModel base class * resolve conflict merge --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2025-12-16 11:25:26 +01:00
Saba Fallah	4a4f82968c	Merge branch 'ggml-org:master' into sf/deepseek-ocr	2025-12-16 09:09:52 +01:00
Xuan-Son Nguyen	96a181a933	mtmd: refactor audio preprocessing (#17978) * mtmd: refactor audio preprocessing * refactor Co-authored-by: Tarek <tdakhran@users.noreply.github.com> * wip * wip (2) * improve constructor * fix use_natural_log * fix padding for short input * clean up * remove need_chunking --------- Co-authored-by: Tarek <tdakhran@users.noreply.github.com>	2025-12-15 14:16:52 +01:00
Saba Fallah	b3bf8cba05	Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr # Conflicts: # convert_hf_to_gguf.py	2025-12-15 10:19:50 +01:00
piDack	745fa0e78b	model : add glm-asr support (#17901) * [model] add glm-asr support * fix format for ci * fix convert format for ci * update glm_asr convert script & use build_ffn for glm_asr clip & use build_stack for padding and review * check root architecture for convert hf script * fix conficlt with upstream * fix convert script for glm asr & format clip-impl * format * restore hparams text * improved conversion --------- Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>	2025-12-15 03:18:46 +01:00
Haowei Wu	37f5a1093b	mtmd: enhance image resizing in llava_uhd (#18014)	2025-12-14 15:57:52 +01:00
Saba Fallah	6c36c03815	minor formatting fixes	2025-12-14 15:14:32 +01:00
Saba Fallah	f95a6fe9f3	quick and (potential) dirty merge with https://github.com/ggml-org/llama.cpp/pull/17909	2025-12-13 13:52:46 +01:00
Saba Fallah	e0e69fd3fb	Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr-merge_#17965 # Conflicts: # src/llama-kv-cache.cpp # tools/mtmd/clip.cpp	2025-12-13 10:59:46 +01:00
Xuan-Son Nguyen	e39a2ce66d	clip: move model cgraphs into their own files (#17965) * clip: move model cgraphs into their own files * more explicit enums * fix linux build * fix naming * missing headers * nits: add comments for contributors	2025-12-12 21:14:48 +01:00
Saba Fallah	d70f171fac	merge with changes from https://github.com/ggml-org/llama.cpp/pull/17909 added new opt to tests.sh to disable flash-attn	2025-12-11 10:11:27 +01:00
Saba Fallah	33fabf0bd8	Merge branch 'master' into sf/deepseek-ocr-merge-test # Conflicts: # tools/mtmd/clip.cpp # tools/mtmd/mtmd-cli.cpp	2025-12-11 08:13:50 +01:00
Xuan-Son Nguyen	c6b2c9310c	mtmd: some small clean up (#17909) * clip: add support for fused qkv in build_vit * use bulid_ffn whenever possible * fix internvl * mtmd-cli: move image to beginning * test script: support custom args	2025-12-10 22:20:06 +01:00
Saba Fallah	ed944cd25b	fix: test-1.jpg ORC issue with small (640) resolution setting min-resolution base (1024) max large (1280) for dynamic-resolution	2025-12-10 20:20:55 +01:00
Georgi Gerganov	4dff236a52	ggml : remove GGML_KQ_MASK_PAD constant (#17910) * ggml : remove GGML_KQ_MASK_PAD constant * cont : remove comment	2025-12-10 20:53:16 +02:00
bluebread	5174a1e69a	mtmd: minor fix	2025-12-08 04:54:19 +00:00
bluebread	48c6cf2132	mtmd: convert model in FP16	2025-12-08 02:36:00 +00:00
bluebread	53273f83f8	mtmd: fixed wrong input setting	2025-12-07 23:58:22 +00:00
bluebread	5dfcc5abb1	mtmd: add detailed comments for resize_bicubic_pillow	2025-12-07 10:15:09 +00:00
bluebread	2d918b3e21	mtmd: make sam hparams configurable	2025-12-06 06:55:53 +00:00
bluebread	15f2ada0ed	mtmd: simplify get_rel_pos	2025-12-06 06:32:41 +00:00
Saba Fallah	d981f19e9d	minor editorconfig-check fixes	2025-12-05 13:18:15 +01:00
Saba Fallah	5f2ee1aecf	Merge branch 'ggml-org:master' into sf/deepseek-ocr	2025-12-05 11:56:06 +01:00
Saba Fallah	f5bd310a5e	minor formatting and style	2025-12-05 09:30:58 +01:00
Saba Fallah	076138a428	corrected code-branch when flash-attn disabled enabling usage of --flash-attn option	2025-12-04 23:45:59 +01:00
Saba Fallah	5381b9cf63	using common build_attn in sam	2025-12-04 23:13:29 +01:00
bluebread	fc3f625fef	mtmd: support combined QKV projection in buid_vit	2025-12-04 17:57:43 +00:00
Saba Fallah	a661c52990	reverting automatically removed spaces	2025-12-04 16:12:41 +01:00
Saba Fallah	c73748ab5d	Merge branch 'sf/deepseek-ocr' into sf/deepseek-ocr-cleanup # Conflicts: # gguf-py/gguf/tensor_mapping.py	2025-12-04 15:09:32 +01:00
Saba Fallah	386ba479a2	clean up	2025-12-04 15:05:58 +01:00
bluebread	7451b84105	mtmd: fix tensor names for image newlines and view separator	2025-12-04 13:26:53 +00:00
bluebread	b26b507c4e	mtmd: refactor code & remove unused helper functions	2025-12-03 16:23:46 +00:00
bluebread	b696c54756	mtmd: remove --dsocr-mode argument	2025-12-03 14:54:16 +00:00
bluebread	43dfc0c8d6	Merge branch 'sf/deepseek-ocr' of github.com:sfallah/llama.cpp into sf/deepseek-ocr	2025-12-03 07:52:26 +00:00

139 Commits