Adds support for Qwen3-Omni, Alibaba's multimodal LLM that handles text and vision. This PR enables the main LLM architecture and adds vision encoder support.

Main LLM changes:

- Add the `LLM_ARCH_QWEN3OMNI` enum value and architecture registration (first sketch below)
- Add hparams loading for the MoE-based architecture (48 layers, 128 experts)
- Reuse the `llm_build_qwen3moe` graph builder (second sketch)
- Add the IMROPE type for multimodal position encoding

Vision encoder changes (via mtmd):

- Add `PROJECTOR_TYPE_QWEN3O` with auto-conversion to `QWEN3VL` for vision (third sketch)
- Support different embedding dimensions (vision = 8192, audio = 2048)
- Add separate Q/K/V tensor support in the qwen3vl graph builder (fourth sketch)

Tested with Qwen3-Omni-30B-Q8_0.gguf on a distributed 5-GPU setup:

- 41-44 tokens/sec inference speed
- Text and vision inference both working

Note: audio encoder support is WIP and will follow in a separate PR.
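The enum addition, architecture registration, and hparams loading reduce to a small pattern. The sketch below is a self-contained stand-in, not the actual diff: the real change spans `llama-arch.*` and the model loader, the GGUF key strings are simplified, and the active-expert count (8) is an assumption about the 30B checkpoint rather than something stated in this PR.

```cpp
// Illustrative, self-contained sketch; the real change spans llama-arch.{h,cpp}
// and the model loader, and the key/helper names differ.
#include <cstdint>
#include <cstdio>
#include <map>
#include <string>

enum llm_arch { LLM_ARCH_QWEN3MOE, LLM_ARCH_QWEN3OMNI };

// Architecture registration: enum value -> architecture string stored in GGUF.
static const std::map<llm_arch, std::string> LLM_ARCH_NAMES = {
    { LLM_ARCH_QWEN3MOE,  "qwen3moe"  },
    { LLM_ARCH_QWEN3OMNI, "qwen3omni" },
};

struct llm_hparams {
    uint32_t n_layer       = 0;
    uint32_t n_expert      = 0;
    uint32_t n_expert_used = 0;
};

// Stand-in for GGUF metadata lookup (real code goes through the model loader).
using gguf_kv = std::map<std::string, uint32_t>;

static void load_hparams(const gguf_kv & kv, llm_arch arch, llm_hparams & hp) {
    switch (arch) {
        case LLM_ARCH_QWEN3MOE:
        case LLM_ARCH_QWEN3OMNI: // fallthrough: the Omni text tower is the same MoE
            hp.n_layer       = kv.at("block_count");
            hp.n_expert      = kv.at("expert_count");
            hp.n_expert_used = kv.at("expert_used_count");
            break;
    }
}

int main() {
    // Values matching the PR's 30B checkpoint: 48 layers, 128 experts.
    // The 8 active experts per token is an assumption, not from the PR.
    const gguf_kv kv = {
        { "block_count",       48  },
        { "expert_count",      128 },
        { "expert_used_count", 8   },
    };
    llm_hparams hp;
    load_hparams(kv, LLM_ARCH_QWEN3OMNI, hp);
    std::printf("%s: %u layers, %u experts (%u active)\n",
        LLM_ARCH_NAMES.at(LLM_ARCH_QWEN3OMNI).c_str(),
        hp.n_layer, hp.n_expert, hp.n_expert_used);
}
```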
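Builder reuse and position encoding follow the same dispatch idea. Another simplified stand-in: `llm_build_qwen3moe` is the builder name referenced above, but its signature here is a placeholder, and the rope enum is trimmed to two values for illustration.

```cpp
// Dispatch sketch only; signatures are placeholders, not llama.cpp's real ones.
#include <cstdio>

enum llm_arch { LLM_ARCH_QWEN3MOE, LLM_ARCH_QWEN3OMNI };

// IMROPE (interleaved M-RoPE) carries temporal/height/width position
// components for multimodal tokens; text-only archs keep their usual type.
enum llama_rope_type { LLAMA_ROPE_TYPE_NEOX, LLAMA_ROPE_TYPE_IMROPE };

static void llm_build_qwen3moe() { std::puts("building the qwen3moe graph"); }

static void build_graph(llm_arch arch) {
    switch (arch) {
        case LLM_ARCH_QWEN3MOE:
        case LLM_ARCH_QWEN3OMNI: // no new builder: Omni reuses the qwen3moe graph
            llm_build_qwen3moe();
            break;
    }
}

static llama_rope_type rope_type_for(llm_arch arch) {
    // Omni tokens need multimodal positions, hence IMROPE.
    return arch == LLM_ARCH_QWEN3OMNI ? LLAMA_ROPE_TYPE_IMROPE
                                      : LLAMA_ROPE_TYPE_NEOX;
}

int main() {
    build_graph(LLM_ARCH_QWEN3OMNI);
    std::printf("rope type: %d\n", (int) rope_type_for(LLM_ARCH_QWEN3OMNI));
}
```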
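On the mtmd side, the projector auto-conversion and the per-modality embedding widths amount to a small mapping. Names such as `resolve_projector` and `MODALITY_*` below are hypothetical helpers for illustration; only the projector type constants and the 8192/2048 dimensions come from this PR.

```cpp
// mtmd-side sketch with simplified names; only the projector constants and
// the 8192/2048 widths come from the PR description.
#include <cstdint>
#include <cstdio>

enum projector_type {
    PROJECTOR_TYPE_QWEN3VL, // existing vision projector
    PROJECTOR_TYPE_QWEN3O,  // new: what Qwen3-Omni GGUFs advertise
};

enum modality { MODALITY_VISION, MODALITY_AUDIO };

// Auto-conversion: an Omni file loaded for vision is handled by the existing
// QWEN3VL projector path instead of a duplicated one.
static projector_type resolve_projector(projector_type t, modality m) {
    if (t == PROJECTOR_TYPE_QWEN3O && m == MODALITY_VISION) {
        return PROJECTOR_TYPE_QWEN3VL;
    }
    return t;
}

// The encoders project into different widths, so n_embd must be chosen per
// modality rather than assumed to be a single value.
static uint32_t n_embd_for(modality m) {
    return m == MODALITY_VISION ? 8192u  // vision embedding dim
                                : 2048u; // audio embedding dim (encoder still WIP)
}

int main() {
    const projector_type t = resolve_projector(PROJECTOR_TYPE_QWEN3O, MODALITY_VISION);
    std::printf("projector %d, n_embd = %u\n", (int) t, n_embd_for(MODALITY_VISION));
}
```

Converting the type at load time lets Omni ride the existing QWEN3VL code path, with the separate Q/K/V handling below being the main adjustment to that path.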
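The separate Q/K/V support in the qwen3vl graph builder is essentially a presence check at attention-build time. This toy models that branch without pulling in ggml; `tensor` is a stand-in struct, not `ggml_tensor`, and the tensor names are illustrative.

```cpp
// Toy model of the qwen3vl builder change: branch on whether the checkpoint
// ships one fused QKV weight or separate wq/wk/wv tensors.
#include <cstdio>

struct tensor {
    const char * name;
    bool         present; // whether the tensor exists in the GGUF
};

struct vision_layer {
    tensor wqkv;       // fused weight (what the builder previously assumed)
    tensor wq, wk, wv; // separate weights (as shipped by Qwen3-Omni)
};

static void build_attn(const vision_layer & l) {
    if (l.wqkv.present) {
        // Fused path: one matmul, then split the result into Q/K/V views.
        std::printf("mul_mat(%s) -> view q, k, v\n", l.wqkv.name);
    } else {
        // Separate path: three matmuls, one per projection.
        std::printf("mul_mat(%s), mul_mat(%s), mul_mat(%s)\n",
            l.wq.name, l.wk.name, l.wv.name);
    }
}

int main() {
    const vision_layer omni = {
        { "attn_qkv.weight", false },
        { "attn_q.weight",   true  },
        { "attn_k.weight",   true  },
        { "attn_v.weight",   true  },
    };
    build_attn(omni);
}
```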
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>