Georgi Gerganov
cc45f2ada6
models : deduplicate delta-net graphs for Qwen family ( #19597 )
...
* models : add llm_build_delta_net_base
* cont : keep qwen35 and qwen35moe graphs intact
* cont : add comments
2026-02-16 14:35:04 +02:00
Daniel Bevenius
9da3dcd753
llama : clarify nemotron-h.cpp comment about RoPE [no ci] ( #18997 )
...
This commit removes the mention of RoPE in the comment for the Q and K
computation as RoPE is not applied.
2026-01-21 18:31:34 +01:00
Daniel Bevenius
2995341730
llama : add support for NVIDIA Nemotron 3 Nano ( #18058 )
...
* llama : add support for NVIDIA Nemotron Nano 3
This commit adds support for the NVIDIA Nemotron Nano 3 model, enabling
the conversion and running of this model.
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-12-16 07:19:26 +01:00
Piotr Wilkin (ilintar)
bea04522ff
refactor : llama-model.cpp ( #16252 )
...
* Sqashed: llama-model.cpp refactoring
* Fix formatting of attn / ffn / ffn_moe calls
* Fix import regression / unify spacing in models.h
* totally DID NOT miss those!
* Add missing qwen3vl(moe) models
* Add missing new .cpp files to build
* Remove extra semicolons
* Editor checker
* Update src/models/models.h
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-10-31 23:40:23 +01:00