| File | Last commit | Date |
|---|---|---|
| CMakeLists.txt | common : reorganize includes to prioritize vendored deps (#18222) | 2025-12-20 21:43:21 -06:00 |
| arg.cpp | server: add auto-sleep after N seconds of idle (#18228) | 2025-12-21 02:24:42 +01:00 |
| arg.h | server: support load model on startup, support preset-only options (#18206) | 2025-12-20 09:25:27 +01:00 |
| base64.hpp | llava : expose as a shared library for downstream projects (#3613) | 2023-11-07 00:36:23 +03:00 |
| build-info.cpp.in | cmake: Add ability to pass in LLAMA_BUILD_NUMBER/COMMIT (#14167) | 2025-06-13 10:38:52 +02:00 |
| chat-parser-xml-toolcall.cpp | Fix Kimi-K2 tool-call parsing issues (#17376) | 2025-12-08 14:32:04 +01:00 |
| chat-parser-xml-toolcall.h | Fix Kimi-K2 tool-call parsing issues (#17376) | 2025-12-08 14:32:04 +01:00 |
| chat-parser.cpp | Fix Kimi-K2 tool-call parsing issues (#17376) | 2025-12-08 14:32:04 +01:00 |
| chat-parser.h | common : Generalized XML-style tool-call parsing with streaming support (GLM 4.5/4.6 + MiniMax M2 + SeedOSS + Kimi-K2 + Qwen3-Coder + Apriel-1.5 + Xiaomi-MiMo) (#16932) | 2025-11-18 18:54:15 +01:00 |
| chat-peg-parser.cpp | common : add nemotron 3 parsing (#18077) | 2025-12-16 04:05:23 -06:00 |
| chat-peg-parser.h | common : introduce composable PEG parser combinators for chat parsing (#17136) | 2025-12-03 12:45:32 +02:00 |
| chat.cpp | common : add nemotron 3 parsing (#18077) | 2025-12-16 04:05:23 -06:00 |
| chat.h | chat : reserve memory in compute_diffs and improve naming (#17729) | 2025-12-03 17:22:10 +02:00 |
| common.cpp | common: clarify instructions for bug reports (#18134) | 2025-12-17 18:44:13 +01:00 |
| common.h | server: add auto-sleep after N seconds of idle (#18228) | 2025-12-21 02:24:42 +01:00 |
| console.cpp | cli: new CLI experience (#17824) | 2025-12-10 15:28:59 +01:00 |
| console.h | cli: new CLI experience (#17824) | 2025-12-10 15:28:59 +01:00 |
| download.cpp | common : add minimalist multi-thread progress bar (#17602) | 2025-12-12 12:44:35 +01:00 |
| download.h | server: introduce API for serving / loading / unloading multiple models (#17470) | 2025-12-01 19:41:04 +01:00 |
| http.h | common: introduce http.h for httplib-based client (#16373) | 2025-10-01 20:22:18 +03:00 |
| json-partial.cpp | common : Generalized XML-style tool-call parsing with streaming support (GLM 4.5/4.6 + MiniMax M2 + SeedOSS + Kimi-K2 + Qwen3-Coder + Apriel-1.5 + Xiaomi-MiMo) (#16932) | 2025-11-18 18:54:15 +01:00 |
| json-partial.h | sync : vendor (#13901) | 2025-05-30 16:25:45 +03:00 |
| json-schema-to-grammar.cpp | common : add nemotron 3 parsing (#18077) | 2025-12-16 04:05:23 -06:00 |
| json-schema-to-grammar.h | common : add nemotron 3 parsing (#18077) | 2025-12-16 04:05:23 -06:00 |
| llguidance.cpp | llguidance : set tokenizer slices to default (#13424) | 2025-05-10 17:19:52 +02:00 |
| log.cpp | cli: new CLI experience (#17824) | 2025-12-10 15:28:59 +01:00 |
| log.h | cli: new CLI experience (#17824) | 2025-12-10 15:28:59 +01:00 |
| ngram-cache.cpp | ggml : portability fixes for VS 2017 (#12150) | 2025-03-04 18:53:26 +02:00 |
| ngram-cache.h | llama : use LLAMA_TOKEN_NULL (#11062) | 2025-01-06 10:52:15 +02:00 |
| peg-parser.cpp | common : add nemotron 3 parsing (#18077) | 2025-12-16 04:05:23 -06:00 |
| peg-parser.h | common : introduce composable PEG parser combinators for chat parsing (#17136) | 2025-12-03 12:45:32 +02:00 |
| preset.cpp | server: support load model on startup, support preset-only options (#18206) | 2025-12-20 09:25:27 +01:00 |
| preset.h | presets: refactor, allow cascade presets from different sources, add global section (#18169) | 2025-12-19 12:08:20 +01:00 |
| regex-partial.cpp | `common`: add partial regex support (#12808) | 2025-05-14 19:50:57 +01:00 |
| regex-partial.h | `common`: add partial regex support (#12808) | 2025-05-14 19:50:57 +01:00 |
| sampling.cpp | common : restore grammar-based rejection sampling (#18137) | 2025-12-17 19:46:00 +02:00 |
| sampling.h | common : restore grammar-based rejection sampling (#18137) | 2025-12-17 19:46:00 +02:00 |
| speculative.cpp | common : restore grammar-based rejection sampling (#18137) | 2025-12-17 19:46:00 +02:00 |
| speculative.h | server : implement universal assisted decoding (#12635) | 2025-07-31 14:25:23 +02:00 |
| unicode.cpp | common : introduce composable PEG parser combinators for chat parsing (#17136) | 2025-12-03 12:45:32 +02:00 |
| unicode.h | common : introduce composable PEG parser combinators for chat parsing (#17136) | 2025-12-03 12:45:32 +02:00 |