llama.cpp/common
Latest commit 196f5083ef by Georgi Gerganov:
common : more accurate sampling timing (#17382)
* common : more accurate sampling timing

* eval-callback : minor fixes

* cont : add time_meas impl

* cont : fix log msg [no ci]

* cont : fix multiple definitions of time_meas

* llama-cli : exclude chat template init from time measurement

* cont : print percentage of unaccounted time

* cont : do not reset timings
2025-11-20 13:40:10 +02:00
CMakeLists.txt
arg.cpp
arg.h
base64.hpp
build-info.cpp.in
chat-parser-xml-toolcall.cpp
chat-parser-xml-toolcall.h
chat-parser.cpp
chat-parser.h
chat.cpp - chat: fix int overflow, prevent size calculation in float/double (#17357), 2025-11-18 19:11:53 +01:00
chat.h
common.cpp - common : more accurate sampling timing (#17382), 2025-11-20 13:40:10 +02:00
common.h - common : more accurate sampling timing (#17382), 2025-11-20 13:40:10 +02:00
console.cpp
console.h
download.cpp
download.h
http.h
json-partial.cpp
json-partial.h
json-schema-to-grammar.cpp
json-schema-to-grammar.h
llguidance.cpp
log.cpp
log.h
ngram-cache.cpp
ngram-cache.h
regex-partial.cpp
regex-partial.h
sampling.cpp - common : more accurate sampling timing (#17382), 2025-11-20 13:40:10 +02:00
sampling.h
speculative.cpp
speculative.h