llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git

Radoslav Gerganov 15d2b46b4d rpc : cache and reuse compute graphs (#15405) Store the last computed graph and reuse it when possible. Also, do not return a response from GRAPH_COMPUTE and assume it always completes successfully. If this is not the case, the server closes the connection. This saves us a network round trip to the server.	2025-11-28 08:33:51 +00:00
CMakeLists.txt	ggml : add support for dynamic loading of backends (#10469)	2024-11-25 15:13:39 +01:00
ggml-rpc.cpp	rpc : cache and reuse compute graphs (#15405)	2025-11-28 08:33:51 +00:00
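
The commit message for #15405 describes remembering the last computed graph so that a repeated evaluation does not have to resend it. The idea can be sketched roughly as follows. This is a minimal illustration, not the code in ggml-rpc.cpp: the names `Cmd`, `Message`, `GraphCache`, and `count_full_sends` are hypothetical, and the graph is assumed to be available as a serialized byte buffer that can be compared for equality.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical command tags, for illustration only.
enum class Cmd { GraphCompute, GraphRecompute };

// A message the client would put on the wire (hypothetical layout).
// Following the commit message, the client is assumed not to wait for a reply.
struct Message {
    Cmd cmd;
    std::vector<std::uint8_t> payload;
};

// Client-side cache holding the last serialized graph that was sent.
class GraphCache {
public:
    // Decide what to send for this graph: the full graph on a miss,
    // or a small "recompute" command with no payload on a hit.
    Message next_message(const std::vector<std::uint8_t>& graph) {
        if (has_last_ && graph == last_) {
            return {Cmd::GraphRecompute, {}};
        }
        last_ = graph;      // remember the graph for the next call
        has_last_ = true;
        return {Cmd::GraphCompute, graph};
    }

private:
    bool has_last_ = false;
    std::vector<std::uint8_t> last_;
};

// Count how many evaluations in a sequence need the full graph sent.
inline std::size_t count_full_sends(
        const std::vector<std::vector<std::uint8_t>>& graphs) {
    GraphCache cache;
    std::size_t full = 0;
    for (const auto& g : graphs) {
        if (cache.next_message(g).cmd == Cmd::GraphCompute) {
            ++full;
        }
    }
    return full;
}

// Small self-check: the same graph twice, then a different one.
inline void self_check() {
    const std::vector<std::vector<std::uint8_t>> graphs = {
        {1, 2, 3}, {1, 2, 3}, {1, 2, 4}};
    assert(count_full_sends(graphs) == 2);
}
```

In this sketch only a graph that differs from the previous one is sent in full. Because no reply is awaited, a failure on the server side would surface as a closed connection on a later request, which matches the behaviour the commit message describes.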