llama.cpp/vendor/cpp-httplib
Radoslav Gerganov 3ace5fa277 server : add special handling for /health in httplib
When the number of parallel requests to llama-server exceeds the number
of HTTP threads, llama-server stops responding to /health, which is very
disruptive in k8s deployments because it causes restarts of properly
working inference endpoints.

Unfortunately, there is no way to fix this outside of httplib and this
patch adds a rather ugly hack for handling GET /health requests before
dispatching them to the thread pool.
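
The idea can be illustrated with a minimal sketch. This is not the code from the patch: the helper names (is_health_probe, health_response, route_connection), the Route enum, and the response body are all hypothetical, and the sketch assumes the first bytes of the request line are already available before the connection is handed to a worker.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper: true when the start of a request looks like a
// GET /health request line. The character after the path must be a
// space or '?', so "/healthz" and similar paths do not match.
inline bool is_health_probe(std::string_view head) {
    constexpr std::string_view prefix = "GET /health";
    if (head.size() <= prefix.size()) return false;        // too short to decide
    if (head.substr(0, prefix.size()) != prefix) return false;
    const char next = head[prefix.size()];
    return next == ' ' || next == '?';
}

// Hypothetical canned reply written directly on the accepting thread.
// The JSON body is an assumption for illustration only.
inline std::string health_response() {
    const std::string body = R"({"status":"ok"})";
    return "HTTP/1.1 200 OK\r\n"
           "Content-Type: application/json\r\n"
           "Content-Length: " + std::to_string(body.size()) + "\r\n"
           "Connection: close\r\n"
           "\r\n" + body;
}

// Where a connection should be handled.
enum class Route { Inline, ThreadPool };

// Health probes bypass the (possibly saturated) thread pool; everything
// else is queued for a worker as usual.
inline Route route_connection(std::string_view head) {
    return is_health_probe(head) ? Route::Inline : Route::ThreadPool;
}
```

With this shape, a probe is answered even when every worker is busy with inference requests, at the cost of doing a small amount of parsing and I/O on the accepting thread.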

No changes are made in the HTTPS implementation.

closes: #20684
2026-03-20 15:44:06 +02:00
CMakeLists.txt vendor : update cpp-httplib to 0.35.0 (#19969) 2026-02-28 13:53:56 +01:00
LICENSE common : add --license to display embedded licenses (#18696) 2026-01-10 09:46:24 +01:00
httplib.cpp server : add special handling for /health in httplib 2026-03-20 15:44:06 +02:00
httplib.h server : add special handling for /health in httplib 2026-03-20 15:44:06 +02:00