llama.cpp

hanishkvc e52a7aa304 SimpleSallap:SimpleProxy: MultiThreading		2025-12-05 01:53:12 +05:30

The default HTTPServer handles only one connection, and in turn one request, at any given time. If a client opens a connection and then does nothing with it, it blocks other clients, whose requests sit in the network queue for a long time. To overcome this, switch to ThreadingHTTPServer, which starts a new thread for each request.

Previously the SSL wrapping was applied to the main server socket. Even after switching to ThreadingHTTPServer, the SSL/TLS handshake would therefore still occur in the main thread, before a child thread is started for parallel request handling, so the handshake phase would block other client requests.

So avoid wrapping the main server socket in SSL. Instead, wait until ThreadingHTTPServer has accepted the client connection and started the new thread for it, and only then wrap that connection in SSL. This ensures that the SSL handshake occurs in the child (client-request-related) thread, so a rogue entity that opens an HTTP connection and never performs the SSL handshake won't block other clients. In turn, the rfile and wfile instances within the proxy handler need to be remapped to the new SSL-wrapped socket.
..
batched-bench	batched-bench : add "separate text gen" mode (#17103)	2025-11-10 12:59:29 +02:00
cvector-generator	cmake : Do not install tools on iOS targets (#15903)	2025-09-16 09:54:44 +07:00
export-lora	cmake : Do not install tools on iOS targets (#15903)	2025-09-16 09:54:44 +07:00
gguf-split	ci : use smaller model (#16168)	2025-09-22 09:11:39 +03:00
imatrix	Manually link -lbsd to resolve flock symbol on AIX (#16610)	2025-10-23 19:37:31 +08:00
llama-bench	bench : cache the llama_context state at computed depth (#16944)	2025-11-07 21:23:11 +02:00
main	cli: add migration warning (#17620)	2025-11-30 15:32:43 +01:00
mtmd	mtmd: fix --no-warmup (#17695)	2025-12-02 22:48:08 +01:00
perplexity	perplexity : show more kl-divergence data (#16321)	2025-09-29 09:30:45 +03:00
quantize	ci : use smaller model (#16168)	2025-09-22 09:11:39 +03:00
rpc	Install rpc-server when GGML_RPC is ON. (#17149)	2025-11-11 10:53:59 +00:00
run	Manually link -lbsd to resolve flock symbol on AIX (#16610)	2025-10-23 19:37:31 +08:00
server	SimpleSallap:SimpleProxy: MultiThreading	2025-12-05 01:53:12 +05:30
tokenize	cmake : Do not install tools on iOS targets (#15903)	2025-09-16 09:54:44 +07:00
tts	model : Apertus model implementation (#15852)	2025-10-02 20:43:22 +03:00
CMakeLists.txt	mtmd : rename llava directory to mtmd (#13311)	2025-05-05 16:02:55 +02:00
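
The approach described in commit e52a7aa304 (defer the TLS handshake until ThreadingHTTPServer has handed the accepted connection to its own thread) might look roughly like the sketch below. This is not the code from the commit: the names DeferredTLSHandler, ssl_context, handshake_timeout and make_server are hypothetical. The commit message says rfile and wfile are remapped to the wrapped socket; this sketch takes a slightly different route and wraps the socket in setup(), before the base class creates rfile and wfile, so that they are bound to the wrapped socket from the start.

```python
import ssl
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class DeferredTLSHandler(BaseHTTPRequestHandler):
    # Hypothetical attributes for this sketch, not names from the commit.
    ssl_context = None       # set to an ssl.SSLContext to enable TLS
    handshake_timeout = 10   # seconds a client may take over the handshake

    def setup(self):
        # setup() runs in the per-connection thread started by
        # ThreadingHTTPServer, so a stalled handshake blocks only this thread.
        if self.ssl_context is not None:
            # The timeout carries over to the wrapped socket.
            self.request.settimeout(self.handshake_timeout)
            self.request = self.ssl_context.wrap_socket(
                self.request, server_side=True)
        # StreamRequestHandler.setup() creates rfile/wfile from self.request,
        # so they end up bound to the TLS-wrapped socket.
        super().setup()

    def finish(self):
        try:
            super().finish()
        finally:
            if self.ssl_context is not None:
                # The server only knows the original socket, so close the
                # wrapped one explicitly.
                self.request.close()

    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(host="127.0.0.1", port=0, certfile=None, keyfile=None):
    """Build the server; TLS is enabled only when a certificate is given."""
    handler = DeferredTLSHandler
    if certfile is not None:
        context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
        context.load_cert_chain(certfile, keyfile)
        # The listening socket itself is left unwrapped on purpose.
        handler = type("TLSHandler", (DeferredTLSHandler,),
                       {"ssl_context": context})
    return ThreadingHTTPServer((host, port), handler)
```

Calling make_server(port=8443, certfile="cert.pem", keyfile="key.pem").serve_forever() would serve HTTPS with the handshake done per thread (the port and file names are placeholders). Without a certificate the same server speaks plain HTTP, which is enough to check that an idle connection no longer holds up other clients.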