llama.cpp

History

Kevin Pouget ffaafde16f ggml-virtgpu: improve the reliability of the code (#19846)

* ggml-virtgpu-backend: validate the consistency of the received objects

  This patch adds consistency checks in the ggml-virtgpu-backend (running on the host side) to ensure that the data received from the guest is consistent (valid pointers, valid sizes and offsets).

* ggml-virtgpu-backend: add fallback/skips for optional ggml backend methods

  ```
  1. bck->iface.synchronize(bck)
  2. buft->iface.get_alloc_size(buft, op)
  3. buft->iface.get_max_size(buft)
  ```

  These three methods are optional in the GGML interface. `get_max_size` was already properly defaulted, but the backend `synchronize` and the buft `get_alloc_size` would have segfaulted the backend if not implemented.

* ggml-virtgpu-backend: fix log format missing argument
* ggml-virtgpu-backend: improve the abort message
* ggml-virtgpu-backend: more safety checks
* ggml-virtgpu-backend: new error code
* ggml-virtgpu-backend: initialize all the error codes
* ggml-virtgpu: add a missing comment generated by the code generator
* ggml-virtgpu: add the '[virtgpu]' prefix to the device/buffer names
* ggml-virtgpu: apir_device_buffer_from_ptr: improve the error message
* ggml-virtgpu: shared: make it match the latest api_remoting.h of Virglrenderer APIR (still unmerged)
* ggml-virtgpu: update the code generator to have dispatch_command_name in a host/guest shared file
* ggml-virtgpu: REMOTE_CALL: fail if the backend returns an error
* docs/backend/VirtGPU.md: indicate that the RAM+VRAM size is limited to 64 GB with libkrun
* ggml-virtgpu: turn off clang-format header ordering for some of the files

  Compilation breaks when ordered alphabetically.

* ggml-virtgpu: clang-format
* ggml-virtgpu/backend/shared/api_remoting: better comments for the APIR return codes

2026-02-26 20:00:57 +08:00
android	android: fix missing screenshots for Android.md (#18156)	2025-12-19 09:32:04 +02:00
backend	ggml-virtgpu: improve the reliability of the code (#19846)	2026-02-26 20:00:57 +08:00
development	docs : fix links in parsing.md (#18245)	2025-12-21 09:35:40 +01:00
multimodal	docs : Minor cleanups (#19252)	2026-02-02 08:38:55 +02:00
ops	ggml-webgpu: Add unary op (SQR, SQRT, SIN, COS) support. (#19700)	2026-02-19 09:18:30 -07:00
android.md	android: fix missing screenshots for Android.md (#18156)	2025-12-19 09:32:04 +02:00
build-riscv64-spacemit.md	refactor : remove libcurl, use OpenSSL when available (#18828)	2026-01-14 18:02:47 +01:00
build-s390x.md	docs: update s390x build docs (#19643)	2026-02-16 00:33:34 +08:00
build.md	cuda : revert CUDA_SCALE_LAUNCH_QUEUES override until investigated (#19227)	2026-02-03 08:41:02 +02:00
docker.md	CLI: fixed adding cli and completion into docker containers, improved docs (#18003)	2025-12-16 11:52:23 +01:00
function-calling.md	common : implement new jinja template engine (#18462)	2026-01-16 11:22:06 +01:00
install.md	docs : add "Quick start" section for new users (#13862)	2025-06-03 13:09:36 +02:00
llguidance.md	llguidance build fixes for Windows (#11664)	2025-02-14 12:46:08 -08:00
multimodal.md	mtmd : add support for Voxtral (#14862)	2025-07-28 15:01:48 +02:00
ops.md	ggml-webgpu: Add unary op (SQR, SQRT, SIN, COS) support. (#19700)	2026-02-19 09:18:30 -07:00
preset.md	preset: allow named remote preset (#18728)	2026-01-10 15:12:29 +01:00
speculative.md	spec : remove check rate (#19377)	2026-02-09 15:30:50 +02:00