llama.cpp

Doctor Shotgun 9a5724dee2 ggml: add env var GGML_OP_OFFLOAD_MIN_BATCH (#18535)	2026-01-08 11:03:21 +02:00

* ggml: add env var GGML_OP_OFFLOAD_MIN_BATCH
* makes the min_batch_size for triggering op offload configurable via env var, defaulting to the prior hardcoded value of 32
* ggml: read GGML_OP_OFFLOAD_MIN_BATCH once and store to dev ctx
* cann: forward declaration of device context struct
* cann: move offload op check after device context declaration
* cuda: fix whitespace

Co-authored-by: Aman Gupta <amangupta052@gmail.com>
CMakeLists.txt	CANN: add support for ACL Graph (#15065)	2025-08-06 14:12:42 +08:00
acl_tensor.cpp	CANN: Use smart pointers to manage ACL objects (#17238)	2025-11-17 08:43:59 +08:00
acl_tensor.h	CANN: Use smart pointers to manage ACL objects (#17238)	2025-11-17 08:43:59 +08:00
aclnn_ops.cpp	CANN: Fix rename for get_env (#18652)	2026-01-07 16:11:31 +08:00
aclnn_ops.h	CANN: add operator fusion support for ADD + RMS_NORM (#17512)	2026-01-05 15:38:18 +08:00
common.h	CANN: Fix rename for get_env (#18652)	2026-01-07 16:11:31 +08:00
ggml-cann.cpp	ggml: add env var GGML_OP_OFFLOAD_MIN_BATCH (#18535)	2026-01-08 11:03:21 +02:00
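
The commit message for #18535 describes reading GGML_OP_OFFLOAD_MIN_BATCH once, storing it in the device context, and falling back to the previous hardcoded value of 32. A minimal sketch of that pattern might look like the following. This is not the code from ggml-cann.cpp: the names `parse_min_batch`, `device_context_sketch`, `make_device_context_from_env`, and `should_offload` are hypothetical, and both the validation rules and the `>=` comparison are assumptions made for illustration.

```cpp
#include <cerrno>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper (not from llama.cpp): parse the env var text,
// falling back to the default of 32 named in the commit message.
static int64_t parse_min_batch(const char * value, int64_t fallback = 32) {
    if (value == nullptr || *value == '\0') {
        return fallback;  // variable unset or empty
    }
    char * end = nullptr;
    errno = 0;
    const long long parsed = std::strtoll(value, &end, 10);
    // Assumption: reject trailing garbage, overflow, and negative values.
    if (end == value || *end != '\0' || errno == ERANGE || parsed < 0) {
        return fallback;
    }
    return static_cast<int64_t>(parsed);
}

// Hypothetical stand-in for a backend device context.
struct device_context_sketch {
    int64_t op_offload_min_batch_size;
};

// Read the environment once, at context creation, so later offload
// checks only compare against the stored value.
static device_context_sketch make_device_context_from_env() {
    return device_context_sketch{
        parse_min_batch(std::getenv("GGML_OP_OFFLOAD_MIN_BATCH"))
    };
}

// Assumed check: offload an op only when its batch size reaches the threshold.
static bool should_offload(const device_context_sketch & ctx, int64_t batch_size) {
    return batch_size >= ctx.op_offload_min_batch_size;
}
```

Under these assumptions, launching a process with `GGML_OP_OFFLOAD_MIN_BATCH=128` in its environment would raise the threshold from 32 to 128 for every context created by `make_device_context_from_env()`. Caching the value in the context avoids calling `getenv` on every offload check.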