llama.cpp/ggml/src/ggml-cann
Doctor Shotgun 9a5724dee2
ggml: add env var GGML_OP_OFFLOAD_MIN_BATCH (#18535)
* ggml: add env var GGML_OP_OFFLOAD_MIN_BATCH
* makes the min_batch_size for triggering op offload configurable via env var, defaulting to the prior hardcoded value of 32

* ggml: read GGML_OP_OFFLOAD_MIN_BATCH once and store to dev ctx

* cann: forward declaration of device context struct

* cann: move offload op check after device context declaration

* cuda: fix whitespace

Co-authored-by: Aman Gupta <amangupta052@gmail.com>

2026-01-08 11:03:21 +02:00
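
The commit message above describes two things: the offload threshold is now read from the GGML_OP_OFFLOAD_MIN_BATCH environment variable (defaulting to the earlier hardcoded 32), and the value is read once and stored in the device context. What follows is a minimal sketch of that pattern, not the code from the commit. The names `parse_min_batch`, `example_device_context`, `make_device_context`, and `should_offload` are hypothetical, and both the `>=` comparison and the handling of invalid values are assumptions made for illustration.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>

// Prior hardcoded threshold, per the commit message.
static const int DEFAULT_OP_OFFLOAD_MIN_BATCH = 32;

// Hypothetical helper: parse the env var value as a non-negative integer.
// Falls back to the default when the value is unset, empty, or malformed
// (this fallback policy is an assumption, not taken from the commit).
static int parse_min_batch(const char * value, int fallback) {
    if (value == nullptr || *value == '\0') {
        return fallback;
    }
    char * end = nullptr;
    errno = 0;
    const long parsed = std::strtol(value, &end, 10);
    if (errno != 0 || *end != '\0' || parsed < 0 || parsed > INT_MAX) {
        return fallback;
    }
    return (int) parsed;
}

// Hypothetical stand-in for a backend device context.
struct example_device_context {
    int op_offload_min_batch;
};

// Read the environment variable once, when the context is created,
// so the per-op check does not call getenv repeatedly.
static example_device_context make_device_context() {
    example_device_context ctx;
    ctx.op_offload_min_batch = parse_min_batch(
        std::getenv("GGML_OP_OFFLOAD_MIN_BATCH"), DEFAULT_OP_OFFLOAD_MIN_BATCH);
    return ctx;
}

// Assumed check: offload only when the batch is at least the threshold.
static bool should_offload(const example_device_context & ctx, int batch_size) {
    return batch_size >= ctx.op_offload_min_batch;
}
```

Under these assumptions, launching with `GGML_OP_OFFLOAD_MIN_BATCH=64` set in the environment would raise the threshold to 64, while leaving it unset keeps the previous behavior of 32.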
CMakeLists.txt CANN: add support for ACL Graph (#15065) 2025-08-06 14:12:42 +08:00
acl_tensor.cpp CANN: Use smart pointers to manage ACL objects (#17238) 2025-11-17 08:43:59 +08:00
acl_tensor.h CANN: Use smart pointers to manage ACL objects (#17238) 2025-11-17 08:43:59 +08:00
aclnn_ops.cpp CANN: Fix rename for get_env (#18652) 2026-01-07 16:11:31 +08:00
aclnn_ops.h CANN: add operator fusion support for ADD + RMS_NORM (#17512) 2026-01-05 15:38:18 +08:00
common.h CANN: Fix rename for get_env (#18652) 2026-01-07 16:11:31 +08:00
ggml-cann.cpp ggml: add env var GGML_OP_OFFLOAD_MIN_BATCH (#18535) 2026-01-08 11:03:21 +02:00