* ggml: add env var GGML_OP_OFFLOAD_MIN_BATCH
* makes the min_batch_size for triggering op offload configurable via env var, defaulting to the prior hardcoded value of 32
* ggml: read GGML_OP_OFFLOAD_MIN_BATCH once and store to dev ctx
* cann: forward declaration of device context struct
* cann: move offload op check after device context declaration
* cuda: fix whitespace

Co-authored-by: Aman Gupta <amangupta052@gmail.com>

| | | |
|---|---|---|
| CMakeLists.txt | ||
| acl_tensor.cpp | ||
| acl_tensor.h | ||
| aclnn_ops.cpp | ||
| aclnn_ops.h | ||
| common.h | ||
| ggml-cann.cpp | ||
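
The commit message above describes reading `GGML_OP_OFFLOAD_MIN_BATCH` once, storing the value in the device context, and falling back to the previous hardcoded threshold of 32. The following is a minimal sketch of that pattern, not the code from `ggml-cann.cpp`. The struct `device_context_sketch`, its field, and the helper functions are hypothetical names chosen for illustration, and the decision to fall back to the default on unparsable or negative input is an assumption rather than confirmed behavior.

```cpp
#include <climits>
#include <cstdlib>

// Hypothetical stand-in for a backend device context; the real struct in the
// backend sources has different fields and a different name.
struct device_context_sketch {
    int op_offload_min_batch_size;
};

// Parse GGML_OP_OFFLOAD_MIN_BATCH from the environment.
// Assumption: unset, empty, non-numeric, negative, or too-large values fall
// back to the default of 32 mentioned in the commit message.
static int read_op_offload_min_batch(int fallback = 32) {
    const char * value = std::getenv("GGML_OP_OFFLOAD_MIN_BATCH");
    if (value == nullptr || *value == '\0') {
        return fallback;
    }
    char * end = nullptr;
    long parsed = std::strtol(value, &end, 10);
    if (*end != '\0' || parsed < 0 || parsed > INT_MAX) {
        return fallback;
    }
    return static_cast<int>(parsed);
}

// Read the environment variable once, when the context is created, so the
// per-op check below never calls getenv.
static device_context_sketch make_device_context() {
    return device_context_sketch{ read_op_offload_min_batch() };
}

// Offload only when the batch is at least as large as the stored threshold.
static bool should_offload(const device_context_sketch & ctx, int batch_size) {
    return batch_size >= ctx.op_offload_min_batch_size;
}
```

Under these assumptions, running with `GGML_OP_OFFLOAD_MIN_BATCH=8` in the environment would lower the threshold to 8 for every context created afterwards, while leaving the variable unset keeps the earlier behavior.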