llama.cpp/ggml/src/ggml-cann
Georgi Gerganov a70c8a0c4b
kv-cache : use ggml_set_rows (#14285)
* kv-cache : use ggml_set_rows

ggml-ci

* graph : separate k and v indices

ggml-ci

* cont : remove redundant ifs

ggml-ci

* kv-cache : improve find_slot impl

* kv-cache : bounds-check when accessing slot_info indices

* kv-cache : add comments

ggml-ci

* ggml : add TODOs for adding GGML_OP_SET_ROWS support in the backends

ggml-ci
2025-07-03 10:53:35 +03:00
File            Last commit                                                         Date
CMakeLists.txt  CANN: Add SOC TYPE printing in cmake configuration (#13837)         2025-05-28 11:54:20 +08:00
Doxyfile        CANN: Add the basic supports of Flash Attention kernel (#13627)     2025-05-26 10:20:18 +08:00
acl_tensor.cpp  CANN: Add the basic supports of Flash Attention kernel (#13627)     2025-05-26 10:20:18 +08:00
acl_tensor.h    CANN: Add the basic supports of Flash Attention kernel (#13627)     2025-05-26 10:20:18 +08:00
aclnn_ops.cpp   CANN: update aclnnGroupedMatmulV2 to aclnnGroupedMatmulV3 (#14411)  2025-07-01 16:47:30 +08:00
aclnn_ops.h     CANN: Add the basic supports of Flash Attention kernel (#13627)     2025-05-26 10:20:18 +08:00
common.h        fix async_mode bug (#14432)                                         2025-06-28 17:35:41 +08:00
ggml-cann.cpp   kv-cache : use ggml_set_rows (#14285)                               2025-07-03 10:53:35 +03:00