llama.cpp/ggml
hongruichen e97d3a6c48 fix tensor buffer allocation
add log

commit qnn buffer after change

add log

call register_rpc_mem 2 times

update input tensors before graph finalize

default to QNN_TENSORMEMTYPE_RAW

set new tensors at execute

move write input tensors to exec

check if mem is registered before actually registering

register rpc mem once allocated
2024-07-10 19:32:39 +08:00
cmake llama : reorganize source code + improve CMake (#8006) 2024-06-26 18:33:02 +03:00
include add clang format file and reformatting 2024-07-04 23:29:31 +08:00
src fix tensor buffer allocation 2024-07-10 19:32:39 +08:00
CMakeLists.txt ggml : add GGML_CUDA_USE_GRAPHS option, restore GGML_CUDA_FORCE_CUBLAS (cmake) (#8140) 2024-06-26 21:34:14 +02:00
ggml_vk_generate_shaders.py llama : reorganize source code + improve CMake (#8006) 2024-06-26 18:33:02 +03:00