zhou.weiguo
|
5269e082aa
|
ggml-qnn: refine ggml inference using QNN NPU
|
2024-06-11 23:05:00 +08:00 |
zhou.weiguo
|
5f8cfe4a1e
|
ggml-qnn: refine source code of ggml-qnn.cpp to satisfy reviewers
|
2024-06-10 20:07:26 +08:00 |
zhou.weiguo
|
d38d4a67d1
|
npu: probe htp info and capacity of rpc ion memory
|
2024-06-09 23:49:54 +08:00 |
zhou.weiguo
|
3e8b61f970
|
review: fix a memory leak introduced by a review modification, as explained in https://github.com/zhouwg/llama.cpp/pull/1
|
2024-06-09 09:06:44 +08:00 |
zhou.weiguo
|
fdf0272dfb
|
review: format code using clang-format + manual modifications according to review comments
|
2024-06-08 17:56:32 +08:00 |
zhou.weiguo
|
5d691c6cd0
|
review: put qnn's internal log inside preprocessor directive
|
2024-06-08 09:22:39 +08:00 |
zhou.weiguo
|
94ee775058
|
review: remove static global vars to support simultaneous multi-instance use and thread safety
|
2024-06-07 14:56:07 +08:00 |
zhou.weiguo
|
2fab33d825
|
ggml-qnn: remove static global vars to support simultaneous multi-instance use
|
2024-06-07 12:51:04 +08:00 |
zhou.weiguo
|
f4c53037ab
|
review: remove unused QNN helper functions
|
2024-06-06 20:24:03 +08:00 |
zhou.weiguo
|
dd29834c11
|
add support for quantized data type Q8_0
|
2024-06-06 17:12:28 +08:00 |
zhou.weiguo
|
926a8661f3
|
review: replace external declaration with NDK header file
|
2024-06-05 21:10:59 +08:00 |
zhou.weiguo
|
d325088dbf
|
ggml: add Qualcomm QNN (Qualcomm Neural Network, aka Qualcomm AI Engine Direct) backend
|
2024-06-05 10:55:45 +08:00 |