Commit Graph

26 Commits

Author SHA1 Message Date
hongruichen 3fe07eb907 fix compiling error 2024-06-19 15:10:45 +08:00
hongruichen 3c491a3263 remove reference of g_qnn_mgr in qnn_instance 2024-06-19 14:45:43 +08:00
hongruichen 99320620b0 split logger function, tensors and backend from main qnn source 2024-06-19 14:39:16 +08:00
hongruichen dfe159ffff remove TODO 2024-06-19 11:16:12 +08:00
hongruichen aeef0c68f4 make the constant condition first 2024-06-19 10:29:53 +08:00
hongruichen 65a14d9e9a fix todo 2024-06-18 23:09:04 +08:00
hongruichen 9456bba121 rename 2024-06-17 18:44:19 +08:00
hongruichen 5fe7b87ba1 use ggml_qnn_tensor_writer for all parameters 2024-06-17 11:17:46 +08:00
hongruichen a5679ddd8e use ggml_qnn_tensor_reader for output tensor 2024-06-16 22:28:11 +08:00
hongruichen 36e41a1055 use tensor wrapper in matmul 2024-06-16 22:28:11 +08:00
hongruichen 37bb9263dd use tensor wrapper in add 2024-06-16 22:28:11 +08:00
hongruichen 6c68adc1d9 add ggml_qnn_tensor_binder 2024-06-16 22:28:10 +08:00
zhou.weiguo 5598fbd15d review: make a MVP(Minimum Viable PR) style PR in upstream 2024-06-13 15:41:53 +08:00
zhou.weiguo faaa86b7e4 ggml-qnn: refine ggml inference using QNN NPU 2024-06-12 16:30:50 +08:00
zhou.weiguo 5269e082aa ggml-qnn: refine ggml inference using QNN NPU 2024-06-11 23:05:00 +08:00
zhou.weiguo 5f8cfe4a1e ggml-qnn: refine source code of ggml-qnn.cpp to make reviewer more happy 2024-06-10 20:07:26 +08:00
zhou.weiguo d38d4a67d1 npu: probe htp info and capacity of rpc ion memory 2024-06-09 23:49:54 +08:00
zhou.weiguo 3e8b61f970 review: fix a memory leak introduced by review modification which explained in https://github.com/zhouwg/llama.cpp/pull/1 2024-06-09 09:06:44 +08:00
zhou.weiguo fdf0272dfb review: code format using clang-format + manually modification according to review comments 2024-06-08 17:56:32 +08:00
zhou.weiguo 5d691c6cd0 review: put qnn's internal log inside preprocessor diretive 2024-06-08 09:22:39 +08:00
zhou.weiguo 94ee775058 review: remove static global vars to support multi-instance simultaneously and thread safe 2024-06-07 14:56:07 +08:00
zhou.weiguo 2fab33d825 ggml-qnn: remove static global vars to support multi-instance simultaneously 2024-06-07 12:51:04 +08:00
zhou.weiguo f4c53037ab review: remove unused QNN helper functions 2024-06-06 20:24:03 +08:00
zhou.weiguo dd29834c11 add supportive of quantize data type Q8_0 2024-06-06 17:12:28 +08:00
zhou.weiguo 926a8661f3 review: replace external declaration with NDK header file 2024-06-05 21:10:59 +08:00
zhou.weiguo d325088dbf ggml: add Qualcomm QNN(Qualcomm Neural Network,aka Qualcomm AI Engine Direct) backend 2024-06-05 10:55:45 +08:00