Commit Graph

12 Commits

Author SHA1 Message Date
nullname e6dbdacc32
feat: fix llama-bench (#7)
* remove unused functions

* wip

* init from last devices

* move init into constructor

* wip

* add static assert to device table

* make kDeviceCaps constexpr (see the sketch after this commit)

* get free memory and total memory

* add optimization flag for qnn backend
2024-11-13 17:06:46 +08:00
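Below is a minimal sketch of the constexpr device-capability table with a static_assert described in this commit ("add static assert to device table", "make kDeviceCaps constexpr"). The name kDeviceCaps comes from the commit message; the backend enum, struct fields, and values are illustrative assumptions, not the repository's actual code.

```cpp
#include <array>
#include <cstdint>

// Hypothetical backend indices; the real backend list lives in the QNN backend sources.
enum backend_index { QNN_BACKEND_CPU = 0, QNN_BACKEND_GPU, QNN_BACKEND_NPU, QNN_BACKEND_COUNT };

struct device_caps {
    const char *name;            // human-readable backend name
    uint64_t    supported_types; // hypothetical bitmask of supported data types
};

// constexpr: the table is a compile-time constant, so it costs nothing at startup.
constexpr std::array<device_caps, QNN_BACKEND_COUNT> kDeviceCaps = {{
    {"qnn-cpu", 0b0001},
    {"qnn-gpu", 0b0011},
    {"qnn-npu", 0b0111},
}};

// The static_assert catches a table that falls out of sync with the backend enum.
static_assert(kDeviceCaps.size() == QNN_BACKEND_COUNT,
              "kDeviceCaps must have exactly one entry per backend");
```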
nullname 4abaf7d87e
feat: fix mulmat (#2)
* ggml_qnn_op_config now manages the construction of ggml_qnn_tensor

* wip

* add interface ggml_qnn_op_config

* add ggml_qnn_list_op_config

* add create_tensor and move tensor bind to execute

* wip

* rename: ggml_qnn_list_op_config -> ggml_qnn_matmul_op_config

* add tensor type to allow native tensors

* remove ggml_tensor param at ggml_qnn_tensor::create_tensor

* postpone the tensor id allocation to add_node

* add ggml_qnn_op_config_base

* trivial change to reduce the function's parameters

* split bind_tensors into bind_input_tensors and bind_output_tensors

* implement ggml_qnn_single_op_config::create_tensors

next will set the parameter of the transpose

* tensor: add bind buffer

* add parameter tensor type

* implement add_tensor_param

* set qnn_instance only at constructor

* set transpose tensor param

* move create_op_constructor into op-config module

* create QNN_OP_MAT_MUL from ggml_qnn_matmul_op_config

* try fix crash

* fix compiling error at older ndk (r23c)

* fix crash

* fix parameter tensor name

* update tensor dimension assignment and add TODO

* fix mat_mul graph creation

* fix MUL_MAT_256x16x10x1_256x1x10x1_16x1x10x1

* append type to graph cache key (see the cache-key sketch after this commit)

* wip

* fix supported op

* update comment

* disable op other than add and mat_mul

* add convert op to adapt to multiple input/output formats

* disable f16 for cpu backend according to official doc

https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/cpu_backend.html#supported-operations

* add supported data types flags in each backend

* remove unused functions

* append output type to graph key

* fix gpu backend by disabling ops with differing data types

* fix cpu backend supported ops

* fix duplicated tensor name

* append op name

* suppress warning

* remove unused code
2024-10-28 12:48:16 +08:00
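The "append type to graph cache key", "append output type to graph key", and "append op name" items above all serve the same purpose: two graphs that differ only in tensor types must not collide in the graph cache. A rough sketch of such a key follows, assuming hypothetical names (make_graph_key, tensor_type) rather than the repository's actual API.

```cpp
#include <string>
#include <vector>

// Placeholder for the backend's tensor/data types.
enum class tensor_type { f32, f16, q4_0 };

static const char *to_string(tensor_type t) {
    switch (t) {
        case tensor_type::f32:  return "f32";
        case tensor_type::f16:  return "f16";
        case tensor_type::q4_0: return "q4_0";
    }
    return "unknown";
}

// Encode op name, every input type, and the output type into the cache key so
// graphs built for different type combinations get distinct cache entries.
std::string make_graph_key(const std::string &op_name,
                           const std::vector<tensor_type> &inputs,
                           tensor_type output) {
    std::string key = op_name;
    for (tensor_type t : inputs) {
        key += '_';
        key += to_string(t);
    }
    key += "_out_";
    key += to_string(output);
    return key; // e.g. "MUL_MAT_f16_f32_out_f32"
}
```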
hongruichen 867c91bfaf feat: add error string for QnnOpPackage_Error_t 2024-07-27 13:24:57 +08:00
hongruichen 27299463ae fix: try to fix tensor type error 2024-07-20 15:13:10 +08:00
hongruichen 1679dcf47e fix: check all dimensions in `can offload` 2024-07-20 13:29:01 +08:00
hongruichen b1b5cc10b1 add function to convert qnn error into string (see the sketch at the end of this list) 2024-07-19 22:51:17 +08:00
hongruichen bb13795dce refactoring: remove unused functions and variables 2024-07-17 14:17:35 +08:00
hongruichen 63dc587dff refactoring: make the buffer alloc and free stay in same class 2024-07-17 14:08:31 +08:00
hongruichen 4b0f6b0cd6 add helper function to get Qnn_TensorType_t from ggml_tensor 2024-07-05 19:37:58 +08:00
hongruichen 0f2e68713c move tensor related function to utils 2024-07-05 19:02:38 +08:00
hongruichen 000240cf62 add clang-format file and reformatting 2024-07-04 23:29:31 +08:00
hongruichen 8b677d1b2f move qnn backend into sub folder 2024-07-02 19:42:14 +08:00
Renamed from ggml-qnn/utils.cpp
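Two of the commits above ("add error string for QnnOpPackage_Error_t", "add function to convert qnn error into string") add helpers that turn QNN error codes into readable strings for logging. Here is a small self-contained sketch of that pattern; the enum values below are placeholders, not the actual QNN SDK error codes.

```cpp
#include <cstdint>

// Placeholder error codes standing in for a QNN error enum.
enum class qnn_error : uint32_t {
    no_error            = 0,
    invalid_argument    = 1,
    unsupported_feature = 2,
};

// Returning string literals keeps the helper allocation-free, which is
// convenient when formatting log messages on the error path.
constexpr const char *qnn_error_to_string(qnn_error err) {
    switch (err) {
        case qnn_error::no_error:            return "no error";
        case qnn_error::invalid_argument:    return "invalid argument";
        case qnn_error::unsupported_feature: return "unsupported feature";
    }
    return "unknown error";
}
```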