* ggml_qnn_op_config now manages the construction of ggml_qnn_tensor
* wip
* add interface ggml_qnn_op_config (see the interface sketch after this list)
* add ggml_qnn_list_op_config
* add create_tensor and move tensor bind to execute
* wip
* rename: ggml_qnn_list_op_config -> ggml_qnn_matmul_op_config
* add tensor type to allow native tensor
* remove ggml_tensor param from ggml_qnn_tensor::create_tensor
* postpone the tensor id allocation to add_node
* add ggml_qnn_op_config_base
* trivial change to reduce function parameters
* split bind_tensors into bind_input_tensors and bind_output_tensors
* implement ggml_qnn_single_op_config::create_tensors; next: set the transpose parameter
* tensor: add bind buffer
* add parameter tensor type
* implement add_tensor_param
* set qnn_instance only in the constructor
* set transpose tensor param
* move create_op_constructor into op-config module
* create QNN_OP_MAT_MUL from ggml_qnn_matmul_op_config
* try to fix crash
* fix compile error on older NDK (r23c)
* fix crash
* fix parameter tensor name
* update tensor dimension assignment and add TODO
* fix mat_mul graph creation
* fix MUL_MAT_256x16x10x1_256x1x10x1_16x1x10x1
* append type to graph cache key (see the key sketch below)
* wip
* fix supported ops
* update comment
* disable ops other than add and mat_mul
* add convert op to adapt to multiple input/output formats
* disable f16 for the cpu backend, per the official doc: https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/cpu_backend.html#supported-operations
* add supported data type flags in each backend (see the flags sketch below)
* remove unused functions
* append output type to graph key
* fix gpu backend by disabling ops with mismatched data types
* fix cpu backend supported ops
* fix duplicated tensor names
* append op name
* suppress warning
* remove unused code
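
Several of the commits above (the ggml_qnn_op_config interface, ggml_qnn_op_config_base, the bind_tensors split, add_tensor_param, and fixing qnn_instance at construction) describe one op-config abstraction. The sketch below is a hypothetical reconstruction from those commit messages, not the actual source: the class and member names follow the commits, but the signatures and the qnn_graph_handle stand-in are assumptions.

```cpp
// Hypothetical reconstruction of the op-config interface described above.
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

struct ggml_tensor;               // from ggml
struct qnn_instance;              // wraps the QNN SDK handles (assumed)
using qnn_graph_handle = void *;  // stand-in for Qnn_GraphHandle_t

class ggml_qnn_tensor;            // owns one QNN tensor; created up front, bound late
using qnn_tensor_ptr = std::shared_ptr<ggml_qnn_tensor>;

// Interface each op config implements ("add interface ggml_qnn_op_config").
class ggml_qnn_op_config {
public:
    virtual ~ggml_qnn_op_config() = default;

    // Create the QNN tensors at graph-build time; tensor ids are allocated
    // later, when the node is added ("postpone the tensor id allocation to
    // add_node").
    virtual bool create_tensors(qnn_graph_handle graph,
                                const std::vector<ggml_tensor *> &inputs,
                                const std::vector<ggml_tensor *> &outputs) = 0;
    virtual bool add_node(qnn_graph_handle graph) = 0;

    // Binding is split so inputs and outputs can be (re)bound independently
    // at execute time ("split bind_tensors into bind_input_tensors and
    // bind_output_tensors", "move tensor bind to execute").
    virtual bool bind_input_tensors(const std::vector<ggml_tensor *> &inputs) = 0;
    virtual bool bind_output_tensors(const std::vector<ggml_tensor *> &outputs) = 0;
    virtual void unbind_tensors() = 0;
};

// Shared plumbing ("add ggml_qnn_op_config_base"); the qnn_instance is fixed
// at construction ("set qnn_instance only in the constructor").
class ggml_qnn_op_config_base : public ggml_qnn_op_config {
public:
    ggml_qnn_op_config_base(std::string name, std::shared_ptr<qnn_instance> qnn)
        : _name(std::move(name)), _qnn(std::move(qnn)) {}

    // Attach a static parameter tensor, e.g. the transpose permutation used by
    // mat_mul ("implement add_tensor_param", "set transpose tensor param").
    bool add_tensor_param(const std::string &name, const std::vector<uint32_t> &dims,
                          const uint8_t *data, size_t data_size);

protected:
    std::string _name;
    std::shared_ptr<qnn_instance> _qnn;
    std::vector<qnn_tensor_ptr> _tensor_inputs;
    std::vector<qnn_tensor_ptr> _tensor_outputs;
    std::vector<qnn_tensor_ptr> _tensor_params;
};
```

Splitting create/add-node from bind keeps the expensive graph construction cacheable while buffer binding stays per-execution.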
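Three commits ("append type to graph cache key", "append output type to graph key", "append op name") harden the graph cache key so that two graphs with identical shapes but different ops or data types never share a cached QNN graph. The helper below is a minimal sketch of such a key, assuming the repo's ggml headers are on the include path; the shape format follows the key quoted in the commits (MUL_MAT_256x16x10x1_...), but the exact format in the source may differ.

```cpp
#include <string>

#include "ggml.h"

// Build a cache key from op name, input shapes/types, and output type, so
// graphs differing only in dtype or op do not collide (hypothetical helper).
static std::string make_graph_key(const ggml_tensor *op) {
    std::string key = ggml_op_name(op->op);      // e.g. "MUL_MAT"
    for (int i = 0; i < GGML_MAX_SRC && op->src[i]; ++i) {
        const ggml_tensor *src = op->src[i];
        key += '_';
        for (int d = 0; d < GGML_MAX_DIMS; ++d) {
            if (d) key += 'x';
            key += std::to_string(src->ne[d]);   // shape, e.g. 256x16x10x1
        }
        key += '_';
        key += ggml_type_name(src->type);        // input type
    }
    key += '_';
    key += ggml_type_name(op->type);             // output type
    return key;
}
```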
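The per-backend capability commits ("add supported data type flags in each backend", "disable f16 for the cpu backend") suggest a small device-capability table. The following is an illustrative sketch only; the flag names, values, and device names are assumptions, not the actual source. What is grounded is the CPU restriction: Qualcomm's CPU backend documentation (linked above) lists no FP16 support, so only F32 is advertised there.

```cpp
#include <cstdint>

// Supported ggml data types per backend, kept as bit flags so unsupported
// ops can be vetoed quickly (hypothetical names and values).
enum qnn_data_type_flags : uint32_t {
    QNN_TYPE_F32 = 1u << 0,
    QNN_TYPE_F16 = 1u << 1,
};

struct qnn_device_caps {
    const char *name;
    uint32_t    supported_types;
};

// CPU backend: F32 only, per the Qualcomm doc cited in the commit list.
static constexpr qnn_device_caps k_device_caps[] = {
    { "qnn-cpu", QNN_TYPE_F32 },
    { "qnn-gpu", QNN_TYPE_F32 | QNN_TYPE_F16 },
    { "qnn-npu", QNN_TYPE_F32 | QNN_TYPE_F16 },
};
```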