Commit Graph

12 Commits

Author SHA1 Message Date
nullname a2df09b6af
[WIP] feat: perf opt (#10)
* reduce log

* wip

* add function to create concat nodes

* opt

* insert concat node before mulmat

* use resize op

* wip

* add bind_buffer and remove ggml prefix in tensor types

* use gather node instead

* fix tensor type; now succeeds on gpu and cpu, fails on npu

* add comment

* wip

* add comment

* wip

* in destructor, clear internal buffer before unbind

* disable gather for npu

* wip

* count swap memory as free memory

* wip

* fix supported_types

ggml_backend_device_i.supports_op will be invoked before ggml_backend_device_i.init_backend

* rename create_tensors -> initialize_op_nodes

* move ggml_qnn_op_config to separate file

* wip

* add create_convert_nodes

* add comment

* enable different type in/out for npu and cpu backend

* fix npu convert op

* enlarge max buffer size

* add more error code

* check tensor type before create convert node

* add log

* add log

* remove transpose0 and use built-in transpose flag

* rename transpose1 -> transpose_out

* disable convert for npu

* add more logs
2024-11-29 00:03:23 +08:00
nullname e6dbdacc32
feat: fix llama-bench (#7)
* remove unused functions

* wip

* init from last devices

* move init into constructor

* wip

* add static assert to device table

* make kDeviceCaps as constexpr

* get free memory and total memory

* add optimize flag for qnn backend
2024-11-13 17:06:46 +08:00
nullname 4abaf7d87e
feat: fix mulmat (#2)
* ggml_qnn_op_config now manages the construction of ggml_qnn_tensor

* wip

* add interface ggml_qnn_op_config

* add ggml_qnn_list_op_config

* add create_tensor and move tensor bind to execute

* wip

* rename: ggml_qnn_list_op_config -> ggml_qnn_matmul_op_config

* add tensortype to allow native tensor

* remove ggml_tensor param at ggml_qnn_tensor::create_tensor

* postpone the tensor id allocation to add_node

* add ggml_qnn_op_config_base

* trivial change to reduce the function's parameters

* split bind_tensors into bind_input_tensors and bind_output_tensors

* implement ggml_qnn_single_op_config::create_tensors

next will set the parameter of transpose

* tensor: add bind buffer

* add parameter tensor type

* implement add_tensor_param

* set qnn_instance only at constructor

* set transpose tensor param

* move create_op_constructor into op-config module

* create QNN_OP_MAT_MUL from ggml_qnn_matmul_op_config

* try fix crash

* fix compiling error at older ndk (r23c)

* fix crash

* fix parameter tensor name

* update tensor dimension assignment and add TODO

* fix mat_mul graph creating

* fix MUL_MAT_256x16x10x1_256x1x10x1_16x1x10x1

* append type to graph cache key

* wip

* fix supported op

* update comment

* disable op other than add and mat_mul

* add convert op to adapt multi input/output format

* disable f16 for cpu backend according to official doc

https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/cpu_backend.html#supported-operations

* add supported data types flags in each backend

* remove unused functions

* append output type to graph key

* fix gpu backend by disabling ops with different data types

* fix cpu backend support ops

* fix duplicated tensor name

* append op name

* suppress warning

* remove unused code
2024-10-28 12:48:16 +08:00
hongruichen 181cf52888 adapt new register backend interface and fix missing ops 2024-10-11 10:17:50 +08:00
hongruichen 3b47056c97 refactoring: change the tensor binding mode between qnn tensor and ggml tensor 2024-07-22 23:08:38 +08:00
hongruichen 0301b500cd refactoring: prevent leaking QNN_INTERFACE_VER_TYPE and QNN_SYSTEM_INTERFACE_VER_TYPE outside of qnn.hpp 2024-07-17 00:18:38 +08:00
hongruichen 100ccd5e7f add unary op template and more ops 2024-07-13 00:55:34 +08:00
Hongrui Chen 5f2e3918f6 refactoring ggml_qnn_tensor 2024-07-09 19:58:46 +08:00
hongruichen 13dc3a02c3 use qnn graph inside add and mul ops 2024-07-05 13:27:16 +08:00
hongruichen 4b2ee61f62 move graph map to backend object 2024-07-05 11:58:47 +08:00
hongruichen 000240cf62 add clang format file and reformatting 2024-07-04 23:29:31 +08:00
hongruichen 8b677d1b2f move qnn backend into sub folder 2024-07-02 19:42:14 +08:00
Renamed from ggml-qnn/backend.hpp