Commit Graph

84 Commits

Author SHA1 Message Date
nullname e36ad89528
bugfix: error pre-allocated tensor (k_cache_view-0) (#12)
* fix device binding at ggml_backend_qnn_buffer_type

* merge ggml_backend_qnn_buffer_context and qnn_mem_buffer

* wip

* add log

* wip

* add qnn_buffer_ptr

* remove trailing `\n` at log

* add log

* enable GGML_OP_NONE

* wip

* wip

* disable tensor with view

* wip

* wip

* more log for view tensor

* re-enable view

* wip

* remove link android lib

* set dimension at bind function

* move graph traversal to backend-ops

* wip

* add get_view_internal_dimension to obtain the tensor view source dimension

* use _view_source_dimensions to allocate qnn tensor

* add place holder function ggml_backend_qnn_cpy_tensor_async

* add ggml_qnn_aggregate_op_config

* make matmul based on ggml_qnn_aggregate_op_config

* wip

* manually specify the order of op destruction

* skip register qnn-cpu backend

* disable view op again

* remove _view_source_dimensions

* add nop for reshape and view ops

* add log

* add comment
2024-12-11 10:42:00 +08:00
hongruichen 0d02ee09ed fix int overflow and remove view op to pass unit test 2024-12-03 10:55:11 +08:00
hongruichen c5e6549331 fix: fix assertion 2024-11-29 23:38:06 +08:00
hongruichen 09efaa389e define compile flag as module private 2024-11-29 17:24:05 +08:00
hongruichen 6d4feae579 redo conflict changes 2024-11-29 17:14:01 +08:00
hongruichen 5103b166ba bugfix: block large tensor calc in npu 2024-11-29 14:19:34 +08:00
nullname a2df09b6af
[WIP] feat: perf opt (#10)
* reduce log

* wip

* add function to create concat nodes

* opt

* insert concat node before mulmat

* use resize op

* wip

* add bind_buffer and remove ggml prefix in tensor types

* use gather node instead

* fix tensor type; now succeeds on gpu and cpu, fails on npu

* add comment

* wip

* add comment

* wip

* in destructor, clear internal buffer before unbind

* disable gather for npu

* wip

* count swap memory as free memory

* wip

* fix supported_types

ggml_backend_device_i.supports_op will be invoked before ggml_backend_device_i.init_backend

* rename create_tensors -> initialize_op_nodes

* move ggml_qnn_op_config to separate file

* wip

* add create_convert_nodes

* add comment

* enable different type in/out for npu and cpu backend

* fix npu convert op

* enlarge max buffer size

* add more error code

* check tensor type before create convert node

* add log

* add log

* remove transpose0 and use built-in transpose flag

* rename transpose1 -> transpose_out

* disable convert for npu

* add more logs
2024-11-29 00:03:23 +08:00
nullname e6dbdacc32
feat: fix llama-bench (#7)
* remove unused functions

* wip

* init from last devices

* move init into constructor

* wip

* add static assert to device table

* make kDeviceCaps as constexpr

* get free memory and total memory

* add optimize flag for qnn backend
2024-11-13 17:06:46 +08:00
nullname 8ad86dc703
feat: add QNN_OP_TRANSPOSE (#6)
* redo: add convert nodes

This reverts commit 8448acd5ebf8fe86ab9d25313b64a15c811ef96e.

* align clang format with cann

* rename binary_op -> general_op

because some ops take only 1 param

* Revert "rename binary_op -> general_op"

This reverts commit 5be63b1a0dc4614457785367dade62158fe46214.

* wip

* add GGML_OP_PERMUTE

* add GGML_OP_VIEW and GGML_OP_GET_ROWS

* wip

* Revert "wip"

This reverts commit 772462ca6cfa01ea31bde725c2da60076ad9385f.
2024-11-04 23:12:03 +08:00
nullname fe565cfd9f
fix compiling error in release 2024-10-29 15:47:07 +08:00
hongruichen 5c1e6d4905 disable gelu in NPU 2024-10-29 00:54:08 +08:00
nullname 4abaf7d87e
feat: fix mulmat (#2)
* ggml_qnn_op_config now manages the construction of ggml_qnn_tensor

* wip

* add interface ggml_qnn_op_config

* add ggml_qnn_list_op_config

* add create_tensor and move tensor bind to execute

* wip

* rename: ggml_qnn_list_op_config -> ggml_qnn_matmul_op_config

* add tensortype to allow native tensor

* remove ggml_tensor param at ggml_qnn_tensor::create_tensor

* postpone the tensor id allocation to add_node

* add ggml_qnn_op_config_base

* trivial change to reduce the params of function

* split bind_tensors into bind_input_tensors and bind_output_tensors

* implement ggml_qnn_single_op_config::create_tensors

next will set the parameter of transpose

* tensor: add bind buffer

* add parameter tensor type

* implement add_tensor_param

* set qnn_instance only at constructor

* set transpose tensor param

* move create_op_constructor into op-config module

* create QNN_OP_MAT_MUL from ggml_qnn_matmul_op_config

* try fix crash

* fix compiling error at older ndk (r23c)

* fix crash

* fix parameter tensor name

* update tensor dimension assignment and add TODO

* fix mat_mul graph creating

* fix MUL_MAT_256x16x10x1_256x1x10x1_16x1x10x1

* append type to graph cache key

* wip

* fix supported op

* update comment

* disable op other than add and mat_mul

* add convert op to adapt multi input/output format

* disable f16 for cpu backend according to official doc

https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/cpu_backend.html#supported-operations

* add supported data types flags in each backend

* remove unused functions

* append output type to graph key

* fix gpu backend by disabling ops with different data types

* fix cpu backend support ops

* fix duplicated tensor name

* append op name

* suppress warning

* remove unused code
2024-10-28 12:48:16 +08:00
hongruichen 181cf52888 adapt new register backend interface and fix missing ops 2024-10-11 10:17:50 +08:00
hongruichen 1da8a3e678 fix compiling error after merge 2024-09-30 10:37:23 +08:00
Hongrui Chen a1ceaae4ad fix compiling error at older ndk (r23c) 2024-09-30 10:18:12 +08:00
hongruichen 481cb3a0c5 fix compiling error 2024-09-07 12:29:26 +08:00
みゃん dedadf2a20
Fixed a bug where debug code was included in the release, resulting i… (#1)
* Fixed a bug where debug code was included in the release, resulting in an undefined function error.

* Change the path of the QNN library when building in termux environment

* Revert "Change the path of the QNN library when building in termux environment"

This reverts commit c6e26a3679da2608940e2163e090adf75d667400.

* Changed so that GGML_QNN_DEFAULT_LIB_SEARCH_PATH can be set from command line arguments
2024-08-20 10:20:23 +08:00
hongruichen 47f6e02eda fix: try fix the tensor rank of mul mat 2024-07-31 23:54:07 +08:00
hongruichen 74eb05a13b feat: add ggml_qnn_op_config for handle different op 2024-07-31 20:22:37 +08:00
hongruichen 9a5f802bb6 refactoring: add convenient macro to disable copy and move of class 2024-07-29 22:18:48 +08:00
hongruichen 6da82947df refactoring: set the default qnn lib search path at CMakeLists.txt by GGML_QNN_DEFAULT_LIB_SEARCH_PATH 2024-07-29 15:53:14 +08:00
hongruichen 1f9d2a7e22 refactoring: improve tensor print 2024-07-28 22:05:51 +08:00
hongruichen e33b5c9837 refactoring: print the name of unsupported op 2024-07-27 13:49:49 +08:00
hongruichen 8ab1f15fe3 refactoring: remove internal functions, use op table directly 2024-07-27 13:43:07 +08:00
hongruichen e0c9b34016 feat: check if dims equal for add
it looks like qnn add can only be applied to matrices with equal dimensions
2024-07-27 13:38:12 +08:00
hongruichen 5da73f8085 refactoring: move forward and supports_op into ops file 2024-07-27 13:24:57 +08:00
hongruichen 867c91bfaf feat: add error string for QnnOpPackage_Error_t 2024-07-27 13:24:57 +08:00
hongruichen ccfec70106 refactoring: remove unused get_rpcmem_from_memhandle func 2024-07-27 13:24:57 +08:00
hongruichen 2c73791d62 refactoring: remove dup code 2024-07-27 10:48:09 +08:00
hongruichen 18aa6654d5 refactoring: opt graph key gen 2024-07-27 10:39:07 +08:00
hongruichen 47735cb589 fix: try fix error in 2nd run by appending dimension into graph key 2024-07-26 23:04:53 +08:00
hongruichen ee305cc171 refactoring: split qnn rpc buffer into dedicated class 2024-07-26 22:52:23 +08:00
hongruichen f843e5aaf5 fix: 1. free up rpc memory at destruct
2. unbind tensor
2024-07-25 23:45:04 +08:00
hongruichen 706793f078 fix: back to qnn tensor v1 to fix the create tensor error 2024-07-22 23:08:38 +08:00
hongruichen 3b47056c97 refactoring: change the tensor binding mode between qnn tensor and ggml tensor 2024-07-22 23:08:38 +08:00
hongruichen b173c4e061 feat: update tensor name when bind to graph 2024-07-20 17:31:40 +08:00
hongruichen 5f3b1ae3b0 fix: try to fix graph cache by appending the tensor names 2024-07-20 16:39:06 +08:00
hongruichen 51f95d6980 fix: dimension could be wrong for tensors like 1x1x8 2024-07-20 16:11:35 +08:00
hongruichen 27299463ae fix: try fix tensor type error 2024-07-20 15:13:10 +08:00
hongruichen 28a00e5e6c fix: try fix QNN_GRAPH_ERROR_INVALID_OP_CONFIG 2024-07-20 14:11:58 +08:00
hongruichen 1679dcf47e fix: check all dimensions in `can offload` 2024-07-20 13:29:01 +08:00
hongruichen b1b5cc10b1 add function to convert qnn error into string 2024-07-19 22:51:17 +08:00
hongruichen a607995f95 Reapply "tried fix the add node error 6005"
This reverts commit f45fbec8f4.
2024-07-19 15:35:55 +08:00
hongruichen f45fbec8f4 Revert "tried fix the add node error 6005"
This reverts commit ce3d09e5f2.
2024-07-19 12:59:38 +08:00
hongruichen ce3d09e5f2 tried fix the add node error 6005 2024-07-19 12:59:21 +08:00
hongruichen 15f5cc450c bug: fix allocation size overflow at log 2024-07-18 19:44:05 +08:00
hongruichen d82b3a0bdb feat: add GGML_UNARY_OP_GELU 2024-07-18 11:15:48 +08:00
hongruichen ce199b2de7 refactoring: downgrade some log to debug level 2024-07-17 23:49:47 +08:00
hongruichen c76fc9aa2f fix warnings 2024-07-17 23:32:13 +08:00
hongruichen 6457a68bd7 disable qnn profiling in release build 2024-07-17 23:24:29 +08:00