hongruichen
|
332514cd5c
|
qnn fix: update device capabilities for quantized types in qnn-lib to improve compatibility
|
2025-06-23 16:04:01 +08:00 |
nullname
|
c23ab465c0
|
feat: perf opt part4 (#43)
* wip
* refactor: rewrite dequantize_row_q4_0 by intrinsic
* log for debug
* fix q4 intrinsic
* small opt
* wip
* wip
* add vtcm_quota_size
* add perf log for hexagon-npu backend
* wip
* add log
* sync after a specific op
* increase worker thread priority
* fix unbalanced thread slice
* small slice to fit in vtcm cache
* limit the supported row element size
* opt 4_0 dequant
* fix q4 dequant
* add power_utils
* add rms_norm
* wip
* enable rms_norm f32
* fix rms_norm with param
* fix compiling flags
* use float
* fix small row size
* vectorized rms norm
* wip
* read 2 vectors
* rename
* add perf log on update
* set empty tensors handle also
* merge some rpc functions
* opt param update
* wip
* print more log
* add struct for update param config
* add npu_device_graph_set_tensor_with_param
* merge tensor and params update
* wip
* wip
* make as template to reuse
* vectorize dequantize_row_q8_0
* opt
* avoid using union to store q data
* wip
* wip
* wip
|
2025-05-28 00:00:42 +08:00 |
hongruichen
|
02af8ff653
|
fix qnn only build flag
|
2025-05-08 21:28:11 +08:00 |
nullname
|
beff5c4b78
|
feat: op perf opt (#38)
* add op define xml
* copy qnn libs in cmake
* fix htp skel path
* add windows copy file list
* wip
* add generated package
* remove unused params
* add cmake list
* set qnn sdk and hexagon sdk path
* wip
* wip
* fix tools version
* fix compiling error
* fix dims calc
* wip
* add mulmat 2d
* wip
* reduction
* wip
* wip
* fix compiling error in x64
* wip
* fix device description in emulator
* wip
* add flag
* copy necessary libs
* wip
* load HtpPrepare first for emulator
* enable custom op for 2d matrix
* verify op config before add to node
* Revert "verify op config before add to node"
This reverts commit 206dec826e560625e053c4c78e023994f993526e.
* wip
* wip
* wip
* revert tool version change
* use hexagon sdk version 5.5.0
https://docs.qualcomm.com/bundle/publicresource/topics/80-77512-2/release-notes-wrapper.html?product=1601111740010422#5.5.0
* wip
* move to sub dir
* add hexagon npu device and server lib
* fix npu lib build
* refactoring: rename QNNBackend enum
* fix compiling error
* wip
* remove qnn/backend.hpp
* add hexagon dsp host layer
* extract rpc_mem from qnn submodule
* fix dsp compiling error
* wip
* wip
* open and close npu device
* split objects into separated files
* fix linking error
* add npu_tensor
* add host graph
* map rpc buffer before usage
* fix some todos
* add shared module
* split rpc_interface from rpc_mem
* get get_dsp_arch from device
* wip
* rename host classes
* fix hexagon sdk arch getter
* fix device open
* fix linking error
* fix crash
* use tensor_data_type
* fix npu lib crash
* fix debug log print
* skip empty graph
* wip
* add log
* fix unmap fail
* fix tensor set
* remove some logs
* flush back memory after finished
* fix nb
* wip
* wip
* add helper function
* impl add op
* fix some add in test-backend-ops
* add elt wise sub and mul
* fix crash on some inplace op
* wip
* fix elt wise op calc
* wip
* split mul_mat into file
* add caps array
* wip
* wip
* print supported/unsupported ops
* copy lldb-server for newer android sdk
* add tensor_spec
* add assert
* fix crash when loading model
* rename cmake option
* fix name
* fix device memory and description
* fix compiling error on qnn only build
* fix some potential UBs
* fix comments
|
2025-04-21 12:06:16 +08:00 |