llama.cpp/ggml/src/ggml-qnn
nullname a1ab67478f
[feat] add more op (#35)
* move op key generate function to kOpCaps

* fix op desc print

* try fix rms_norm

* Revert "try fix rms_norm"

This reverts commit 33b296098012909cb482fc29b52b28098dc971cd.

* add quantized type support by converting them to float

* enable quantized tensors for mulmat on gpu/npu

* fix asan error

* add log and assert

* insert output convert operator after mulmat

* add log

* fix some runtime errors

* disable permute again

* add log

* add error function

* Revert "add error function"

This reverts commit f92ff47798ac8053fb776c55efbb1a98469c7af1.

* add log

* more log

* disable convert op in graph

* wip

* add f16 config for graph

* set f16 precision for f16 graph

* fix override data type

* add comment

* add config flag to enable quantize type

* add log

* more quantized type for cpu and gpu backend

* enable all quant types for cpu and gpu backend

* rename

* wip

* add log

* remove unused functions

* skip permute

* remove get_qnn_op_input_param_count

* fallback to generic_get_op_desc if no op_desc

* revert 'skip permute'

* Revert "revert 'skip permute'"

This reverts commit 5761e31fd23c69c4cabf6fd9fac1a0d3e5a74968.

* wip

* add log

* print qnn tensor type

* add log

* limit the max size of tensor

* add log

* fix tensor size limiter

* small improvement to tensor info printer

* disable sqrt and div to pass test-backend-ops for 8 gen 2

* remove debug log in release build

* add log

* skip permute in src

* wip

* disable reshape

* skip mul at decoder start

* wip

* add log

* add qnn_scoped_timer

* add perf tracker in graph

* add CMake option GGML_QNN_ENABLE_PERFORMANCE_TRACKING

* fix flag name

* use milliseconds

* wip

* fix comment string

* add file for profiler

* change qnn-cpu to GGML_BACKEND_DEVICE_TYPE_ACCEL, so that we can run tests on cpu

* wip

* profiler: refactoring

* wip

* add implementation for print_profile_events

* set-up profiler for graph

* set profiler to graph execute

* pretty print events

* unify log print prefix

* print event count

* enable optrace

* print duration at event end

* wip

* add more detailed soc information

* wip

* move device caps array into qnn-lib.cpp

* remove lib_name in device_context

* move get_graph_key_from_cgraph to graph.cpp

* add override type for tensor key

* use override_type instead of original data type for graph key

* append op type to tensor name to fix error in qwen

* remove todo

* wip
2025-03-22 12:34:31 +08:00
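
The commit message above mentions a qnn_scoped_timer, a switch to millisecond units, and a GGML_QNN_ENABLE_PERFORMANCE_TRACKING build option. The listing does not show how these are implemented, so the following is only a minimal sketch of the general RAII scoped-timer technique, assuming the flag is exposed to C++ as a preprocessor macro. The class name scoped_timer_sketch, the helper time_dummy_workload, the out_ms parameter, and the log prefix are all hypothetical and are not taken from the backend sources.

```cpp
#include <chrono>
#include <cstdio>
#include <string>
#include <utility>

// Assumption: the CMake option named in the commit log reaches the code as a
// macro of the same name. Defaulted to 1 here so the sketch prints something.
#ifndef GGML_QNN_ENABLE_PERFORMANCE_TRACKING
#define GGML_QNN_ENABLE_PERFORMANCE_TRACKING 1
#endif

// Hypothetical RAII timer: starts on construction, reports on destruction.
// This is an illustration, not the backend's actual qnn_scoped_timer.
class scoped_timer_sketch {
    using clock_type = std::chrono::steady_clock;

public:
    // out_ms is an optional sink for the measured duration (hypothetical).
    explicit scoped_timer_sketch(std::string label, double* out_ms = nullptr)
        : label_(std::move(label)), out_ms_(out_ms), start_(clock_type::now()) {}

    ~scoped_timer_sketch() {
        const double ms = elapsed_ms();
        if (out_ms_ != nullptr) {
            *out_ms_ = ms;
        }
#if GGML_QNN_ENABLE_PERFORMANCE_TRACKING
        // Only log when performance tracking is compiled in.
        std::printf("[profiler] %s: %.3f ms\n", label_.c_str(), ms);
#endif
    }

    scoped_timer_sketch(const scoped_timer_sketch&) = delete;
    scoped_timer_sketch& operator=(const scoped_timer_sketch&) = delete;

    // Elapsed time since construction, in milliseconds.
    double elapsed_ms() const {
        return std::chrono::duration<double, std::milli>(
                   clock_type::now() - start_).count();
    }

private:
    std::string label_;
    double* out_ms_;
    clock_type::time_point start_;
};

// Times a dummy loop and returns the measured duration in milliseconds.
double time_dummy_workload(int iterations) {
    double ms = -1.0;
    {
        scoped_timer_sketch timer("dummy_workload", &ms);
        volatile double acc = 0.0;
        for (int i = 0; i < iterations; ++i) {
            acc = acc + i * 0.5;
        }
    }  // timer goes out of scope here and writes the duration into ms
    return ms;
}
```

A monotonic clock (std::chrono::steady_clock) is used so the measurement is unaffected by wall-clock adjustments. Wrapping a region in braces and declaring the timer at the top is enough to time it; building with the macro set to 0 keeps the measurement but drops the log line.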
CMakeLists.txt [feat] add more op (#35) 2025-03-22 12:34:31 +08:00
backend-ops.cpp [feat] add more op (#35) 2025-03-22 12:34:31 +08:00
backend-ops.hpp feat: fix some TODO item in upstream PR #26 (#27) 2025-02-27 23:16:08 +08:00
backend.hpp [feat] add more op (#35) 2025-03-22 12:34:31 +08:00
buffer.hpp [feat] add more op (#35) 2025-03-22 12:34:31 +08:00
convert.cpp [feat] add more op (#35) 2025-03-22 12:34:31 +08:00
convert.hpp [feat] add more op (#35) 2025-03-22 12:34:31 +08:00
dl-loader.hpp feat: fix some TODO item in upstream PR #26 (#27) 2025-02-27 23:16:08 +08:00
ggml-qnn.cpp [feat] add more op (#35) 2025-03-22 12:34:31 +08:00
graph.cpp [feat] add more op (#35) 2025-03-22 12:34:31 +08:00
graph.hpp [feat] add more op (#35) 2025-03-22 12:34:31 +08:00
logger.cpp [bugfix]make sure single node op will have the same type (#29) 2025-02-28 19:18:16 +08:00
logger.hpp [feat] add more op (#35) 2025-03-22 12:34:31 +08:00
op-config-base.hpp [feat] add more op (#35) 2025-03-22 12:34:31 +08:00
op-config-caps.cpp [feat] add more op (#35) 2025-03-22 12:34:31 +08:00
op-config-impl.cpp [feat] add more op (#35) 2025-03-22 12:34:31 +08:00
op-config-impl.hpp [feat] add more op (#35) 2025-03-22 12:34:31 +08:00
op-config.hpp [feat] add more op (#35) 2025-03-22 12:34:31 +08:00
profiler.cpp [feat] add more op (#35) 2025-03-22 12:34:31 +08:00
profiler.hpp [feat] add more op (#35) 2025-03-22 12:34:31 +08:00
qnn-lib.cpp [feat] add more op (#35) 2025-03-22 12:34:31 +08:00
qnn-lib.hpp [feat] add more op (#35) 2025-03-22 12:34:31 +08:00
qnn-types.hpp [feat] add more op (#35) 2025-03-22 12:34:31 +08:00
tensor.hpp [feat] add more op (#35) 2025-03-22 12:34:31 +08:00
utils.cpp [feat] add more op (#35) 2025-03-22 12:34:31 +08:00
utils.hpp [feat] add more op (#35) 2025-03-22 12:34:31 +08:00