llama.cpp

nullname a1ab67478f [feat] add more op (#35)	2025-03-22 12:34:31 +08:00

* move op key generate function to kOpCaps
* fix op desc print
* try fix rms_norm
* Revert "try fix rms_norm" This reverts commit 33b296098012909cb482fc29b52b28098dc971cd.
* add quantization type support by converting them to float
* enable quantization tensor for mulmat in gpu/npu
* fix asan error
* add log and assert
* insert output convert operator after mulmat
* add log
* fix some error in running
* disable permute again
* add log
* add error function
* Revert "add error function" This reverts commit f92ff47798ac8053fb776c55efbb1a98469c7af1.
* add log
* more log
* disable convert op in graph
* wip
* add f16 config for graph
* set f16 precision for f16 graph
* fix override data type
* add comment
* add config flag to enable quantize type
* add log
* more quantized type for cpu and gpu backend
* enable all quant types for cpu and gpu backend
* rename
* wip
* add log
* remove unused functions
* skip permute
* remove get_qnn_op_input_param_count
* fallback to generic_get_op_desc if no op_desc
* revert 'skip permute'
* Revert "revert 'skip permute'" This reverts commit 5761e31fd23c69c4cabf6fd9fac1a0d3e5a74968.
* wip
* add log
* print qnn tensor type
* add log
* limit the max size of tensor
* add log
* fix tensor size limiter
* small improve on tensor info printer
* disable sqrt and div to pass test-backend-ops for 8 gen 2
* remove debug log in release build
* add log
* skip permute in src
* wip
* disable reshape
* skip mul at decoder start
* wip
* add log
* add qnn_scoped_timer
* add perf tracker in graph
* add cmake options GGML_QNN_ENABLE_PERFORMANCE_TRACKING
* fix flag name
* use milli-second
* wip
* fix comment string
* add file for profiler
* change qnn-cpu to GGML_BACKEND_DEVICE_TYPE_ACCEL, so that we can run tests on cpu
* wip
* profiler: refactoring
* wip
* add implement for print_profile_events
* set-up profiler for graph
* set profiler to graph execute
* pretty print events
* unified log print prefix
* print event count
* enable optrace
* print duration at event end
* wip
* add more detailed soc information
* wip
* move device caps array into qnn-lib.cpp
* remove lib_name in device_context
* move get_graph_key_from_cgraph to graph.cpp
* add override type for tensor key
* use override_type instead of original data type for graph key
* append op type to tensor name to fix error in qwen
* remove todo
* wip
CMakeLists.txt	[feat] add more op (#35)	2025-03-22 12:34:31 +08:00
backend-ops.cpp	[feat] add more op (#35)	2025-03-22 12:34:31 +08:00
backend-ops.hpp	feat: fix some TODO item in upstream PR #26 (#27)	2025-02-27 23:16:08 +08:00
backend.hpp	[feat] add more op (#35)	2025-03-22 12:34:31 +08:00
buffer.hpp	[feat] add more op (#35)	2025-03-22 12:34:31 +08:00
convert.cpp	[feat] add more op (#35)	2025-03-22 12:34:31 +08:00
convert.hpp	[feat] add more op (#35)	2025-03-22 12:34:31 +08:00
dl-loader.hpp	feat: fix some TODO item in upstream PR #26 (#27)	2025-02-27 23:16:08 +08:00
ggml-qnn.cpp	[feat] add more op (#35)	2025-03-22 12:34:31 +08:00
graph.cpp	[feat] add more op (#35)	2025-03-22 12:34:31 +08:00
graph.hpp	[feat] add more op (#35)	2025-03-22 12:34:31 +08:00
logger.cpp	[bugfix]make sure single node op will have the same type (#29)	2025-02-28 19:18:16 +08:00
logger.hpp	[feat] add more op (#35)	2025-03-22 12:34:31 +08:00
op-config-base.hpp	[feat] add more op (#35)	2025-03-22 12:34:31 +08:00
op-config-caps.cpp	[feat] add more op (#35)	2025-03-22 12:34:31 +08:00
op-config-impl.cpp	[feat] add more op (#35)	2025-03-22 12:34:31 +08:00
op-config-impl.hpp	[feat] add more op (#35)	2025-03-22 12:34:31 +08:00
op-config.hpp	[feat] add more op (#35)	2025-03-22 12:34:31 +08:00
profiler.cpp	[feat] add more op (#35)	2025-03-22 12:34:31 +08:00
profiler.hpp	[feat] add more op (#35)	2025-03-22 12:34:31 +08:00
qnn-lib.cpp	[feat] add more op (#35)	2025-03-22 12:34:31 +08:00
qnn-lib.hpp	[feat] add more op (#35)	2025-03-22 12:34:31 +08:00
qnn-types.hpp	[feat] add more op (#35)	2025-03-22 12:34:31 +08:00
tensor.hpp	[feat] add more op (#35)	2025-03-22 12:34:31 +08:00
utils.cpp	[feat] add more op (#35)	2025-03-22 12:34:31 +08:00
utils.hpp	[feat] add more op (#35)	2025-03-22 12:34:31 +08:00