llama.cpp

Commit Graph

Author	SHA1	Message	Date
hongruichen	5c1e6d4905	disable gelu in NPU	2024-10-29 00:54:08 +08:00
nullname	4abaf7d87e	feat: fix mulmat (#2) * ggml_qnn_op_config now manager the construction of ggml_qnn_tensor * wip * add interface ggml_qnn_op_config * add ggml_qnn_list_op_config * add create_tensor and move tensor bind to execute * wip * rename: ggml_qnn_list_op_config -> ggml_qnn_matmul_op_config * add tensortype to allow native tensor * remove ggml_tensor param at ggml_qnn_tensor::create_tensor * postpone the tensor id allocation to add_node * add ggml_qnn_op_config_base * trival change to reduct the param of function * split bind_tensors into bind_input_tensors and bind_output_tensors * implement ggml_qnn_single_op_config::create_tensors next will set the prameter of transpose * tensor: add bind buffer * add parameter tensor type * implement add_tensor_param * set qnn_instance only at constructor * set transpose tensor param * move create_op_constructor into op-config module * create QNN_OP_MAT_MUL from ggml_qnn_matmul_op_config * try fix crash * fix compiling error at older ndk (r23c) * fix crash * fix parameter tensor name * update tensor dimension assignment and add TODO * fix mat_mul graph creating * fix MUL_MAT_256x16x10x1_256x1x10x1_16x1x10x1 * append type to graph cache key * wip * fix supported op * update comment * disable op other than add and mat_mul * add convert op to adapt multi input/output format * disable f16 for cpu backend according to official doc https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/cpu_backend.html#supported-operations * add supported data types flags in each backend * remove unused functions * append output type to graph key * fix gpu backend by disable the different data type op * fix cpu backend support ops * fix duplicated tensor name * append op name * suppress warning * remove unused code	2024-10-28 12:48:16 +08:00
hongruichen	181cf52888	adapt new register backend interface and fix missing ops	2024-10-11 10:17:50 +08:00
hongruichen	1da8a3e678	fix compiling error after merge	2024-09-30 10:37:23 +08:00
hongruichen	481cb3a0c5	fix compiling error	2024-09-07 12:29:26 +08:00
みゃん	dedadf2a20	Fixed a bug where debug code was included in the release, resulting i… (#1) * Fixed a bug where debug code was included in the release, resulting in an undefined function error. * Change the path of the QNN library when building in termux environment * Revert "Change the path of the QNN library when building in termux environment" This reverts commit c6e26a3679da2608940e2163e090adf75d667400. * Changed so that GGML_QNN_DEFAULT_LIB_SEARCH_PATH can be set from command line arguments	2024-08-20 10:20:23 +08:00
hongruichen	47f6e02eda	fix: try fix the tensor rank of mul mat	2024-07-31 23:54:07 +08:00
hongruichen	74eb05a13b	feat: add ggml_qnn_op_config for handle different op	2024-07-31 20:22:37 +08:00
hongruichen	1f9d2a7e22	refactoring: improve tensor print	2024-07-28 22:05:51 +08:00
hongruichen	e33b5c9837	refactoring: print the name of unsupport op	2024-07-27 13:49:49 +08:00
hongruichen	8ab1f15fe3	refactoring: remove internal functions, use op table directly	2024-07-27 13:43:07 +08:00
hongruichen	e0c9b34016	feat: check if dims equal for add looks qnn add can only applied to matrix with equal dimensions	2024-07-27 13:38:12 +08:00
hongruichen	5da73f8085	refactoring: move forward and supports_op into ops file	2024-07-27 13:24:57 +08:00
hongruichen	18aa6654d5	refactoring: opt graph key gen	2024-07-27 10:39:07 +08:00
hongruichen	47735cb589	fix: try fix error in 2nd run by appending dimension into graph key	2024-07-26 23:04:53 +08:00
hongruichen	3b47056c97	refactoring: change the tensor binding mode between qnn tensor and ggml tensor	2024-07-22 23:08:38 +08:00
hongruichen	b173c4e061	feat: update tensor name when bind to graph	2024-07-20 17:31:40 +08:00
hongruichen	5f3b1ae3b0	fix: try fix graph cache with append the tensors name	2024-07-20 16:39:06 +08:00
hongruichen	27299463ae	fix: try fix tensor type error	2024-07-20 15:13:10 +08:00
hongruichen	d82b3a0bdb	feat: add GGML_UNARY_OP_GELU	2024-07-18 11:15:48 +08:00
hongruichen	0301b500cd	refactoring: prevent leak the QNN_INTERFACE_VER_TYPE and QNN_SYSTEM_INTERFACE_VER_TYPE outside of qnn.hpp	2024-07-17 00:18:38 +08:00
hongruichen	148ceab70c	add log op	2024-07-14 23:00:50 +08:00
hongruichen	100ccd5e7f	add unary op template and more ops	2024-07-13 00:55:34 +08:00
hongruichen	e3aa43adbd	suppress warning	2024-07-12 23:26:11 +08:00
hongruichen	f0894d897a	wip wip	2024-07-12 19:57:34 +08:00
hongruichen	be3aa9631f	use template function directly	2024-07-11 11:18:06 +08:00
hongruichen	8932135fdb	add sqrt and mul ops	2024-07-11 00:08:08 +08:00
hongruichen	7ea28a6fac	add helper function for binary op	2024-07-10 23:39:03 +08:00
hongruichen	b6f29273f0	add function to get graph from cache	2024-07-10 23:08:32 +08:00
hongruichen	80051cfc4d	remove unused variables	2024-07-10 19:57:47 +08:00
hongruichen	e97d3a6c48	fix tensor buffer allocation add log commit qnn buffer after changed add log register_rpc_mem 2 times update input tensors before graph finalize default to QNN_TENSORMEMTYPE_RAW set new tensors at execute move write input tensors to exec check if mem registered before actual do register rpc mem once allocated	2024-07-10 19:32:39 +08:00
Hongrui Chen	5f2e3918f6	refactoring ggml_qnn_tensor	2024-07-09 19:58:46 +08:00
hongruichen	13dc3a02c3	use qnn graph inside add and mul ops	2024-07-05 13:27:16 +08:00
hongruichen	4b2ee61f62	move graph map to backend object	2024-07-05 11:58:47 +08:00
hongruichen	ca0d999c2a	add ggml_qnn_graph	2024-07-05 11:35:18 +08:00
hongruichen	000240cf62	add clang format file and reformating	2024-07-04 23:29:31 +08:00
hongruichen	38f88d5fb1	fix compiling error after merge latest master	2024-07-03 00:13:53 +08:00
hongruichen	8b677d1b2f	move qnn backend into sub folder	2024-07-02 19:42:14 +08:00

38 Commits