llama.cpp

Commit Graph

Author	SHA1	Message	Date
hongruichen	1f9d2a7e22	refactoring: improve tensor print	2024-07-28 22:05:51 +08:00
hongruichen	e33b5c9837	refactoring: print the name of unsupport op	2024-07-27 13:49:49 +08:00
hongruichen	8ab1f15fe3	refactoring: remove internal functions, use op table directly	2024-07-27 13:43:07 +08:00
hongruichen	e0c9b34016	feat: check if dims equal for add looks qnn add can only applied to matrix with equal dimensions	2024-07-27 13:38:12 +08:00
hongruichen	5da73f8085	refactoring: move forward and supports_op into ops file	2024-07-27 13:24:57 +08:00
hongruichen	867c91bfaf	feat: add error string for QnnOpPackage_Error_t	2024-07-27 13:24:57 +08:00
hongruichen	ccfec70106	refactoring: remove unused get_rpcmem_from_memhandle func	2024-07-27 13:24:57 +08:00
hongruichen	2c73791d62	refactoring: remove dup code	2024-07-27 10:48:09 +08:00
hongruichen	18aa6654d5	refactoring: opt graph key gen	2024-07-27 10:39:07 +08:00
hongruichen	be9a8c73a0	fix: suppress warning	2024-07-26 23:07:25 +08:00
hongruichen	47735cb589	fix: try fix error in 2nd run by appending dimension into graph key	2024-07-26 23:04:53 +08:00
hongruichen	ee305cc171	refactoring: split qnn rpc buffer into dedicated class	2024-07-26 22:52:23 +08:00
hongruichen	f843e5aaf5	fix: 1.free up rpc memory at destruct 2. unbind tesnsor	2024-07-25 23:45:04 +08:00
hongruichen	706793f078	fix: back to qnn tensor v1 to fix the create tensor error	2024-07-22 23:08:38 +08:00
hongruichen	3b47056c97	refactoring: change the tensor binding mode between qnn tensor and ggml tensor	2024-07-22 23:08:38 +08:00
hongruichen	b173c4e061	feat: update tensor name when bind to graph	2024-07-20 17:31:40 +08:00
hongruichen	5f3b1ae3b0	fix: try fix graph cache with append the tensors name	2024-07-20 16:39:06 +08:00
hongruichen	51f95d6980	fix: dimension could be wrong for tensor liked 1x1x8	2024-07-20 16:11:35 +08:00
hongruichen	27299463ae	fix: try fix tensor type error	2024-07-20 15:13:10 +08:00
hongruichen	28a00e5e6c	fix: try fix QNN_GRAPH_ERROR_INVALID_OP_CONFIG	2024-07-20 14:11:58 +08:00
hongruichen	1679dcf47e	fix: check all dimentions in `can offload`	2024-07-20 13:29:01 +08:00
hongruichen	b1b5cc10b1	add function to convert qnn error into string	2024-07-19 22:51:17 +08:00
hongruichen	a607995f95	Reapply "tried fix the add node error 6005" This reverts commit `f45fbec8f4`.	2024-07-19 15:35:55 +08:00
hongruichen	0153a23d3f	fix support ops This reverts commit `f45fbec8f4`.	2024-07-19 15:31:29 +08:00
hongruichen	f45fbec8f4	Revert "tried fix the add node error 6005" This reverts commit `ce3d09e5f2`.	2024-07-19 12:59:38 +08:00
hongruichen	ce3d09e5f2	tried fix the add node error 6005	2024-07-19 12:59:21 +08:00
hongruichen	665f823748	fix op checker	2024-07-18 22:26:53 +08:00
hongruichen	15f5cc450c	bug: fix allocation size overflow at log	2024-07-18 19:44:05 +08:00
hongruichen	d82b3a0bdb	feat: add GGML_UNARY_OP_GELU	2024-07-18 11:15:48 +08:00
hongruichen	ce199b2de7	refactoring: downgrade some log to debug level	2024-07-17 23:49:47 +08:00
hongruichen	c76fc9aa2f	fix warnings	2024-07-17 23:32:13 +08:00
hongruichen	6457a68bd7	disable qnn profiling in release build	2024-07-17 23:24:29 +08:00
hongruichen	2502b57203	fix warnings	2024-07-17 22:10:12 +08:00
hongruichen	454deef83c	register qnn backend	2024-07-17 21:25:55 +08:00
hongruichen	eed960575f	add build step of QNN backend at ggml	2024-07-17 19:43:01 +08:00
hongruichen	861bb9c580	Merge tag 'b3405' into dev-refactoring	2024-07-17 17:13:55 +08:00
hongruichen	bb13795dce	refactoring: remove unused functions and variables	2024-07-17 14:17:35 +08:00
hongruichen	63dc587dff	refactoring: make the buffer alloc and free stay in same class	2024-07-17 14:08:31 +08:00
hongruichen	b1ef302991	refactoring: remove depend of dlsym at utils.hpp	2024-07-17 12:21:33 +08:00
Johannes Gäßler	5e116e8dd5	make/cmake: add missing force MMQ/cuBLAS for HIP (#8515)	2024-07-16 21:20:59 +02:00
hongruichen	0301b500cd	refactoring: prevent leak the QNN_INTERFACE_VER_TYPE and QNN_SYSTEM_INTERFACE_VER_TYPE outside of qnn.hpp	2024-07-17 00:18:38 +08:00
Xuan Son Nguyen	97bdd26eee	Refactor lora adapter support (#8332) * lora: load to devide buft * add patch tensor function * correct tensor patch * llama_lora_adapter_apply * correct ggml_backend_tensor_copy * add llm_build_mm * fix auto merge * update based on review comments * add convert script * no more transpose A * add f16 convert * add metadata check * add sanity check * fix ftype * add requirements * fix requirements * fix outfile * conversion: only allow selected models * fix types * cuda : do not use dmmv if the tensor does not have enough cols * llama : lora fixes * do not disable mmap with lora Co-authored-by: slaren <slarengh@gmail.com> * llm_build_lora_mm_id * convert_lora : MoE LoRA conversion support * convert_lora : prefer safetensors, similarly to convert_hf * convert_hf : simplify modify_tensors for InternLM2 * convert_lora : lazy conversion * llama : load and use alpha from LoRA adapters * llama : use llm_build_lora_mm in most model graphs * auto scale * Revert "auto scale" This reverts commit `42415a4874`. * remove redundant params * Apply suggestions from code review Co-authored-by: slaren <slarengh@gmail.com> * change kv metadata * move add_type to __init__ * convert_hf : move add_type to main() * convert_lora : use the GGUFWriter from Model instead of overwriting it --------- Co-authored-by: slaren <slarengh@gmail.com> Co-authored-by: Francis Couture-Harpin <git@compilade.net>	2024-07-15 20:50:47 +02:00
Daniel Bevenius	8fac431b06	ggml : suppress unknown pragma 'GCC' on windows (#8460) This commit adds a macro guard to pragma GCC to avoid the following warning on windows: ```console C:\llama.cpp\ggml\src\ggml-aarch64.c(17,9): warning C4068: unknown pragma 'GCC' [C:\lama.cpp\build\ggml\src\ggml.vcxproj] ```	2024-07-15 15:48:17 +03:00
Meng, Hengyu	16bdfa42ac	[SYCL] add concat through dim 1/2 (#8483) * add concat through dim 1/2	2024-07-15 19:32:15 +08:00
0cc4m	bda62d7999	Vulkan MMQ Fix (#8479) * Fix incoherence by adding missing LOAD_VEC_A parameter * Fix Vulkan op result checker build error	2024-07-15 09:38:52 +02:00
hongruichen	f32327e2b2	remove multiply declearation of log in unit test	2024-07-15 12:06:12 +08:00
hongruichen	30b40006cc	remove unused declarations	2024-07-14 23:50:11 +08:00
hongruichen	148ceab70c	add log op	2024-07-14 23:00:50 +08:00
bandoti	17eb6aa8a9	vulkan : cmake integration (#8119) * Add Vulkan to CMake pkg * Add Sycl to CMake pkg * Add OpenMP to CMake pkg * Split generated shader file into separate translation unit * Add CMake target for Vulkan shaders * Update README.md * Add make target for Vulkan shaders * Use pkg-config to locate vulkan library * Add vulkan SDK dep to ubuntu-22-cmake-vulkan workflow * Clean up tabs * Move sudo to apt-key invocation * Forward GGML_EXTRA_LIBS to CMake config pkg * Update vulkan obj file paths * Add shaderc to nix pkg * Add python3 to Vulkan nix build * Link against ggml in cmake pkg * Remove Python dependency from Vulkan build * code review changes * Remove trailing newline * Add cflags from pkg-config to fix w64devkit build * Update README.md * Remove trailing whitespace * Update README.md * Remove trailing whitespace * Fix doc heading * Make glslc required Vulkan component * remove clblast from nix pkg	2024-07-13 18:12:39 +02:00
Georgi Gerganov	c917b67f06	metal : template-ify some of the kernels (#8447) ggml-ci	2024-07-13 18:32:33 +03:00

110 Commits