hongruichen
1f9d2a7e22
refactoring: improve tensor print
2024-07-28 22:05:51 +08:00
hongruichen
e33b5c9837
refactoring: print the name of unsupported op
2024-07-27 13:49:49 +08:00
hongruichen
8ab1f15fe3
refactoring: remove internal functions, use op table directly
2024-07-27 13:43:07 +08:00
hongruichen
e0c9b34016
feat: check if dims equal for add
It looks like qnn add can only be applied to matrices with equal dimensions
2024-07-27 13:38:12 +08:00
hongruichen
5da73f8085
refactoring: move forward and supports_op into ops file
2024-07-27 13:24:57 +08:00
hongruichen
867c91bfaf
feat: add error string for QnnOpPackage_Error_t
2024-07-27 13:24:57 +08:00
hongruichen
ccfec70106
refactoring: remove unused get_rpcmem_from_memhandle func
2024-07-27 13:24:57 +08:00
hongruichen
2c73791d62
refactoring: remove dup code
2024-07-27 10:48:09 +08:00
hongruichen
18aa6654d5
refactoring: opt graph key gen
2024-07-27 10:39:07 +08:00
hongruichen
be9a8c73a0
fix: suppress warning
2024-07-26 23:07:25 +08:00
hongruichen
47735cb589
fix: try fix error in 2nd run by appending dimension into graph key
2024-07-26 23:04:53 +08:00
hongruichen
ee305cc171
refactoring: split qnn rpc buffer into dedicated class
2024-07-26 22:52:23 +08:00
hongruichen
f843e5aaf5
fix: 1. free up rpc memory at destruct
2. unbind tensor
2024-07-25 23:45:04 +08:00
hongruichen
706793f078
fix: back to qnn tensor v1 to fix the create tensor error
2024-07-22 23:08:38 +08:00
hongruichen
3b47056c97
refactoring: change the tensor binding mode between qnn tensor and ggml tensor
2024-07-22 23:08:38 +08:00
hongruichen
b173c4e061
feat: update tensor name when bind to graph
2024-07-20 17:31:40 +08:00
hongruichen
5f3b1ae3b0
fix: try fix graph cache by appending the tensor names
2024-07-20 16:39:06 +08:00
hongruichen
51f95d6980
fix: dimension could be wrong for tensors like 1x1x8
2024-07-20 16:11:35 +08:00
hongruichen
27299463ae
fix: try fix tensor type error
2024-07-20 15:13:10 +08:00
hongruichen
28a00e5e6c
fix: try fix QNN_GRAPH_ERROR_INVALID_OP_CONFIG
2024-07-20 14:11:58 +08:00
hongruichen
1679dcf47e
fix: check all dimensions in `can offload`
2024-07-20 13:29:01 +08:00
hongruichen
b1b5cc10b1
add function to convert qnn error into string
2024-07-19 22:51:17 +08:00
hongruichen
a607995f95
Reapply "tried fix the add node error 6005"
This reverts commit f45fbec8f4.
2024-07-19 15:35:55 +08:00
hongruichen
0153a23d3f
fix support ops
This reverts commit f45fbec8f4.
2024-07-19 15:31:29 +08:00
hongruichen
f45fbec8f4
Revert "tried fix the add node error 6005"
This reverts commit ce3d09e5f2.
2024-07-19 12:59:38 +08:00
hongruichen
ce3d09e5f2
tried fix the add node error 6005
2024-07-19 12:59:21 +08:00
hongruichen
665f823748
fix op checker
2024-07-18 22:26:53 +08:00
hongruichen
15f5cc450c
bug: fix allocation size overflow at log
2024-07-18 19:44:05 +08:00
hongruichen
d82b3a0bdb
feat: add GGML_UNARY_OP_GELU
2024-07-18 11:15:48 +08:00
hongruichen
ce199b2de7
refactoring: downgrade some log to debug level
2024-07-17 23:49:47 +08:00
hongruichen
c76fc9aa2f
fix warnings
2024-07-17 23:32:13 +08:00
hongruichen
6457a68bd7
disable qnn profiling in release build
2024-07-17 23:24:29 +08:00
hongruichen
2502b57203
fix warnings
2024-07-17 22:10:12 +08:00
hongruichen
454deef83c
register qnn backend
2024-07-17 21:25:55 +08:00
hongruichen
eed960575f
add build step of QNN backend at ggml
2024-07-17 19:43:01 +08:00
hongruichen
861bb9c580
Merge tag 'b3405' into dev-refactoring
2024-07-17 17:13:55 +08:00
hongruichen
bb13795dce
refactoring: remove unused functions and variables
2024-07-17 14:17:35 +08:00
hongruichen
63dc587dff
refactoring: make the buffer alloc and free stay in same class
2024-07-17 14:08:31 +08:00
hongruichen
b1ef302991
refactoring: remove dependency on dlsym in utils.hpp
2024-07-17 12:21:33 +08:00
Johannes Gäßler
5e116e8dd5
make/cmake: add missing force MMQ/cuBLAS for HIP (#8515)
2024-07-16 21:20:59 +02:00
hongruichen
0301b500cd
refactoring: prevent leaking QNN_INTERFACE_VER_TYPE and QNN_SYSTEM_INTERFACE_VER_TYPE outside of qnn.hpp
2024-07-17 00:18:38 +08:00
Xuan Son Nguyen
97bdd26eee
Refactor lora adapter support (#8332)
* lora: load to device buft
* add patch tensor function
* correct tensor patch
* llama_lora_adapter_apply
* correct ggml_backend_tensor_copy
* add llm_build_mm
* fix auto merge
* update based on review comments
* add convert script
* no more transpose A
* add f16 convert
* add metadata check
* add sanity check
* fix ftype
* add requirements
* fix requirements
* fix outfile
* conversion: only allow selected models
* fix types
* cuda : do not use dmmv if the tensor does not have enough cols
* llama : lora fixes
* do not disable mmap with lora
Co-authored-by: slaren <slarengh@gmail.com>
* llm_build_lora_mm_id
* convert_lora : MoE LoRA conversion support
* convert_lora : prefer safetensors, similarly to convert_hf
* convert_hf : simplify modify_tensors for InternLM2
* convert_lora : lazy conversion
* llama : load and use alpha from LoRA adapters
* llama : use llm_build_lora_mm in most model graphs
* auto scale
* Revert "auto scale"
This reverts commit 42415a4874.
* remove redundant params
* Apply suggestions from code review
Co-authored-by: slaren <slarengh@gmail.com>
* change kv metadata
* move add_type to __init__
* convert_hf : move add_type to main()
* convert_lora : use the GGUFWriter from Model instead of overwriting it
---------
Co-authored-by: slaren <slarengh@gmail.com>
Co-authored-by: Francis Couture-Harpin <git@compilade.net>
2024-07-15 20:50:47 +02:00
Daniel Bevenius
8fac431b06
ggml : suppress unknown pragma 'GCC' on windows (#8460)
This commit adds a macro guard to pragma GCC to avoid the following
warning on windows:
```console
C:\llama.cpp\ggml\src\ggml-aarch64.c(17,9): warning C4068:
unknown pragma 'GCC' [C:\llama.cpp\build\ggml\src\ggml.vcxproj]
```
2024-07-15 15:48:17 +03:00
Meng, Hengyu
16bdfa42ac
[SYCL] add concat through dim 1/2 (#8483)
* add concat through dim 1/2
2024-07-15 19:32:15 +08:00
0cc4m
bda62d7999
Vulkan MMQ Fix (#8479)
* Fix incoherence by adding missing LOAD_VEC_A parameter
* Fix Vulkan op result checker build error
2024-07-15 09:38:52 +02:00
hongruichen
f32327e2b2
remove multiple declarations of log in unit test
2024-07-15 12:06:12 +08:00
hongruichen
30b40006cc
remove unused declarations
2024-07-14 23:50:11 +08:00
hongruichen
148ceab70c
add log op
2024-07-14 23:00:50 +08:00
bandoti
17eb6aa8a9
vulkan : cmake integration (#8119)
* Add Vulkan to CMake pkg
* Add Sycl to CMake pkg
* Add OpenMP to CMake pkg
* Split generated shader file into separate translation unit
* Add CMake target for Vulkan shaders
* Update README.md
* Add make target for Vulkan shaders
* Use pkg-config to locate vulkan library
* Add vulkan SDK dep to ubuntu-22-cmake-vulkan workflow
* Clean up tabs
* Move sudo to apt-key invocation
* Forward GGML_EXTRA_LIBS to CMake config pkg
* Update vulkan obj file paths
* Add shaderc to nix pkg
* Add python3 to Vulkan nix build
* Link against ggml in cmake pkg
* Remove Python dependency from Vulkan build
* code review changes
* Remove trailing newline
* Add cflags from pkg-config to fix w64devkit build
* Update README.md
* Remove trailing whitespace
* Update README.md
* Remove trailing whitespace
* Fix doc heading
* Make glslc required Vulkan component
* remove clblast from nix pkg
2024-07-13 18:12:39 +02:00
Georgi Gerganov
c917b67f06
metal : template-ify some of the kernels (#8447)
ggml-ci
2024-07-13 18:32:33 +03:00