Commit Graph

42 Commits

Author SHA1 Message Date
Jared Van Bortel 9c4dfd06e8 mention skipped change 2023-11-23 17:22:05 -05:00
Jared Van Bortel 6474fc879a vulkan : handle ggml_scale for n%8 != 0
ref ggerganov/llama.cpp#3754
2023-11-23 17:22:00 -05:00
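
For 6474fc879a above, a minimal C++ sketch of the general pattern such a fix implies: process full groups of 8 elements, then handle the leftover elements with a scalar tail. The function and layout are illustrative, not taken from the actual shader.

```cpp
#include <cstddef>

// Illustrative host-side analogue: scale n floats by s, handling n % 8 != 0
// by running the 8-wide path for full groups and a scalar tail for the rest.
void scale_f32(float * dst, const float * src, float s, size_t n) {
    const size_t n8 = n - (n % 8);         // largest multiple of 8 <= n
    for (size_t i = 0; i < n8; i += 8) {
        for (size_t j = 0; j < 8; ++j) {
            dst[i + j] = src[i + j] * s;   // analogue of the 8-wide shader path
        }
    }
    for (size_t i = n8; i < n; ++i) {
        dst[i] = src[i] * s;               // scalar tail for the remainder
    }
}
```
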
Jared Van Bortel 39abedd1d7 vulkan : optimize workgroup sizes 2023-11-23 17:18:48 -05:00
Jared Van Bortel 84f7fc4553 vulkan : rope n_past is now KQ_pos, f16 rope kernel 2023-11-23 17:18:42 -05:00
Jared Van Bortel 71565eb0c3 vulkan : replace ggml_diag_mask_inf with ggml_add (custom -inf mask) 2023-11-23 17:18:27 -05:00
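
For 71565eb0c3, a hedged sketch of the idea: instead of a dedicated diagonal-mask op, build a causal mask of 0 and -inf and add it to the KQ scores. Shapes and names are illustrative.

```cpp
#include <limits>
#include <vector>

// Build an n_q x n_kv causal mask: 0 where a query may attend, -inf elsewhere.
// Adding this mask to the KQ scores has the same effect as a diagonal -inf mask.
std::vector<float> make_causal_mask(int n_kv, int n_q, int n_past) {
    std::vector<float> mask(size_t(n_q) * n_kv, 0.0f);
    const float neg_inf = -std::numeric_limits<float>::infinity();
    for (int q = 0; q < n_q; ++q) {
        for (int k = n_past + q + 1; k < n_kv; ++k) {
            mask[size_t(q) * n_kv + k] = neg_inf;  // future positions masked out
        }
    }
    return mask;
}
```
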
Jared Van Bortel c438c16896 fix build with external fmtlib (v10)
Co-authored-by: ToKiNoBug <tokinobug@163.com>
2023-11-08 16:31:29 -05:00
Jared Van Bortel a8cac53207 kompute : fix issues with debug layers 2023-11-08 16:31:29 -05:00
Adam Treat ffd0624be2 Remove this debug code. 2023-11-03 17:22:22 -04:00
Adam Treat e006d377dd Scale the workgroup count down to allow correct generation for Falcon on
AMD Radeon cards with a lower workgroup count limit

Partially fixes #1581
2023-11-03 17:22:22 -04:00
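
For e006d377dd, a minimal sketch of one way to scale a dispatch down to a device's workgroup-count limit (VkPhysicalDeviceLimits::maxComputeWorkGroupCount) by having each workgroup loop over more elements; names are illustrative, not from the backend.

```cpp
#include <cstdint>

// Scale a desired workgroup count down to the device limit by having each
// workgroup iterate over more elements instead.
struct Dispatch { uint32_t groups; uint32_t iters_per_group; };

Dispatch clamp_dispatch(uint64_t wanted_groups, uint32_t max_groups) {
    uint32_t iters = 1;
    while (wanted_groups > max_groups) {
        wanted_groups = (wanted_groups + 1) / 2;  // halve the group count...
        iters *= 2;                               // ...and loop twice as much per group
    }
    return { uint32_t(wanted_groups), iters };    // shader must still bounds-check the tail
}
```
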
Adam Treat 74ddf0f17d Fix synchronization problem for AMD Radeon with the amdvlk driver or Windows
drivers. Does not have any performance or fidelity effect on other GPU/driver
combos I've tested.

FIXES: https://github.com/nomic-ai/gpt4all/issues/1507
2023-11-03 17:22:22 -04:00
Adam Treat 8d9efbf97a Lower the workgroup count for some shaders by providing a loop that processes
four floats at a time.
2023-11-03 17:22:22 -04:00
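
For 8d9efbf97a, an illustrative host-side analogue of the shader change: each invocation covers four consecutive floats, so the dispatch needs a quarter as many workgroups.

```cpp
#include <cstddef>

// Invocation i covers elements [4*i, 4*i + 3] instead of a single element.
void add_x4(float * dst, const float * a, const float * b,
            size_t invocation, size_t n) {
    for (size_t j = 0; j < 4; ++j) {
        const size_t idx = invocation * 4 + j;
        if (idx < n) {                    // guard the tail when n % 4 != 0
            dst[idx] = a[idx] + b[idx];
        }
    }
}
```
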
Adam Treat 752f7ebd61 Remove unused push constant that was giving validation errors. 2023-11-03 17:22:22 -04:00
cebtenzzre cbc0d1af79 kompute : make scripts executable 2023-11-03 17:22:22 -04:00
cebtenzzre 21841d3163 kompute : enable kp_logger and make it static (#8) 2023-11-03 17:22:22 -04:00
Aaron Miller cc05a602d6 use mat*vec shaders for mat*mat
I wrote the mat*mat shaders from scratch so I understand them better, but
they are currently not faster than simply multiply-invoking the mat*vec
shaders, by a significant degree. So, except for f32, which needed a new
shader, revert to the mat*vec ones here.
2023-11-03 17:22:22 -04:00
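
For cc05a602d6, a sketch of the fallback described: compute A*B by invoking a mat*vec routine once per column of B. All names and layouts are illustrative.

```cpp
#include <cstddef>

// y = A * x, with A row-major m x k and x of length k.
void mat_vec(float * y, const float * A, const float * x, size_t m, size_t k) {
    for (size_t i = 0; i < m; ++i) {
        float acc = 0.0f;
        for (size_t j = 0; j < k; ++j) acc += A[i * k + j] * x[j];
        y[i] = acc;
    }
}

// C = A * B by multiply-invoking the mat*vec routine, one column of B at a time.
// B and C are column-major here so each column is contiguous.
void mat_mat_via_mat_vec(float * C, const float * A, const float * B,
                         size_t m, size_t k, size_t n) {
    for (size_t col = 0; col < n; ++col) {
        mat_vec(C + col * m, A, B + col * k, m, k);
    }
}
```
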
Aaron Miller c1fd64548d attempted speedups 2 2023-11-03 17:22:22 -04:00
Aaron Miller 9bc52ebae3 attempted speedups 2023-11-03 17:22:22 -04:00
Aaron Miller cd0257ed0d q4_1 mat*mat 2023-11-03 17:22:22 -04:00
Aaron Miller 4809890d80 rm commented dbg print 2023-11-03 17:22:22 -04:00
Aaron Miller b78a94bc6d q6k mm works 2023-11-03 17:22:22 -04:00
Aaron Miller 3327d84a7f perf: use bigger threadgroups in mm 2023-11-03 17:22:22 -04:00
Aaron Miller 46385ee0d5 misc vulkan cleanup
make push constants consistent with dispatch, avoid a double free
2023-11-03 17:22:22 -04:00
Aaron Miller f0cd38b9ad add mat*mat ops 2023-11-03 17:22:22 -04:00
Aaron Miller 020b1745a0 vulkan: implement neox mode for rope 2023-11-03 17:22:21 -04:00
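
For 020b1745a0, a hedged CPU sketch of the distinction: "normal" RoPE rotates adjacent pairs (2i, 2i+1), while NeoX mode pairs element i with element i + n_dims/2. The frequency computation is simplified.

```cpp
#include <cmath>

// Rotate one head of size n_dims at position pos.
// neox == false: rotate adjacent pairs (2*i, 2*i + 1).
// neox == true : rotate pairs (i, i + n_dims/2), as NeoX-style models expect.
void rope_head(float * x, int n_dims, int pos, float freq_base, bool neox) {
    for (int i = 0; i < n_dims / 2; ++i) {
        const float theta = pos * std::pow(freq_base, -2.0f * i / n_dims);
        const float c = std::cos(theta), s = std::sin(theta);
        const int i0 = neox ? i              : 2 * i;
        const int i1 = neox ? i + n_dims / 2 : 2 * i + 1;
        const float x0 = x[i0], x1 = x[i1];
        x[i0] = x0 * c - x1 * s;
        x[i1] = x0 * s + x1 * c;
    }
}
```
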
Aaron Miller ff4212d20f q8 mat*vec 2023-11-03 17:22:21 -04:00
Aaron Miller 9db90cbe12 f16 mv broadcasting fix (gqa fix) 2023-11-03 17:22:21 -04:00
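
For 9db90cbe12, a short illustration of what the broadcast in grouped-query attention amounts to: several query heads share one KV head, so the KV head index is the query head index divided by the group size.

```cpp
// With grouped-query attention, n_head query heads share n_head_kv KV heads,
// so the mat*vec broadcast maps each query head to its KV head like this:
int kv_head_for(int q_head, int n_head, int n_head_kv) {
    const int group_size = n_head / n_head_kv;  // e.g. 64 / 8 = 8
    return q_head / group_size;
}
```
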
Adam Treat bc4b5ed1cb Fixes for subgroup size to bring AMD and NVIDIA in line with each other for all kernels. 2023-11-03 17:22:21 -04:00
Adam Treat 32289aa447 Fixes for norm. 2023-11-03 17:22:21 -04:00
Adam Treat 06d4b21598 Fix offset into the qh, and now we have working Vulkan acceleration for GGUF'd llama. 2023-11-03 17:22:21 -04:00
Adam Treat f1c9bc1821 Add q6_k getrows and mul*vec kernel. 2023-11-03 17:22:21 -04:00
Adam Treat 4b223ec432 Refactor getrows to use common code and get ready for q6_k. 2023-11-03 17:22:21 -04:00
Adam Treat 601905e75e Move the subgroups and printf into common. 2023-11-03 17:22:21 -04:00
Adam Treat 93306f16d0 Consolidate code for mat x vec kernels and use subgroups more extensively. 2023-11-03 17:22:21 -04:00
Adam Treat 77135a3bf5 Add common boilerplate code via an include and eliminate copy-paste 2023-11-03 17:22:21 -04:00
Cebtenzzre 6b6c73a9e3 kompute : don't fail build because of -Warray-bounds
There are some warnings in debug builds that are likely to be false
positives.
2023-11-03 17:22:21 -04:00
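
For 6b6c73a9e3, the log does not show how the warning was silenced; one common approach, given purely as an assumption, is a local diagnostic pragma (the commit may instead adjust compiler flags in CMake).

```cpp
// Purely illustrative: silence a suspected false positive locally so -Werror
// builds do not fail (the commit may instead adjust compiler flags).
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Warray-bounds"
// ... code that triggers the spurious warning ...
#pragma GCC diagnostic pop
```
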
Adam Treat 2c24d67e7b Don't crash when listing available devices if we can't even create an instance. 2023-10-05 13:39:18 -04:00
Adam Treat bd5f6399bb Don't try to install kompute artifacts. 2023-10-05 13:39:18 -04:00
Aaron Miller beee57266f Make kompute actually include external SDK headers when requested 2023-10-05 13:39:18 -04:00
Adam Treat b7e2e691d4 Completely revamp how we do object management with the Vulkan backend, and
stop using so many static objects so we can tear down and bring up Vulkan
on new devices in the same runtime.
2023-10-05 13:39:18 -04:00
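
For b7e2e691d4, a minimal sketch of the direction described: keep Vulkan state in one owned context object instead of file-scope statics, so it can be destroyed and recreated for a different device. Types are illustrative.

```cpp
#include <memory>
#include <vulkan/vulkan.hpp>

// Illustrative: all backend state lives in one object with a clear owner
// instead of file-scope statics, so it can be torn down and rebuilt to
// target a different device within the same process.
struct VulkanContext {
    vk::Instance       instance;
    vk::PhysicalDevice physical_device;
    vk::Device         device;

    ~VulkanContext() {
        if (device)   device.destroy();
        if (instance) instance.destroy();
    }
};

// The backend could then hold a single std::unique_ptr<VulkanContext>;
// switching devices replaces the pointer, destroying the old device and instance.
```
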
Adam Treat 45c8778b49 Switch to a dynamic dispatch table instead of linking hard against libvulkan. 2023-10-05 13:39:18 -04:00
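
For 45c8778b49, a hedged sketch of the usual pattern: load the Vulkan loader at runtime and resolve entry points through vkGetInstanceProcAddr instead of linking libvulkan at build time. Shown with dlopen for clarity; the actual backend may use vulkan.hpp's dynamic dispatcher instead.

```cpp
#include <dlfcn.h>
#include <vulkan/vulkan.h>

// Resolve the Vulkan entry point at runtime instead of linking libvulkan;
// if no loader is installed, the backend can fall back to CPU gracefully.
static PFN_vkGetInstanceProcAddr load_vulkan_entry() {
    void * lib = dlopen("libvulkan.so.1", RTLD_NOW | RTLD_LOCAL);
    if (!lib) {
        return nullptr;  // no Vulkan loader on this system
    }
    return reinterpret_cast<PFN_vkGetInstanceProcAddr>(
        dlsym(lib, "vkGetInstanceProcAddr"));
}

// Every other function is then fetched through it, e.g.:
//   auto pfnCreateInstance = reinterpret_cast<PFN_vkCreateInstance>(
//       getInstanceProcAddr(nullptr, "vkCreateInstance"));
```
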
Aaron Miller 8563fa001f remove dynamic deps from kompute build
should no longer have new external deps other than libvulkan

```
ubuntu@ip-172-31-1-24:~/repo/gpt4all/gpt4all-backend/build$ ldd ./libllamamodel-mainline-avxonly.so
        linux-vdso.so.1 (0x00007ffcb53bb000)
        libvulkan.so.1 => /lib/x86_64-linux-gnu/libvulkan.so.1 (0x00007f239dab5000)
        libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f239d800000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f239d719000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f239da95000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f239d400000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f239dd1d000)
```
2023-10-05 13:39:18 -04:00
niansa ba15dfd0be Nomic vulkan backend licensed under the Software for Open Models License (SOM), version 1.0. 2023-10-05 13:39:18 -04:00