Commit Graph

27 Commits

Author SHA1 Message Date
Adam Treat 8d9efbf97a Lower the workgroup count for some shaders by providing a loop that processes four floats at a time. 2023-11-03 17:22:22 -04:00
Adam Treat 8400015337 Don't try an allocation on a heap that is smaller than the size we require. 2023-11-03 17:22:22 -04:00
Aaron Miller cc05a602d6 use mat*vec shaders for mat*mat
I wrote the mat*mat shaders from scratch so I understand them better but
they are currently not faster than just multiply-invoking the mat*vec
shaders, by a significant degree - so, except for f32 which needed a new
shader, revert to the m*v ones here.
2023-11-03 17:22:22 -04:00
Aaron Miller c1fd64548d attempted speedups 2 2023-11-03 17:22:22 -04:00
Aaron Miller 9bc52ebae3 attempted speedups 2023-11-03 17:22:22 -04:00
Aaron Miller cd0257ed0d q4_1 mat*mat 2023-11-03 17:22:22 -04:00
Aaron Miller b78a94bc6d q6k mm works 2023-11-03 17:22:22 -04:00
Aaron Miller d5741c07a5 use op param epsilon for norms 2023-11-03 17:22:22 -04:00
Aaron Miller 3327d84a7f perf: use bigger threadgroups in mm 2023-11-03 17:22:22 -04:00
Aaron Miller 46385ee0d5 misc vulkan cleanup
make pushconts consistent w/ dispatch, avoid a double free
2023-11-03 17:22:22 -04:00
Aaron Miller f0cd38b9ad add mat*mat ops 2023-11-03 17:22:22 -04:00
Aaron Miller ff4212d20f q8 mat*vec 2023-11-03 17:22:21 -04:00
Aaron Miller 9db90cbe12 f16 mv broadcasting fix (gqa fix) 2023-11-03 17:22:21 -04:00
Adam Treat bc4b5ed1cb Fixes for subgroup size to bring AMD and NVIDIA inline with eachother for all kernels. 2023-11-03 17:22:21 -04:00
Adam Treat de589ced7c Change this back to be in agreement with metal and our previous softmax kernel. 2023-11-03 17:22:21 -04:00
Adam Treat f1c9bc1821 Add q6_k getrows and mul*vec kernel. 2023-11-03 17:22:21 -04:00
Adam Treat 5509f74318 Minor cleanup. 2023-11-03 17:22:21 -04:00
Adam Treat 93306f16d0 Consolidate code for mat x vec kernels and use subgroups more extensively. 2023-11-03 17:22:21 -04:00
Adam Treat 2c24d67e7b Don't crash on available devices if we can't even create an instance. 2023-10-05 13:39:18 -04:00
Adam Treat addac25293 Set the singleton to nullptr here. 2023-10-05 13:39:18 -04:00
Adam Treat 68aca6be08 Only use vulkan with known quant that work. 2023-10-05 13:39:18 -04:00
Aaron Miller 8bea719879 vulkan: disambiguate gpus with the same name 2023-10-05 13:39:18 -04:00
Adam Treat 68cf1df6fb Throw an exception when allocation fails for vulkan. 2023-10-05 13:39:18 -04:00
Adam Treat b7e2e691d4 Completely revamp how we do object management with the vulkan backend and stop using so many static objects so we can tear down and bring up vulkan on new devices in the same runtime. 2023-10-05 13:39:18 -04:00
Adam Treat 45c8778b49 Switch to a dynamic dispatch table instead of linking hard against libvulkan. 2023-10-05 13:39:18 -04:00
Adam Treat 48a45ea435 Remove warning which fails on windows. 2023-10-05 13:39:18 -04:00
niansa ba15dfd0be Nomic vulkan backend licensed under the Software for Open Models License (SOM), version 1.0. 2023-10-05 13:39:18 -04:00