Aaron Teo
adbfbf9086
ggml-blas: refactor backend
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-20 15:11:45 +08:00
Aaron Teo
2ee4d5fe2f
ggml-blas: fix graph realloc
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-20 14:03:24 +08:00
Aaron Teo
623e7135c2
ggml-blas: fix memleak
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-20 13:37:26 +08:00
Aaron Teo
04ed19bbc0
ggml-blas: further cleanup
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-14 23:37:56 +08:00
Aaron Teo
10ce5e056d
ggml-blas: more code formatting
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-14 23:20:28 +08:00
Aaron Teo
75e506ff22
ggml-blas: clean up code
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-14 23:19:22 +08:00
Aaron Teo
7998d08b29
ggml-blas: bring back openmp
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-14 23:07:54 +08:00
Aaron Teo
e481be6da6
ggml-blas: move global blas n threads to set_n_threads
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-14 22:19:12 +08:00
Aaron Teo
6dff031caa
ggml-blas: force dequant routine to use max logical cores
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-14 21:57:09 +08:00
Aaron Teo
447057973c
ggml-blas: fix ne
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-14 19:27:42 +08:00
Aaron Teo
717531b1a7
ggml-blas: add note
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-14 19:22:14 +08:00
Aaron Teo
aae6d1e9b0
ggml-blas: fix invalid data access
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-14 19:15:34 +08:00
Aaron Teo
9a14a094ac
ggml: rewrite ggml-blas
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-14 18:06:31 +08:00
Aaron Teo
1926e07e1a
ggml-blas: code clean up
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-11 21:27:13 +08:00
Aaron Teo
19c8ec9964
ggml-blas: fully working mmid
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-11 21:10:25 +08:00
Aaron Teo
f682374613
ggml-blas: initial mmid impl
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-11 20:51:02 +08:00
Jeff Bolz
c0b45097c3
rename optimize_graph to graph_optimize (#16082)
2025-09-18 13:46:17 -05:00
Jeff Bolz
e68aa10d8f
vulkan: sort graph to allow more parallel execution (#15850)
* vulkan: sort graph to allow more parallel execution
Add a backend proc to allow the backend to modify the graph. The
vulkan implementation looks at which nodes depend on each other
and greedily reorders them to group together nodes that don't
depend on each other. It only reorders the nodes, doesn't change
the contents of any of them.
With #15489, this reduces the number of synchronizations needed.
* call optimize_graph per-split
2025-09-09 02:10:07 +08:00
AN Long
cd6983d56d
ggml : fix field name when new ggml_backend (#14944)
2025-08-08 14:37:22 +02:00
Diego Devesa
5931c1f233
ggml : add support for dynamic loading of backends (#10469)
* ggml : add support for dynamic loading of backends
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-11-25 15:13:39 +01:00
Diego Devesa
ae8de6d50a
ggml : build backends as libraries (#10256)
* ggml : build backends as libraries
---------
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: R0CKSTAR <xiaodong.ye@mthreads.com>
2024-11-14 18:04:35 +01:00