Commit Graph

21 Commits

Author SHA1 Message Date
Aaron Teo adbfbf9086
ggml-blas: refactor backend
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-20 15:11:45 +08:00
Aaron Teo 2ee4d5fe2f
ggml-blas: fix graph realloc
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-20 14:03:24 +08:00
Aaron Teo 623e7135c2
ggml-blas: fix memleak
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-20 13:37:26 +08:00
Aaron Teo 04ed19bbc0
ggml-blas: further cleanup
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-14 23:37:56 +08:00
Aaron Teo 10ce5e056d
ggml-blas: more code formatting
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-14 23:20:28 +08:00
Aaron Teo 75e506ff22
ggml-blas: clean up code
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-14 23:19:22 +08:00
Aaron Teo 7998d08b29
ggml-blas: bring back openmp
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-14 23:07:54 +08:00
Aaron Teo e481be6da6
ggml-blas: move global blas n threads to set_n_threads
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-14 22:19:12 +08:00
Aaron Teo 6dff031caa
ggml-blas: force dequant routine to use max logical cores
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-14 21:57:09 +08:00
Aaron Teo 447057973c
ggml-blas: fix ne
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-14 19:27:42 +08:00
Aaron Teo 717531b1a7
ggml-blas: add note
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-14 19:22:14 +08:00
Aaron Teo aae6d1e9b0
ggml-blas: fix invalid data access
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-14 19:15:34 +08:00
Aaron Teo 9a14a094ac
ggml: rewrite ggml-blas
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-14 18:06:31 +08:00
Aaron Teo 1926e07e1a
ggml-blas: code clean up
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-11 21:27:13 +08:00
Aaron Teo 19c8ec9964
ggml-blas: fully working mmid
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-11 21:10:25 +08:00
Aaron Teo f682374613
ggml-blas: initial mmid impl
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-12-11 20:51:02 +08:00
Jeff Bolz c0b45097c3
rename optimize_graph to graph_optimize (#16082)
2025-09-18 13:46:17 -05:00
Jeff Bolz e68aa10d8f
vulkan: sort graph to allow more parallel execution (#15850)
* vulkan: sort graph to allow more parallel execution

Add a backend proc to allow the backend to modify the graph. The
vulkan implementation looks at which nodes depend on each other
and greedily reorders them to group together nodes that don't
depend on each other. It only reorders the nodes, doesn't change
the contents of any of them.

With #15489, this reduces the number of synchronizations needed.

* call optimize_graph per-split
2025-09-09 02:10:07 +08:00
AN Long cd6983d56d
ggml : fix field name when new ggml_backend (#14944)
2025-08-08 14:37:22 +02:00
Diego Devesa 5931c1f233
ggml : add support for dynamic loading of backends (#10469)
* ggml : add support for dynamic loading of backends

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-11-25 15:13:39 +01:00
Diego Devesa ae8de6d50a
ggml : build backends as libraries (#10256)
* ggml : build backends as libraries

---------

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: R0CKSTAR <xiaodong.ye@mthreads.com>
2024-11-14 18:04:35 +01:00
Renamed from ggml/src/ggml-blas.cpp