Johannes Gäßler
4dc3d10e80
Remove shfl and AllReduce from backend interface
2026-02-11 14:51:37 +01:00
Johannes Gäßler
a0d9dd20ee
ggml: backend-agnostic tensor parallelism
2026-02-11 14:12:33 +01:00
Georgi Gerganov
365a3e8c31
ggml : add ggml_build_forward_select (#18550)
* ggml : add ggml_build_forward_select
* cuda : adapt CUDA graph compat to new feature
* vulkan : update logic to handle command buffer closing
* ggml : check compute for fusion
* ggml : add comment
2026-01-19 20:03:19 +02:00
Perry Naseck
657a2e644b
cmake : update blas logic (#18205)
2026-01-10 18:00:54 +02:00
Jeff Bolz
c0b45097c3
rename optimize_graph to graph_optimize (#16082)
2025-09-18 13:46:17 -05:00
Jeff Bolz
e68aa10d8f
vulkan: sort graph to allow more parallel execution (#15850)
* vulkan: sort graph to allow more parallel execution
Add a backend proc to allow the backend to modify the graph. The
vulkan implementation looks at which nodes depend on each other
and greedily reorders them to group together nodes that don't
depend on each other. It only reorders the nodes and doesn't change
the contents of any of them.
With #15489, this reduces the number of synchronizations needed.
* call optimize_graph per-split
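The reordering idea described in this commit message can be sketched roughly as follows. This is a minimal, self-contained illustration and not the Vulkan backend's actual code: the `Node` struct and the `greedy_reorder` function are hypothetical names invented for the sketch, and it assumes the input order is already a valid topological order (each node's dependencies have smaller indices).

```cpp
#include <vector>

// Hypothetical stand-in for a graph node: only its dependencies are modeled.
struct Node {
    std::vector<int> deps; // indices (in the original order) of nodes this one reads from
};

// Returns a permutation of node indices. Nodes are emitted in groups whose
// members do not depend on each other; every dependency of a node is emitted
// in an earlier group. Node contents are never modified, only the order.
std::vector<int> greedy_reorder(const std::vector<Node> & nodes) {
    const int n = (int) nodes.size();
    std::vector<int>  order;
    std::vector<char> done(n, 0); // emitted in an earlier, finished group
    std::vector<char> used(n, 0); // emitted at all, including the current group

    while ((int) order.size() < n) {
        std::vector<int> group;
        for (int i = 0; i < n; ++i) {
            if (used[i]) {
                continue;
            }
            // A node may join the current group only if all of its inputs
            // were produced by earlier groups, not by this one.
            bool ready = true;
            for (int d : nodes[i].deps) {
                if (!done[d]) {
                    ready = false;
                    break;
                }
            }
            if (ready) {
                group.push_back(i);
                used[i] = 1;
            }
        }
        if (group.empty()) {
            break; // only reachable if the input violates the ordering assumption
        }
        for (int i : group) {
            done[i] = 1;
            order.push_back(i);
        }
    }
    return order;
}
```

For a graph where node 1 reads node 0, node 3 reads node 2, and node 4 reads nodes 1 and 3, the sketch groups the two independent chains together, so a barrier would only be needed between groups rather than between every pair of dependent nodes.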
2025-09-09 02:10:07 +08:00
AN Long
cd6983d56d
ggml : fix field name when new ggml_backend (#14944)
2025-08-08 14:37:22 +02:00
Diego Devesa
5931c1f233
ggml : add support for dynamic loading of backends (#10469)
* ggml : add support for dynamic loading of backends
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-11-25 15:13:39 +01:00
Diego Devesa
ae8de6d50a
ggml : build backends as libraries (#10256)
* ggml : build backends as libraries
---------
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: R0CKSTAR <xiaodong.ye@mthreads.com>
2024-11-14 18:04:35 +01:00