Rail Chabdarov
340807273b
hip: Avoid compiler bug in RDNA code generation during debug builds on Windows ( #20655 )
2026-03-19 19:14:08 +01:00
uvos
b49d8b8757
ci : add hip quality check ( #20430 )
...
* CI: add hip quality check
* Update scripts/hip/gcn-cdna-vgpr-check.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update .github/workflows/hip-quality-check.yml
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update .github/workflows/hip-quality-check.yml
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update .github/workflows/hip-quality-check.yml
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update scripts/hip/gcn-cdna-vgpr-check.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update scripts/hip/gcn-cdna-vgpr-check.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update scripts/hip/gcn-cdna-vgpr-check.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update scripts/hip/gcn-cdna-vgpr-check.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Revert "Update .github/workflows/hip-quality-check.yml"
This reverts commit efa0bfcdb01dfac0feee674987a0482d50f46145.
* scripts: gcn-cdna-vgpr-check.py: enforce int type for total_vgprs
* scripts: gcn-cdna-vgpr-check.py: add flash attention instances to ignore list
* Bump ccache version
* Add mssing seperators to list
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2026-03-19 17:05:44 +01:00
uvos
d63aa398de
hip: compile debug builds with -O2 on hip to avoid a compiler bug ( #20392 )
2026-03-12 10:37:10 +08:00
yulo
f3dd7b8e68
HIP: add mmf for CDNA ( #18896 )
...
* refactor mmf rows_per_block
* speed up compile
* pass cdna compile
* fix cuda error
* clean up mmf
* f32 mmf
* clean float mma
* fix mmf error
* faster mmf
* extend tile k
* fix compile error
* Revert "extend tile k"
This reverts commit 4d2ef3d483 .
* fix smem overflow
* speed up compiling mmf
* speed up compile for hip
* 512 block for cdna
* config pad size
* fix as comment
* update select logic
* move some code to cuh
* fix as comment
* correct cdna3 config
---------
Co-authored-by: zhang hui <you@example.com>
2026-01-29 11:10:53 +01:00
Johannes Gäßler
80d28f104c
HIP: fix AMDGPU_TARGETS, update documentation ( #16803 )
2025-10-27 21:39:49 +01:00
Johannes Gäßler
ee09828cb0
HIP: fix GPU_TARGETS ( #16642 )
2025-10-18 14:47:32 +02:00
Johannes Gäßler
11f0af5504
CUDA: faster tile FA, add oob checks, more HSs ( #16492 )
2025-10-11 20:54:32 +02:00
uvos
e95fec640f
HIP: Disable ROCWMMA fattn on CDNA when compiled against ROCWMMA 2.0.0 ( #16221 )
...
* HIP: Disable ROCWMMA fatt on CDNA when compiled against ROCWMMA 2.0.0
rocwmma 2.0.0 includes a bug in the code fakeing fp16 accumulation on CDNA
* CUDA: Fix volta condition in ggml_cuda_should_use_wmma_fattn
2025-10-01 23:09:25 +02:00
uvos
29c8fbe4e0
HIP: bump requirement to rocm 6.1 ( #15296 )
2025-08-13 20:44:30 +02:00
uvos
7ad67ba9fe
HIP: add cmake option to enable compiler output of kernel resource usage metrics ( #15103 )
2025-08-07 16:44:14 +02:00
uvos
b77d11179d
HIP: add GGML_HIP_MMQ_MFMA option to allow disableing the MFMA path. ( #14930 )
...
This is useful for testing for regressions on GCN with CDNA hardware.
With GGML_HIP_MMQ_MFMA=Off and GGML_CUDA_FORCE_MMQ=On we can conveniently test the GCN code path on CDNA. As CDNA is just GCN renamed with MFMA added and limited use ACC registers, this provides a good alternative for regression testing when GCN hardware is not available.
2025-07-29 17:44:30 +02:00
uvos
7d6d91babf
HIP: disable rocwmma on gfx12 by default until rocm 7.0 ( #14202 )
2025-06-16 13:47:38 +02:00
David Huang
84778e9770
CUDA/HIP: Share the same unified memory allocation logic. ( #12934 )
...
Replace compile-time `GGML_HIP_UMA` with environment variable `GGML_CUDA_ENABLE_UNIFIED_MEMORY`. This unifies the usage on NVIDIA and AMD GPUs, and allows a single binary to be shared between integrated and dedicated GPUs.
2025-04-15 11:20:38 +02:00
David Huang
becade5de7
HIP: implement FlashAttention via rocWMMA for CDNA and RDNA3+ ( #12032 )
...
Adds GGML_HIP_ROCWMMA_FATTN and rocwmma header check
Adds rocWMMA support to fattn-wmma-f16
---
Signed-off-by: Carl Klemm <carl@uvos.xyz>
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
Co-authored-by: Ben Jackson <ben@ben.com>
2025-03-03 22:10:54 +01:00
Johannes Gäßler
a28e0d5eb1
CUDA: app option to compile without FlashAttention ( #12025 )
2025-02-22 20:44:34 +01:00
fxzjshm
3ec9fd4b77
HIP: force max threads per block to be 1024 ( #11621 )
...
Some old/vendor forked version of llvm still use 256. Explicitly set it to 1024 to align with upstream llvm.
Signed-off-by: fxzjshm <fxzjshm@163.com>
2025-02-04 19:18:38 +01:00
Johannes Gäßler
864a0b67a6
CUDA: use mma PTX instructions for FlashAttention ( #11583 )
...
* CUDA: use mma PTX instructions for FlashAttention
* __shfl_sync workaround for movmatrix
* add __shfl_sync to HIP
Co-authored-by: Diego Devesa <slarengh@gmail.com>
2025-02-02 19:31:09 +01:00
uvos
27d135c970
HIP: require at least HIP 5.5
2025-01-30 16:25:44 +01:00
uvos
26771a1491
Hip: disable VMM on hip as it seams that it dosent work in some configurations ( #11420 )
2025-01-25 21:01:12 +01:00
uvos
5f0db9522f
hip : Add hipGraph and VMM support to ROCM ( #11362 )
...
* Add hipGraph support
* Enable VMM on rocm
2025-01-25 00:02:23 +01:00
Radoslav Gerganov
1244cdcf14
ggml : do not define GGML_USE_CUDA when building with GGML_BACKEND_DL ( #11211 )
...
Build fails when using HIP and GGML_BACKEND_DL:
```
/usr/bin/ld: ../ggml/src/libggml.so: undefined reference to `ggml_backend_cuda_reg'
collect2: error: ld returned 1 exit status
```
This patch fixes this.
2025-01-13 13:31:41 +02:00
Diego Devesa
5931c1f233
ggml : add support for dynamic loading of backends ( #10469 )
...
* ggml : add support for dynamic loading of backends
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-11-25 15:13:39 +01:00
Johannes Gäßler
c3ea58aca4
CUDA: remove DMMV, consolidate F16 mult mat vec ( #10318 )
2024-11-17 09:09:55 +01:00
Diego Devesa
ae8de6d50a
ggml : build backends as libraries ( #10256 )
...
* ggml : build backends as libraries
---------
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: R0CKSTAR <xiaodong.ye@mthreads.com>
2024-11-14 18:04:35 +01:00