llama.cpp/.github/workflows
Reese Levine 7ca5991d2b
ggml webgpu: add support for emscripten builds (#17184)
* Faster tensors (#8)

Add fast matrix and matrix/vector multiplication.

* Use map for shader replacements instead of pair of strings

* Wasm (#9)

* webgpu : fix build on emscripten

* more debugging stuff

* test-backend-ops: force single thread on wasm

* fix single-thread case for init_tensor_uniform

* use jspi

* add pthread

* test: remember to set n_thread for cpu backend

* Add buffer label and enable dawn-specific toggles to turn off some checks

* Intermediate state

* Fast working f16/f32 vec4

* Working float fast mul mat

* Clean up naming of mul_mat to match logical model, start work on q mul_mat

* Setup for subgroup matrix mat mul

* Basic working subgroup matrix

* Working subgroup matrix tiling

* Handle weirder sg matrix sizes (but still % sg matrix size)

* Working start to gemv

* working f16 accumulation with shared memory staging

* Print out available subgroup matrix configurations

* Vectorize dst stores for sg matrix shader

* Gemv working scalar

* Minor set_rows optimization (#4)

* updated optimization, fixed errors

* non vectorized version now dispatches one thread per element

* Simplify

* Change logic for set_rows pipelines

---------

Co-authored-by: Neha Abbas <nehaabbas@macbookpro.lan>
Co-authored-by: Neha Abbas <nehaabbas@ReeseLevines-MacBook-Pro.local>
Co-authored-by: Reese Levine <reeselevine1@gmail.com>

* Comment on dawn toggles

* Working subgroup matrix code for (semi)generic sizes

* Remove some comments

* Cleanup code

* Update dawn version and move to portable subgroup size

* Try to fix new dawn release

* Update subgroup size comment

* Only check for subgroup matrix configs if they are supported

* Add toggles for subgroup matrix/f16 support on nvidia+vulkan

* Make row/col naming consistent

* Refactor shared memory loading

* Move sg matrix stores to correct file

* Working q4_0

* Formatting

* Work with emscripten builds

* Fix test-backend-ops emscripten for f16/quantized types

* Use emscripten memory64 to support get_memory

* Add build flags and try ci

---------

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>

* Remove extra whitespace

* Move wasm single-thread logic out of test-backend-ops for cpu backend

* Disable multiple threads for emscripten single-thread builds in ggml_graph_plan

* Fix .gitignore

* Add memory64 option and remove unneeded macros for setting threads to 1

---------

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2025-12-03 10:25:34 +01:00
| File | Last commit | Date |
| --- | --- | --- |
| bench.yml.disabled | llama : move end-user examples to tools directory (#13249) | 2025-05-02 20:27:13 +02:00 |
| build-cache.yml | ci : refactor sdk caching to minimize storage (#16414) | 2025-10-06 17:40:21 +02:00 |
| build-cmake-pkg.yml | ci: add workflow for relocatable cmake package (#14346) | 2025-06-23 15:30:51 -03:00 |
| build-linux-cross.yml | ci : disable failing riscv cross build (#16952) | 2025-11-02 23:11:21 +01:00 |
| build.yml | ggml webgpu: add support for emscripten builds (#17184) | 2025-12-03 10:25:34 +01:00 |
| check-vendor.yml | ci: add check vendor job (#17179) | 2025-11-12 14:56:02 +01:00 |
| close-issue.yml | ci : exempt correct research label (#15825) | 2025-09-06 01:21:15 +02:00 |
| copilot-setup-steps.yml | ci : add copilot-instructions.md (#15286) | 2025-08-21 11:47:52 +02:00 |
| docker.yml | ci : enable free-disk-space on cuda docker build (#16877) | 2025-10-31 00:34:27 +01:00 |
| editorconfig.yml | ci : pin dependency to specific version (#11137) | 2025-01-08 12:07:20 +01:00 |
| gguf-publish.yml | ci : update checkout, setup-python and upload-artifact to latest (#6456) | 2024-04-03 21:01:13 +03:00 |
| labeler.yml | repo : update links to new url (#11886) | 2025-02-15 16:40:57 +02:00 |
| pre-tokenizer-hashes.yml | ci : check that pre-tokenizer hashes are up-to-date (#15032) | 2025-08-02 14:39:01 +02:00 |
| python-check-requirements.yml | py : fix requirements check '==' -> '~=' (#8982) | 2024-08-12 11:02:01 +03:00 |
| python-lint.yml | ci : add ubuntu cuda build, build with one arch on windows (#10456) | 2024-11-26 13:05:07 +01:00 |
| python-type-check.yml | ci : reduce severity of unused Pyright ignore comments (#9697) | 2024-09-30 14:13:16 -04:00 |
| release.yml | ci : move release details to the top visible by default (#17719) | 2025-12-03 09:22:46 +01:00 |
| server.yml | ci : switch to BoringSSL on Server workflow (#17441) | 2025-11-22 21:38:19 +01:00 |
| update-ops-docs.yml | ci : avoid manual updates of docs/ops.md (#16663) | 2025-10-19 14:03:25 +02:00 |
| winget.yml | ci : skip winget update when not in ggml-org (#17465) | 2025-12-02 10:15:01 +01:00 |