Default Branch

58062860af · ggml : use WARP_SIZE/2 for argmax reduction offset (#18092) · Updated 2025-12-16 19:47:01 -08:00

Branches

652d303b32 · metal : fuse add + rms · Updated 2025-09-18 06:29:25 -07:00

934 behind · 1 ahead

64c6dcbe6d · metal : make the NSG a function constant in mul_mv kernels · Updated 2025-09-18 01:31:59 -07:00

939 behind · 2 ahead

6045c5a263 · cont : put all buffers in the same virtual address space · Updated 2025-09-14 05:46:57 -07:00

975 behind · 2 ahead

3f62ee8bee · metal : back to a single queue per device · Updated 2025-09-09 07:06:46 -07:00

1017 behind · 9 ahead

3f62ee8bee · metal : back to a single queue per device · Updated 2025-09-09 07:06:46 -07:00

1017 behind · 9 ahead

7b717fb4b2 · Rewrite llama-run to use llama-server · Updated 2025-09-05 09:22:36 -07:00

1054 behind · 1 ahead

9f2636b7dc · wip · Updated 2025-09-01 01:17:56 -07:00

1103 behind · 1 ahead

d8c17629ac · examples : add compare-mlx · Updated 2025-08-31 23:10:01 -07:00

1106 behind · 1 ahead

4317d5abf5 · wip · Updated 2025-08-28 03:55:21 -07:00

1137 behind · 1 ahead

dc2187d48d · ggml : fix SSM_SCAN for n_groups > 1 · Updated 2025-08-27 14:37:04 -07:00

1142 behind · 1 ahead

7a152de3bb · vulkan: enable Conv2D for Apple after MoltenVK fixed the bug · Updated 2025-08-23 06:57:15 -07:00

1187 behind · 1 ahead

fb573f4440 · ggml-quants : avoid division by zero in make_q3_quants · Updated 2025-08-17 15:26:02 -07:00

1257 behind · 2 ahead

220860aa0c · graph : use F32 accumulators for gpt-oss · Updated 2025-08-14 06:08:31 -07:00

1284 behind · 1 ahead

d9b625edb6 · ggml-quants : handle imatrix for MXFP4 · Updated 2025-08-11 19:12:10 -07:00

1310 behind · 1 ahead

2763dc8b53 · ggml-quants : handle zero amax for MXFP4 · Updated 2025-08-06 13:26:25 -07:00

1348 behind · 2 ahead

ea5e55d03e · Merge branch 'master' into compilade/imatrix-neutral-prior · Updated 2025-08-05 10:34:40 -07:00

1350 behind · 4 ahead

2ec70c964b · tests: Fix OPT_STEP_SGD test-backend-ops · Updated 2025-08-04 21:57:14 -07:00

1356 behind · 4 ahead

145401c9e3 · context : fix logits size overflow for huge batches · Updated 2025-08-04 19:26:46 -07:00

1355 behind · 2 ahead

342e7014db · imatrix : only warn about suffix when output format is unspecified · Updated 2025-08-04 12:12:27 -07:00

1360 behind · 2 ahead

e549515cb3 · memory : handle kv_unified for hybrid models · Updated 2025-08-02 21:45:47 -07:00

1369 behind · 1 ahead