Default Branch

07a0c4ba92 · Revert "ggml-cuda: use CMAKE_CUDA_ARCHITECTURES if set when GGML_NATIVE=ON (#18413)" (#18426) · Updated 2025-12-28 04:53:36 -08:00

Branches

608f449880 · swift : fix build · Updated 2024-02-23 09:02:09 -08:00    happyz

5313 behind · 4 ahead

56c047156a · py : minor fixes · Updated 2024-02-22 09:22:56 -08:00    happyz

5322 behind · 1 ahead

5271c75666 · llama : fix K-shift with quantized K (wip) · Updated 2024-02-21 15:28:42 -08:00    happyz

5330 behind · 1 ahead

f249c997a8 · llama : adapt to F16 KQ_pos · Updated 2024-02-19 03:31:02 -08:00    happyz

5368 behind · 62 ahead

412735ec70 · Merge branch 'master' into gg/metal-batched · Updated 2024-02-19 01:25:24 -08:00    happyz

5368 behind · 6 ahead

47c662b0de · fix some spaces added by IDE in math op · Updated 2024-02-18 12:40:35 -08:00    happyz

5378 behind · 4 ahead

974e3cadff · ggml : try another fix · Updated 2024-02-17 08:14:35 -08:00    happyz

5397 behind · 2 ahead

e856bfed3b · hf : add support for --repo and --file · Updated 2024-02-15 05:05:15 -08:00    happyz

5411 behind · 3 ahead

ccd757a174 · convert : fix mistakes from refactoring · Updated 2024-02-13 09:01:30 -08:00    happyz

5419 behind · 4 ahead

5c977221d2 · iq1_s: slightly faster dot product · Updated 2024-02-13 05:18:27 -08:00    happyz

5425 behind · 15 ahead

4246b71ad7 · Fix compiler warnings (shadow variable) · Updated 2024-02-12 22:44:56 -08:00    happyz

5428 behind · 1 ahead

7286b83d3f · BERT WIP · Updated 2024-02-06 14:10:11 -08:00    happyz

5481 behind · 1 ahead

adcf16fd68 · py : fix empty bytes arg · Updated 2024-02-05 09:53:07 -08:00    happyz

5491 behind · 2 ahead

91c453fb11 · One cannot possibly be defining static_assert in a C++ compilation · Updated 2024-02-05 03:22:14 -08:00    happyz

5496 behind · 2 ahead

49a483e0f2 · wip · Updated 2024-02-04 02:34:36 -08:00    happyz

5522 behind · 60 ahead

a647257b47 · cuda : express strides with helper constants · Updated 2024-02-04 01:45:26 -08:00    happyz

5522 behind · 60 ahead

b957b8f5f6 · cuda : add flash_attn kernel (wip) · Updated 2024-02-01 09:49:57 -08:00    happyz

5526 behind · 39 ahead

ac26f27028 · cuda : increase C to 128 for better performance · Updated 2024-02-01 07:08:29 -08:00    happyz

5526 behind · 61 ahead

1ad42b1f1e · ggml : ggml_soft_max uses F16 mask · Updated 2024-01-31 10:33:59 -08:00    happyz

5526 behind · 36 ahead

719a087138 · iq3_xxs: forgotten update of the grid points · Updated 2024-01-30 08:39:07 -08:00    happyz

5540 behind · 1 ahead