Default Branch

4fd59e8427 · ggml-cuda: use CMAKE_CUDA_ARCHITECTURES if set when GGML_NATIVE=ON (#18413) · Updated 2025-12-27 17:33:14 -08:00

Branches (counts are commits behind / ahead of the default branch)

6ccbd1777a · wip · Updated 2024-01-24 05:45:04 -08:00 · happyz · 5630 behind, 18 ahead

da23b56f25 · wip : no ic 8 step · Updated 2024-01-24 03:25:34 -08:00 · happyz · 5630 behind, 18 ahead

06c2d0d117 · wip · Updated 2024-01-23 12:42:43 -08:00 · happyz · 5630 behind, 14 ahead

a9681febd6 · ggml : online attention (CPU) · Updated 2024-01-20 06:45:41 -08:00 · happyz · 5630 behind, 4 ahead

32a392fe68 · try a differerent fix · Updated 2024-01-19 14:10:23 -08:00 · happyz · 5631 behind, 2 ahead

4a3bc1522e · py : linting with mypy and isort · Updated 2024-01-19 12:18:58 -08:00 · happyz · 5632 behind, 3 ahead

1453215165 · kompute : fix ggml_add kernel · Updated 2024-01-18 14:09:16 -08:00 · happyz · 5748 behind, 105 ahead

ccc78a200e · hellaswag: speed up even more by parallelizing log-prob evaluation · Updated 2024-01-18 08:25:29 -08:00 · happyz · 5648 behind, 1 ahead

2917e6b528 · Merge branch 'master' into gg/imatrix-gpu-4931 · Updated 2024-01-17 08:43:45 -08:00 · happyz · 5655 behind, 10 ahead

23742deb5b · py : fix padded dummy tokens (I hope) · Updated 2024-01-17 05:44:22 -08:00 · happyz · 5674 behind, 4 ahead

9fd1e83f6d · Use Q4_K for attn_v for Q2_K_S when n_gqa >= 4 · Updated 2024-01-17 02:16:08 -08:00 · happyz · 5660 behind, 1 ahead

49bafe0986 · tests : avoid creating RNGs for each tensor · Updated 2024-01-17 00:40:55 -08:00 · happyz · 5663 behind, 6 ahead

bb9abb5cd8 · imatrix: guard Q4_0/Q5_0 against ffn_down craziness · Updated 2024-01-15 23:56:05 -08:00 · happyz · 5677 behind, 2 ahead

9998ecd191 · llama : add phixtral support (wip) · Updated 2024-01-13 04:24:07 -08:00 · happyz · 5707 behind, 1 ahead

1fb563ebdc · py : try to fix flake stuff · Updated 2024-01-13 03:42:35 -08:00 · happyz · 5708 behind, 2 ahead

9bfcb16fd3 · Add llama enum for IQ2_XS · Updated 2024-01-11 08:24:12 -08:00 · happyz · 5757 behind, 11 ahead

24096933b0 · server : try to fix infill when prompt is empty · Updated 2024-01-09 01:27:29 -08:00 · happyz · 5759 behind, 1 ahead

7216af5c09 · ggml : fix 32-bit ARM compat (cont) · Updated 2024-01-09 00:33:16 -08:00 · happyz · 5762 behind, 2 ahead

d57cb9c294 · passkey : add readme · Updated 2024-01-08 01:13:44 -08:00 · happyz · 5772 behind, 7 ahead

7cfde78190 · llama : remove redundant GQA check · Updated 2024-01-06 06:04:20 -08:00 · happyz · 5780 behind, 1 ahead