Default Branch

07a0c4ba92 · Revert "ggml-cuda: use CMAKE_CUDA_ARCHITECTURES if set when GGML_NATIVE=ON (#18413)" (#18426) · Updated 2025-12-28 04:53:36 -08:00

Branches

3750706962 · llama : add llama_token_is_eog() · Updated 2024-04-20 06:52:03 -07:00

4868
4

f02ea667c1 · ggml : temporary disable llamafile sgemm until fixed · Updated 2024-04-16 12:45:56 -07:00

4877
1

eedd42e376 · KV Cache defrag hash overflow - TMP Fix by @slaren · Updated 2024-04-16 01:24:34 -07:00

4880
1

8b495540fa · imatrix : remove invalid assert · Updated 2024-04-12 01:45:12 -07:00    happyz

4905
1

072e0a4d3b · scripts : add LICENSE and gen-authors.sh to sync · Updated 2024-04-08 23:19:33 -07:00    happyz

4981
3

a37696d4f1 · speculative : more robust tokenizer comparison · Updated 2024-04-04 15:28:13 -07:00    happyz

4951
9

4c190ba676 · cuda : reduce registers · Updated 2024-03-28 12:17:08 -07:00    happyz

4993
77

64b7d85891 · llama : fix command-r inference · Updated 2024-03-28 03:22:24 -07:00    happyz

4998
1

6be02b5969 · cuda : fix build · Updated 2024-03-27 01:31:52 -07:00    happyz

5015
72

87a6088ffe · rename unicodedata.{cpp,h} to unicode-data.{cpp,h} · Updated 2024-03-26 07:52:33 -07:00    happyz

5030
7

9c5fd6be14 · minor : spacing · Updated 2024-03-26 05:09:02 -07:00    happyz

5028
2

6f20e2672f · Include IQ2_XXS and IQ2_XS in test-quantize-fns · Updated 2024-03-25 10:01:20 -07:00    happyz

5032
1

210e469114 · cuda : fix LLAMA_CUDA_F16 build · Updated 2024-03-25 07:31:10 -07:00    happyz

5034
1

d05c13b3b9 · llama : fix BPE LF token on MSVC · Updated 2024-03-23 11:03:16 -07:00    happyz

5054
3

3a468e6f9f · llama : fix type of KQ_mask and KQ_pos · Updated 2024-03-22 08:12:17 -07:00    happyz

5059
68

0e826d12a5 · quantize: be able to specify the token embedding tensor type · Updated 2024-03-22 07:27:34 -07:00    happyz

5068
2

8c3d5b5a79 · common : remove defaults · Updated 2024-03-22 06:33:24 -07:00    happyz

5064
2

12aa74ba7d · minor : spacing · Updated 2024-03-22 06:24:57 -07:00    happyz

5289
6

072c56fcdb · metal : fix the fix · Updated 2024-03-22 00:58:22 -07:00    happyz

5069
3

a710d58d88 · Try fix quantized k-cache on ROCm · Updated 2024-03-21 11:18:50 -07:00    happyz

5074
1