HappyZ

happyz synced commits to refs/pull/6822/merge at happyz/llama.cpp from mirror 2024-04-29 10:18:52 -07:00

cc78c2a5fd Merge 476d319fde into e00b4a8f81

e00b4a8f81 Fix more int overflow during quant (PPL/CUDA). (#6563)

7bb36ccf91 gguf : enforce that tensor names are unique (#6905)

Compare 3 commits »

happyz synced commits to refs/pull/6828/head at happyz/llama.cpp from mirror 2024-04-29 10:18:52 -07:00

9684f4c421 revise hashing function

1d516d39d1 refactor

fdc0e47088 refactor llama_ngram_cache_update

dd1b905a8f Server: enable lookup decoding

7bb36ccf91 gguf : enforce that tensor names are unique (#6905)

Compare 34 commits »

happyz synced commits to refs/pull/6829/merge at happyz/llama.cpp from mirror 2024-04-29 10:18:52 -07:00

948e08ecdb Merge 3d207c76e9133a19655833df29ae76241ed88194 into b8a7a5a90f

b8a7a5a90f build(cmake): simplify instructions (`cmake -B build && cmake --build build ...`) (#6964)

d2c898f746 ci : tmp disable gguf-split (#6983)

544f1f10ad ggml : fix __MSC_VER -> _MSC_VER (#6977)

ffe666572f llava-cli : multiple images (#6969)

Compare 16 commits »

happyz synced commits to refs/pull/6829/head at happyz/llama.cpp from mirror 2024-04-29 10:18:52 -07:00

3d207c76e9 add CI workflows

eaf154325d set TCP_NODELAY

c3ed6edb48 ggml : add RPC backend

Compare 3 commits »

happyz synced commits to refs/pull/6766/head at happyz/llama.cpp from mirror 2024-04-29 10:18:51 -07:00

9c5786161d Merge branch 'ggerganov:master' into ag_cuda_graphs

ca7f29f568 ci : add building in MSYS2 environments (Windows) (#6967)

c4f708a93f llama : fix typo LAMMAFILE -> LLAMAFILE (#6974)

e00b4a8f81 Fix more int overflow during quant (PPL/CUDA). (#6563)

7bb36ccf91 gguf : enforce that tensor names are unique (#6905)

Compare 60 commits »

happyz synced commits to refs/pull/6810/merge at happyz/llama.cpp from mirror 2024-04-29 10:18:51 -07:00

b871e8eb41 Merge eb9a1ff63d into e00b4a8f81

e00b4a8f81 Fix more int overflow during quant (PPL/CUDA). (#6563)

7bb36ccf91 gguf : enforce that tensor names are unique (#6905)

ce023f6f2f add device version in device list (#6959)

Compare 4 commits »

happyz synced commits to refs/pull/6811/merge at happyz/llama.cpp from mirror 2024-04-29 10:18:51 -07:00

c16f8c12ac Merge c70037f2b3 into b8a7a5a90f

b8a7a5a90f build(cmake): simplify instructions (`cmake -B build && cmake --build build ...`) (#6964)

d2c898f746 ci : tmp disable gguf-split (#6983)

544f1f10ad ggml : fix __MSC_VER -> _MSC_VER (#6977)

ffe666572f llava-cli : multiple images (#6969)

Compare 13 commits »

happyz synced commits to refs/pull/6784/merge at happyz/llama.cpp from mirror 2024-04-29 10:18:51 -07:00

d2bbd524de Merge 2b2fd541c2 into e00b4a8f81

e00b4a8f81 Fix more int overflow during quant (PPL/CUDA). (#6563)

7bb36ccf91 gguf : enforce that tensor names are unique (#6905)

ce023f6f2f add device version in device list (#6959)

6e472f58e4 flake.lock: Update

Compare 11 commits »

happyz synced commits to refs/pull/6778/merge at happyz/llama.cpp from mirror 2024-04-29 10:18:51 -07:00

e25f0dd9fa Merge ff5d21e608 into e00b4a8f81

e00b4a8f81 Fix more int overflow during quant (PPL/CUDA). (#6563)

7bb36ccf91 gguf : enforce that tensor names are unique (#6905)

ce023f6f2f add device version in device list (#6959)

6e472f58e4 flake.lock: Update

Compare 5 commits »

happyz synced commits to refs/pull/6766/merge at happyz/llama.cpp from mirror 2024-04-29 10:18:51 -07:00

85a4f8830f Merge 9c5786161d into b8a7a5a90f

b8a7a5a90f build(cmake): simplify instructions (`cmake -B build && cmake --build build ...`) (#6964)

d2c898f746 ci : tmp disable gguf-split (#6983)

544f1f10ad ggml : fix __MSC_VER -> _MSC_VER (#6977)

ffe666572f llava-cli : multiple images (#6969)

Compare 14 commits »

happyz synced commits to refs/pull/6640/merge at happyz/llama.cpp from mirror 2024-04-29 10:18:50 -07:00

430360261f Merge 3c02508aad into ffe666572f

ffe666572f llava-cli : multiple images (#6969)

24affa7db3 readme : update hot topics

f4ab2a4147 llama : fix BPE pre-tokenization (#6920)

3f167476b1 sampling : use std::random_device{}() for default random seed (#6962)

Compare 10 commits »

happyz synced commits to refs/pull/6602/merge at happyz/llama.cpp from mirror 2024-04-29 10:18:50 -07:00

4ff8677cff Merge 1cd0a03720 into e00b4a8f81

e00b4a8f81 Fix more int overflow during quant (PPL/CUDA). (#6563)

7bb36ccf91 gguf : enforce that tensor names are unique (#6905)

ce023f6f2f add device version in device list (#6959)

6e472f58e4 flake.lock: Update

Compare 5 commits »

happyz synced commits to refs/pull/6563/head at happyz/llama.cpp from mirror 2024-04-29 10:18:50 -07:00

0258f9bd3d Revert back to int64_t.

91c10ef225 Fix some more int overflow in softmax.

Compare 2 commits »

happyz synced commits to refs/pull/6644/merge at happyz/llama.cpp from mirror 2024-04-29 10:18:50 -07:00

63e10f5cce Merge 5d7a4595bb9f491bc4dd486f95828cdfb16982b6 into ffe666572f

ffe666572f llava-cli : multiple images (#6969)

24affa7db3 readme : update hot topics

f4ab2a4147 llama : fix BPE pre-tokenization (#6920)

3f167476b1 sampling : use std::random_device{}() for default random seed (#6962)

Compare 10 commits »

happyz synced commits to refs/pull/6511/merge at happyz/llama.cpp from mirror 2024-04-29 10:18:50 -07:00

c8aec99030 Merge 1356c5ca1e5b7598376adcbdb6963ce5b32fa8c2 into b8a7a5a90f

1356c5ca1e *.py: fix flake8 warnings

3a544b35b6 convert-hf-to-gguf.py: print() --> logger

e04a4948b7 convert-hf-to-gguf.py: add additional logging

3980bcc23d constants.py: logger no longer required

Compare 45 commits »

happyz synced commits to refs/pull/6408/merge at happyz/llama.cpp from mirror 2024-04-29 10:18:49 -07:00

ce426a6ed6 Merge 839cc90e37cd4a1348464fa36697fd6bf406d4a0 into d2c898f746

d2c898f746 ci : tmp disable gguf-split (#6983)

544f1f10ad ggml : fix __MSC_VER -> _MSC_VER (#6977)

ffe666572f llava-cli : multiple images (#6969)

24affa7db3 readme : update hot topics

Compare 12 commits »

happyz synced commits to refs/pull/6511/head at happyz/llama.cpp from mirror 2024-04-29 10:18:49 -07:00

1356c5ca1e *.py: fix flake8 warnings

3a544b35b6 convert-hf-to-gguf.py: print() --> logger

e04a4948b7 convert-hf-to-gguf.py: add additional logging

3980bcc23d constants.py: logger no longer required

6fe81af73c python-lint.yml: use .flake8 file instead

Compare 75 commits »

happyz synced commits to refs/pull/6412/merge at happyz/llama.cpp from mirror 2024-04-29 10:18:49 -07:00

5819598d0a Merge 68614cec9463d39154cd91f61ab7f74106fbcbbe into b8a7a5a90f

b8a7a5a90f build(cmake): simplify instructions (`cmake -B build && cmake --build build ...`) (#6964)

d2c898f746 ci : tmp disable gguf-split (#6983)

544f1f10ad ggml : fix __MSC_VER -> _MSC_VER (#6977)

ffe666572f llava-cli : multiple images (#6969)

Compare 20 commits »

happyz synced commits to refs/pull/6403/merge at happyz/llama.cpp from mirror 2024-04-29 10:18:49 -07:00

041cdf142e Merge 095647bf5d into 544f1f10ad

544f1f10ad ggml : fix __MSC_VER -> _MSC_VER (#6977)

ffe666572f llava-cli : multiple images (#6969)

24affa7db3 readme : update hot topics

f4ab2a4147 llama : fix BPE pre-tokenization (#6920)

Compare 40 commits »

happyz synced commits to refs/pull/6412/head at happyz/llama.cpp from mirror 2024-04-29 10:18:49 -07:00

68614cec94 Apply ggerganov's fixes for test-backend-ops

88b97b8b8d Fix documentation

ed1b1d0745 Make the GGML header look nicer

49100a84d9 Remove bf16 luts

700db7d457 Minimize the GGML API surface area for BF16

Compare 18 commits »