HappyZ

happyz synced commits to refs/pull/7267/merge at happyz/llama.cpp from mirror 2024-05-15 10:41:58 -07:00

91bee717aa Merge 2a9a84be7d into dc020985b8

dc020985b8 Avoid unnecessarily disabling CUDA graphs (#7302)

344f9126cc ggml : tag ggml_tensor::backend as deprecated (#7290)

9a17ab914b Add missing " (#7303)

ea3b0590ee embedding : free the batch after execution (#7297)

Compare 8 commits »

happyz synced commits to refs/pull/7225/merge at happyz/llama.cpp from mirror 2024-05-15 10:41:57 -07:00

3e60809d79 Merge 38b348e28f into dc020985b8

dc020985b8 Avoid unnecessarily disabling CUDA graphs (#7302)

344f9126cc ggml : tag ggml_tensor::backend as deprecated (#7290)

9a17ab914b Add missing " (#7303)

ea3b0590ee embedding : free the batch after execution (#7297)

Compare 8 commits »

happyz synced commits to refs/pull/7239/merge at happyz/llama.cpp from mirror 2024-05-15 10:41:57 -07:00

c2ea92740f Merge f4f5b7ac56 into ea3b0590ee

ea3b0590ee embedding : free the batch after execution (#7297)

29499bb593 sync : ggml

48aa8fd1f2 ggml : add `ggml_upscale_ext` (ggml/814)

583fd6b000 server bench: fix bench not waiting for model load (#7284)

Compare 5 commits »

happyz synced commits to refs/pull/7237/merge at happyz/llama.cpp from mirror 2024-05-15 10:41:57 -07:00

bd37d6f971 Merge 9c5d3fcffb into dc020985b8

dc020985b8 Avoid unnecessarily disabling CUDA graphs (#7302)

344f9126cc ggml : tag ggml_tensor::backend as deprecated (#7290)

9a17ab914b Add missing " (#7303)

ea3b0590ee embedding : free the batch after execution (#7297)

Compare 8 commits »

happyz synced commits to refs/pull/7198/merge at happyz/llama.cpp from mirror 2024-05-15 10:41:57 -07:00

274daed3de Merge cc3df3f388 into 583fd6b000

583fd6b000 server bench: fix bench not waiting for model load (#7284)

Compare 2 commits »

happyz synced commits to refs/pull/7191/merge at happyz/llama.cpp from mirror 2024-05-15 10:41:57 -07:00

2dc2128633 Merge ece01fc2e9 into dc020985b8

ece01fc2e9 matmul-int8: remove unnecessary casts in q8_0_q8_0

dc020985b8 Avoid unnecessarily disabling CUDA graphs (#7302)

344f9126cc ggml : tag ggml_tensor::backend as deprecated (#7290)

9a17ab914b Add missing " (#7303)

Compare 9 commits »

happyz synced commits to refs/pull/7117/merge at happyz/llama.cpp from mirror 2024-05-15 10:41:56 -07:00

87d4bb41f1 Merge 58551d0bd2 into dc020985b8

dc020985b8 Avoid unnecessarily disabling CUDA graphs (#7302)

344f9126cc ggml : tag ggml_tensor::backend as deprecated (#7290)

9a17ab914b Add missing " (#7303)

ea3b0590ee embedding : free the batch after execution (#7297)

Compare 8 commits »

happyz synced commits to refs/pull/7154/merge at happyz/llama.cpp from mirror 2024-05-15 10:41:56 -07:00

e77c023b1e Merge d7359a389c into dc020985b8

dc020985b8 Avoid unnecessarily disabling CUDA graphs (#7302)

344f9126cc ggml : tag ggml_tensor::backend as deprecated (#7290)

9a17ab914b Add missing " (#7303)

ea3b0590ee embedding : free the batch after execution (#7297)

Compare 8 commits »

happyz synced commits to refs/pull/7191/head at happyz/llama.cpp from mirror 2024-05-15 10:41:56 -07:00

ece01fc2e9 matmul-int8: remove unnecessary casts in q8_0_q8_0

happyz synced commits to refs/pull/7020/merge at happyz/llama.cpp from mirror 2024-05-15 10:41:56 -07:00

ceaa7b23e5 Merge 85263f0568 into dc020985b8

dc020985b8 Avoid unnecessarily disabling CUDA graphs (#7302)

344f9126cc ggml : tag ggml_tensor::backend as deprecated (#7290)

9a17ab914b Add missing " (#7303)

ea3b0590ee embedding : free the batch after execution (#7297)

Compare 54 commits »

happyz synced commits to refs/pull/6958/merge at happyz/llama.cpp from mirror 2024-05-15 10:41:55 -07:00

dcb2e57317 Merge aa6f4c280b into 583fd6b000

583fd6b000 server bench: fix bench not waiting for model load (#7284)

9f773486ab script : sync ggml-rpc

e8a7fd4fb0 metal : support FA without mask + add asserts (#7278)

a5e3fde857 sync : ggml

Compare 11 commits »

happyz synced commits to refs/pull/6919/merge at happyz/llama.cpp from mirror 2024-05-15 10:41:55 -07:00

dc5cfb80e1 Merge f0d7be409d into dc020985b8

dc020985b8 Avoid unnecessarily disabling CUDA graphs (#7302)

344f9126cc ggml : tag ggml_tensor::backend as deprecated (#7290)

9a17ab914b Add missing " (#7303)

ea3b0590ee embedding : free the batch after execution (#7297)

Compare 8 commits »

happyz synced commits to refs/pull/6915/merge at happyz/llama.cpp from mirror 2024-05-15 10:41:55 -07:00

e15cbc01c5 Merge 14c104d166 into dc020985b8

14c104d166 Update ggml.c

f2aabab436 Update ggml.c

dc020985b8 Avoid unnecessarily disabling CUDA graphs (#7302)

344f9126cc ggml : tag ggml_tensor::backend as deprecated (#7290)

Compare 10 commits »

happyz synced commits to refs/pull/7020/head at happyz/llama.cpp from mirror 2024-05-15 10:41:55 -07:00

85263f0568 Minor fixes after merging.

7a5df5fbd5 Merge branch 'ggerganov:master' into snowflake-arctic

f4421f7cd8 convert-hf : Corrected sentencepiece API calls.

583fd6b000 server bench: fix bench not waiting for model load (#7284)

9f773486ab script : sync ggml-rpc

Compare 57 commits »

happyz synced commits to refs/pull/6999/merge at happyz/llama.cpp from mirror 2024-05-15 10:41:55 -07:00

4a640cb235 Merge cb324f48f4 into ea3b0590ee

ea3b0590ee embedding : free the batch after execution (#7297)

29499bb593 sync : ggml

48aa8fd1f2 ggml : add `ggml_upscale_ext` (ggml/814)

583fd6b000 server bench: fix bench not waiting for model load (#7284)

Compare 5 commits »

happyz synced commits to refs/pull/6988/merge at happyz/llama.cpp from mirror 2024-05-15 10:41:55 -07:00

a83b8fb35a Merge a808370c58 into 583fd6b000

583fd6b000 server bench: fix bench not waiting for model load (#7284)

Compare 2 commits »

happyz synced commits to refs/pull/6839/merge at happyz/llama.cpp from mirror 2024-05-15 10:41:54 -07:00

d6412bb63d Merge 49e078f79d into dc020985b8

dc020985b8 Avoid unnecessarily disabling CUDA graphs (#7302)

344f9126cc ggml : tag ggml_tensor::backend as deprecated (#7290)

9a17ab914b Add missing " (#7303)

ea3b0590ee embedding : free the batch after execution (#7297)

Compare 8 commits »

happyz synced commits to refs/pull/6840/merge at happyz/llama.cpp from mirror 2024-05-15 10:41:54 -07:00

4bc45eae97 Merge bb3a5274c7 into dc020985b8

dc020985b8 Avoid unnecessarily disabling CUDA graphs (#7302)

344f9126cc ggml : tag ggml_tensor::backend as deprecated (#7290)

9a17ab914b Add missing " (#7303)

ea3b0590ee embedding : free the batch after execution (#7297)

Compare 8 commits »

happyz synced commits to refs/pull/6866/merge at happyz/llama.cpp from mirror 2024-05-15 10:41:54 -07:00

5032b646c8 Merge d12c57b559 into 583fd6b000

583fd6b000 server bench: fix bench not waiting for model load (#7284)

9f773486ab script : sync ggml-rpc

e8a7fd4fb0 metal : support FA without mask + add asserts (#7278)

a5e3fde857 sync : ggml

Compare 22 commits »

happyz synced commits to refs/pull/6915/head at happyz/llama.cpp from mirror 2024-05-15 10:41:54 -07:00

14c104d166 Update ggml.c

f2aabab436 Update ggml.c

Compare 2 commits »