HappyZ happyz
happyz synced commits to refs/pull/7669/merge at happyz/llama.cpp from mirror 2024-06-01 18:19:54 -07:00
e141ce624a Fix FlashAttention debug test, FP32 assert (#7684)
2e666832e6 server : new UI (#7633)
2ac95c9d56 SimpleChat: Simple histogram/repeatMatching driven garbageTrimming, Settings UI, Streaming mode, OpenAi Compat (Model, Authorization Bearer), Save/Restore session, Auto Settings UI (#7548)
750f60c03e CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8 (#7681)
Compare 5 commits »
happyz synced commits to refs/pull/7664/merge at happyz/llama.cpp from mirror 2024-06-01 18:19:54 -07:00
e141ce624a Fix FlashAttention debug test, FP32 assert (#7684)
2e666832e6 server : new UI (#7633)
2ac95c9d56 SimpleChat: Simple histogram/repeatMatching driven garbageTrimming, Settings UI, Streaming mode, OpenAi Compat (Model, Authorization Bearer), Save/Restore session, Auto Settings UI (#7548)
750f60c03e CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8 (#7681)
Compare 5 commits »
happyz synced commits to refs/pull/7654/merge at happyz/llama.cpp from mirror 2024-06-01 18:19:54 -07:00
e141ce624a Fix FlashAttention debug test, FP32 assert (#7684)
2e666832e6 server : new UI (#7633)
2ac95c9d56 SimpleChat: Simple histogram/repeatMatching driven garbageTrimming, Settings UI, Streaming mode, OpenAi Compat (Model, Authorization Bearer), Save/Restore session, Auto Settings UI (#7548)
750f60c03e CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8 (#7681)
Compare 5 commits »
happyz synced commits to refs/pull/7651/merge at happyz/llama.cpp from mirror 2024-06-01 18:19:54 -07:00
e141ce624a Fix FlashAttention debug test, FP32 assert (#7684)
2e666832e6 server : new UI (#7633)
2ac95c9d56 SimpleChat: Simple histogram/repeatMatching driven garbageTrimming, Settings UI, Streaming mode, OpenAi Compat (Model, Authorization Bearer), Save/Restore session, Auto Settings UI (#7548)
750f60c03e CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8 (#7681)
Compare 5 commits »
happyz synced commits to refs/pull/7649/merge at happyz/llama.cpp from mirror 2024-06-01 18:19:54 -07:00
e141ce624a Fix FlashAttention debug test, FP32 assert (#7684)
2e666832e6 server : new UI (#7633)
2ac95c9d56 SimpleChat: Simple histogram/repeatMatching driven garbageTrimming, Settings UI, Streaming mode, OpenAi Compat (Model, Authorization Bearer), Save/Restore session, Auto Settings UI (#7548)
750f60c03e CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8 (#7681)
Compare 6 commits »
happyz synced commits to refs/pull/7644/merge at happyz/llama.cpp from mirror 2024-06-01 18:19:53 -07:00
e141ce624a Fix FlashAttention debug test, FP32 assert (#7684)
2e666832e6 server : new UI (#7633)
2ac95c9d56 SimpleChat: Simple histogram/repeatMatching driven garbageTrimming, Settings UI, Streaming mode, OpenAi Compat (Model, Authorization Bearer), Save/Restore session, Auto Settings UI (#7548)
750f60c03e CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8 (#7681)
Compare 6 commits »
happyz synced commits to refs/pull/7642/merge at happyz/llama.cpp from mirror 2024-06-01 18:19:53 -07:00
e141ce624a Fix FlashAttention debug test, FP32 assert (#7684)
2e666832e6 server : new UI (#7633)
2ac95c9d56 SimpleChat: Simple histogram/repeatMatching driven garbageTrimming, Settings UI, Streaming mode, OpenAi Compat (Model, Authorization Bearer), Save/Restore session, Auto Settings UI (#7548)
750f60c03e CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8 (#7681)
Compare 5 commits »
happyz synced commits to refs/pull/7648/merge at happyz/llama.cpp from mirror 2024-06-01 18:19:53 -07:00
750f60c03e CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8 (#7681)
Compare 2 commits »
happyz synced commits to refs/pull/7647/merge at happyz/llama.cpp from mirror 2024-06-01 18:19:53 -07:00
7835cf8a06 Merge 9b156218751e7b601144b5122282045b081b2e5b into e141ce624a
e141ce624a Fix FlashAttention debug test, FP32 assert (#7684)
2e666832e6 server : new UI (#7633)
2ac95c9d56 SimpleChat: Simple histogram/repeatMatching driven garbageTrimming, Settings UI, Streaming mode, OpenAi Compat (Model, Authorization Bearer), Save/Restore session, Auto Settings UI (#7548)
750f60c03e CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8 (#7681)
Compare 6 commits »
happyz synced commits to refs/pull/7628/head at happyz/llama.cpp from mirror 2024-06-01 18:19:52 -07:00
2c3d0b42f3 Fix MUL_MAT_ID matrix vector shader and dispatch code
happyz synced commits to refs/pull/7606/merge at happyz/llama.cpp from mirror 2024-06-01 18:19:52 -07:00
e141ce624a Fix FlashAttention debug test, FP32 assert (#7684)
2e666832e6 server : new UI (#7633)
2ac95c9d56 SimpleChat: Simple histogram/repeatMatching driven garbageTrimming, Settings UI, Streaming mode, OpenAi Compat (Model, Authorization Bearer), Save/Restore session, Auto Settings UI (#7548)
750f60c03e CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8 (#7681)
Compare 5 commits »
happyz synced commits to refs/pull/7634/merge at happyz/llama.cpp from mirror 2024-06-01 18:19:52 -07:00
5270ec3510 Merge 23f52991d0dba767caf939112caf803ca2ba54f3 into e141ce624a
e141ce624a Fix FlashAttention debug test, FP32 assert (#7684)
2e666832e6 server : new UI (#7633)
2ac95c9d56 SimpleChat: Simple histogram/repeatMatching driven garbageTrimming, Settings UI, Streaming mode, OpenAi Compat (Model, Authorization Bearer), Save/Restore session, Auto Settings UI (#7548)
750f60c03e CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8 (#7681)
Compare 5 commits »
happyz synced commits to refs/pull/7640/merge at happyz/llama.cpp from mirror 2024-06-01 18:19:52 -07:00
e141ce624a Fix FlashAttention debug test, FP32 assert (#7684)
2e666832e6 server : new UI (#7633)
2ac95c9d56 SimpleChat: Simple histogram/repeatMatching driven garbageTrimming, Settings UI, Streaming mode, OpenAi Compat (Model, Authorization Bearer), Save/Restore session, Auto Settings UI (#7548)
750f60c03e CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8 (#7681)
Compare 5 commits »
happyz synced commits to refs/pull/7628/merge at happyz/llama.cpp from mirror 2024-06-01 18:19:52 -07:00
e141ce624a Fix FlashAttention debug test, FP32 assert (#7684)
2e666832e6 server : new UI (#7633)
2c3d0b42f3 Fix MUL_MAT_ID matrix vector shader and dispatch code
2ac95c9d56 SimpleChat: Simple histogram/repeatMatching driven garbageTrimming, Settings UI, Streaming mode, OpenAi Compat (Model, Authorization Bearer), Save/Restore session, Auto Settings UI (#7548)
Compare 6 commits »
happyz synced commits to refs/pull/7555/merge at happyz/llama.cpp from mirror 2024-06-01 18:19:51 -07:00
2ac95c9d56 SimpleChat: Simple histogram/repeatMatching driven garbageTrimming, Settings UI, Streaming mode, OpenAi Compat (Model, Authorization Bearer), Save/Restore session, Auto Settings UI (#7548)
750f60c03e CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8 (#7681)
Compare 3 commits »
happyz synced commits to refs/pull/7553/merge at happyz/llama.cpp from mirror 2024-06-01 18:19:51 -07:00
e141ce624a Fix FlashAttention debug test, FP32 assert (#7684)
2e666832e6 server : new UI (#7633)
2ac95c9d56 SimpleChat: Simple histogram/repeatMatching driven garbageTrimming, Settings UI, Streaming mode, OpenAi Compat (Model, Authorization Bearer), Save/Restore session, Auto Settings UI (#7548)
750f60c03e CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8 (#7681)
Compare 5 commits »
happyz synced commits to refs/pull/7599/merge at happyz/llama.cpp from mirror 2024-06-01 18:19:51 -07:00
e141ce624a Fix FlashAttention debug test, FP32 assert (#7684)
2e666832e6 server : new UI (#7633)
2ac95c9d56 SimpleChat: Simple histogram/repeatMatching driven garbageTrimming, Settings UI, Streaming mode, OpenAi Compat (Model, Authorization Bearer), Save/Restore session, Auto Settings UI (#7548)
750f60c03e CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8 (#7681)
Compare 5 commits »
happyz synced commits to refs/pull/7596/merge at happyz/llama.cpp from mirror 2024-06-01 18:19:51 -07:00
e141ce624a Fix FlashAttention debug test, FP32 assert (#7684)
2e666832e6 server : new UI (#7633)
2ac95c9d56 SimpleChat: Simple histogram/repeatMatching driven garbageTrimming, Settings UI, Streaming mode, OpenAi Compat (Model, Authorization Bearer), Save/Restore session, Auto Settings UI (#7548)
750f60c03e CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8 (#7681)
Compare 6 commits »
happyz synced commits to refs/pull/7582/merge at happyz/llama.cpp from mirror 2024-06-01 18:19:51 -07:00
9d6f355a73 Merge 243b5efe0586c1e6fff749fbb52981e01d557bc7 into e141ce624a
e141ce624a Fix FlashAttention debug test, FP32 assert (#7684)
2e666832e6 server : new UI (#7633)
2ac95c9d56 SimpleChat: Simple histogram/repeatMatching driven garbageTrimming, Settings UI, Streaming mode, OpenAi Compat (Model, Authorization Bearer), Save/Restore session, Auto Settings UI (#7548)
750f60c03e CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8 (#7681)
Compare 5 commits »
happyz synced commits to refs/pull/7531/merge at happyz/llama.cpp from mirror 2024-06-01 18:19:50 -07:00
e141ce624a Fix FlashAttention debug test, FP32 assert (#7684)
61200ef29f llama : fix edge case finding batch seq_id of split recurrent cell
2e666832e6 server : new UI (#7633)
18d1c14047 llama : minimize swaps when reordering logits
Compare 53 commits »