HappyZ happyz
happyz synced commits to test_643317254 at happyz/gemma.cpp from mirror 2024-06-14 09:23:41 -07:00
happyz synced commits to refs/pull/240/head at happyz/gemma.cpp from mirror 2024-06-14 09:23:41 -07:00
d3c6a45b59 Major duplicated code reduction in test/benchmarks
c15ff9529c Reduce duplication in Config* by inheriting no-SSM
ea525da967 Added MatMul_4x4_Batch which is MatMul_4x4, but with the first template arg moved to the first function arg, so the batch size (num A rows) can be variable at run-time.
1b40619864 Increase parallelism in ops_test
happyz synced new reference test_643317254 to happyz/gemma.cpp from mirror 2024-06-14 09:23:41 -07:00
happyz synced new reference test_643293654 to happyz/gemma.cpp from mirror 2024-06-14 09:23:41 -07:00
happyz synced and deleted reference refs/tags/refs/pull/240/merge at happyz/gemma.cpp from mirror 2024-06-14 09:23:40 -07:00
happyz synced commits to test_643293654 at happyz/gemma.cpp from mirror 2024-06-14 09:23:40 -07:00
happyz synced commits to dev at happyz/gemma.cpp from mirror 2024-06-14 09:23:40 -07:00
2228055bb8 Internal change.
29c0c574e6 Integrate matmul into FFW: 4.3x prefill speedup
198326a682 Removed now redundant non-batch matmul
b17631c95f Implement a missing (bf16, f32) tiled MatMul kernel.
d3c6a45b59 Major duplicated code reduction in test/benchmarks
happyz synced and deleted reference refs/tags/test_642345934 at happyz/gemma.cpp from mirror 2024-06-14 09:23:40 -07:00
happyz synced commits to refs/tags/b3146 at happyz/llama.cpp from mirror 2024-06-14 09:23:37 -07:00
happyz synced new reference refs/tags/b3149 to happyz/llama.cpp from mirror 2024-06-14 09:23:37 -07:00
happyz synced commits to refs/tags/b3149 at happyz/llama.cpp from mirror 2024-06-14 09:23:37 -07:00
happyz synced new reference refs/tags/b3148 to happyz/llama.cpp from mirror 2024-06-14 09:23:37 -07:00
happyz synced commits to refs/tags/b3148 at happyz/llama.cpp from mirror 2024-06-14 09:23:37 -07:00
happyz synced new reference refs/tags/b3147 to happyz/llama.cpp from mirror 2024-06-14 09:23:37 -07:00
happyz synced commits to refs/tags/b3147 at happyz/llama.cpp from mirror 2024-06-14 09:23:37 -07:00
happyz synced new reference refs/tags/b3146 to happyz/llama.cpp from mirror 2024-06-14 09:23:37 -07:00
happyz synced commits to refs/pull/7925/merge at happyz/llama.cpp from mirror 2024-06-14 09:23:36 -07:00
66ef1ceedf metal : utilize max shared memory for mul_mat_id (#7935)
e65bbf606c llama-bench : fix RPC indication (#7936)
6fcd1331ef llama : more checks before assuming FIM tokens (#7644)
41b9260f18 convert : add Poro-34B-chat tokenizer support (#7713)
happyz synced commits to refs/pull/7931/head at happyz/llama.cpp from mirror 2024-06-14 09:23:36 -07:00
7a8961fff5 delete redundant
happyz synced commits to refs/pull/7926/merge at happyz/llama.cpp from mirror 2024-06-14 09:23:36 -07:00
66ef1ceedf metal : utilize max shared memory for mul_mat_id (#7935)
e65bbf606c llama-bench : fix RPC indication (#7936)
6fcd1331ef llama : more checks before assuming FIM tokens (#7644)
41b9260f18 convert : add Poro-34B-chat tokenizer support (#7713)
happyz synced commits to refs/pull/7931/merge at happyz/llama.cpp from mirror 2024-06-14 09:23:36 -07:00
66ef1ceedf metal : utilize max shared memory for mul_mat_id (#7935)
e65bbf606c llama-bench : fix RPC indication (#7936)
6fcd1331ef llama : more checks before assuming FIM tokens (#7644)
41b9260f18 convert : add Poro-34B-chat tokenizer support (#7713)