HappyZ

happyz synced commits to test_643317254 at happyz/gemma.cpp from mirror 2024-06-14 09:23:41 -07:00

happyz synced commits to refs/pull/240/head at happyz/gemma.cpp from mirror 2024-06-14 09:23:41 -07:00

d3c6a45b59 Major duplicated code reduction in test/benchmarks

c15ff9529c Reduce duplication in Config* by inheriting no-SSM

ea525da967 Added MatMul_4x4_Batch which is MatMul_4x4, but with the first template arg moved to the first function arg, so the batch size (num A rows) can be variable at run-time.

1b40619864 Increase parallelism in ops_test

happyz synced new reference test_643317254 to happyz/gemma.cpp from mirror 2024-06-14 09:23:41 -07:00

happyz synced new reference test_643293654 to happyz/gemma.cpp from mirror 2024-06-14 09:23:41 -07:00

happyz synced and deleted reference refs/tags/refs/pull/240/merge at happyz/gemma.cpp from mirror 2024-06-14 09:23:40 -07:00

happyz synced commits to test_643293654 at happyz/gemma.cpp from mirror 2024-06-14 09:23:40 -07:00

happyz synced commits to dev at happyz/gemma.cpp from mirror 2024-06-14 09:23:40 -07:00

2228055bb8 Internal change.

29c0c574e6 Integrate matmul into FFW: 4.3x prefill speedup

198326a682 Removed now redundant non-batch matmul

b17631c95f Implement a missing (bf16, f32) tiled MatMul kernel.

d3c6a45b59 Major duplicated code reduction in test/benchmarks

happyz synced and deleted reference refs/tags/test_642345934 at happyz/gemma.cpp from mirror 2024-06-14 09:23:40 -07:00

happyz synced commits to refs/tags/b3146 at happyz/llama.cpp from mirror 2024-06-14 09:23:37 -07:00

happyz synced new reference refs/tags/b3149 to happyz/llama.cpp from mirror 2024-06-14 09:23:37 -07:00

happyz synced commits to refs/tags/b3149 at happyz/llama.cpp from mirror 2024-06-14 09:23:37 -07:00

happyz synced new reference refs/tags/b3148 to happyz/llama.cpp from mirror 2024-06-14 09:23:37 -07:00

happyz synced commits to refs/tags/b3148 at happyz/llama.cpp from mirror 2024-06-14 09:23:37 -07:00

happyz synced new reference refs/tags/b3147 to happyz/llama.cpp from mirror 2024-06-14 09:23:37 -07:00

happyz synced commits to refs/tags/b3147 at happyz/llama.cpp from mirror 2024-06-14 09:23:37 -07:00

happyz synced new reference refs/tags/b3146 to happyz/llama.cpp from mirror 2024-06-14 09:23:37 -07:00

happyz synced commits to refs/pull/7925/merge at happyz/llama.cpp from mirror 2024-06-14 09:23:36 -07:00

f703134d4b Merge 88cc7d7878 into 66ef1ceedf

66ef1ceedf metal : utilize max shared memory for mul_mat_id (#7935)

e65bbf606c llama-bench : fix RPC indication (#7936)

6fcd1331ef llama : more checks before assuming FIM tokens (#7644)

41b9260f18 convert : add Poro-34B-chat tokenizer support (#7713)

happyz synced commits to refs/pull/7931/head at happyz/llama.cpp from mirror 2024-06-14 09:23:36 -07:00

7a8961fff5 delete redundant

happyz synced commits to refs/pull/7926/merge at happyz/llama.cpp from mirror 2024-06-14 09:23:36 -07:00

f2f3ace478 Merge 7a5d932eaf into 66ef1ceedf

66ef1ceedf metal : utilize max shared memory for mul_mat_id (#7935)

e65bbf606c llama-bench : fix RPC indication (#7936)

6fcd1331ef llama : more checks before assuming FIM tokens (#7644)

41b9260f18 convert : add Poro-34B-chat tokenizer support (#7713)

happyz synced commits to refs/pull/7931/merge at happyz/llama.cpp from mirror 2024-06-14 09:23:36 -07:00

267a96f134 Merge 7a8961fff5 into 66ef1ceedf

66ef1ceedf metal : utilize max shared memory for mul_mat_id (#7935)

e65bbf606c llama-bench : fix RPC indication (#7936)

6fcd1331ef llama : more checks before assuming FIM tokens (#7644)

41b9260f18 convert : add Poro-34B-chat tokenizer support (#7713)
