gemma.cpp/ops
Jan Wassenberg 2c28b18eb0 Add NestedPools: one per socket/cluster
Use in dot_test
app.h: add new flags and rename num_threads to max_threads
matmul: Parallelize MatMulSlow and enable spinning, more large/fewer medium test cases
PiperOrigin-RevId: 683216386
2024-10-07 09:40:19 -07:00
dot-inl.h             Also enable f64 dot/sum for <f32 inputs                       2024-10-04 07:12:10 -07:00
dot_test.cc           Add NestedPools: one per socket/cluster                       2024-10-07 09:40:19 -07:00
fp_arith-inl.h        Cascaded summation for Softmax                                2024-09-20 10:31:23 -07:00
gemma_matvec_test.cc  Major compression update, arbitrary-len unpack + new Dot      2024-09-10 02:22:19 -07:00
matmul-inl.h          Major compression update, arbitrary-len unpack + new Dot      2024-09-10 02:22:19 -07:00
matmul.h              Major MatMul update, 1.9-2.3x speedup on Zen4 via bf16 mul    2024-08-16 07:52:20 -07:00
matmul_test.cc        Add NestedPools: one per socket/cluster                       2024-10-07 09:40:19 -07:00
matvec-inl.h          Fix include order, required to build with profiler enabled    2024-09-30 07:52:50 -07:00
ops-inl.h             Also enable f64 dot/sum for <f32 inputs                       2024-10-04 07:12:10 -07:00
ops_test.cc           Rename one variable in SampleTopK and update TestSampleTopK.  2024-10-01 00:51:33 -07:00