mirror of https://github.com/google/gemma.cpp.git
Optimizations
- Better load-balancing in attention threading (Previously, clusters were limited by #heads)
- Add MulByConstTo to avoid zero-init
- Parallel activations

Cleanup
- Prepare for RowPtr in A or B
- Pass through thread_id to ops
- Avoid warning in bench_matmul

PiperOrigin-RevId: 773723423

| File |
|---|
| allocator.cc |
| allocator.h |
| args.h |
| basics.h |
| mat.cc |
| mat.h |
| test_util.h |
| threading.cc |
| threading.h |
| threading_context.cc |
| threading_context.h |
| threading_test.cc |
| topology.cc |
| topology.h |