Jan Wassenberg
e1585ecaf5
Update Highway version to get NEON bf16 fix
https://github.com/google/highway/pull/2598
PiperOrigin-RevId: 774664346
2025-06-23 01:25:01 -07:00
Jan Wassenberg
4f5785b0fd
Update instrumentation for new Highway wall-time profiler
Pass the thread index through and use new zone_id.
PiperOrigin-RevId: 773344242
2025-06-19 07:46:04 -07:00
Jan Wassenberg
d342e4e7d4
Also add CMAKE_CXX_STANDARD in examples' CMake files
PiperOrigin-RevId: 772454497
2025-06-17 06:53:54 -07:00
Jan Wassenberg
627cc04db9
Decouple MatMul from gemma-inl: precompile for all input types
Call MatMulStatic instead of MatMul.
Also fix build error due to Highway's Lanes not being constexpr.
PiperOrigin-RevId: 763777269
2025-05-27 07:08:58 -07:00
Jan Wassenberg
1b72c22345
Refactor Gemma ctor and improve pool NUMA support
Gemma receives a MatMulEnv arg, with comment on lifetime
Split threading into topology so the latter can be used in allocator
Add AllocClasses() for non-POD (ThreadPool)
Support binding pool to NUMA node
Update threading_test with latency measurements
Also update Highway version.
PiperOrigin-RevId: 736904748
2025-03-14 10:19:00 -07:00
Jan Wassenberg
a60b564b88
Infra improvements (2)
ops.h: move CreateInvTimescale to allow calling without depending on gemma
Pass around MatMulEnv instead of pools to avoid re-creating the env
profiler.h can now be used outside SIMD code
allocator: add StepBytes and QuantumSteps
rename worker thread with package/cluster in the name
threading: add Visit* to IndexRange
PiperOrigin-RevId: 718766704
2025-01-23 01:55:19 -08:00
Jan Wassenberg
6a34e9c547
Print cache info and update Highway version for that
PiperOrigin-RevId: 702318451
2024-12-03 06:31:52 -08:00
austinvhuang
72247614bb
fix prefill feedback off-by-1, update fetch commit hash
2024-03-12 15:10:44 -04:00
austinvhuang
60d054e041
move arg definitions out of gemma.h to app.h
2024-03-10 23:49:25 -04:00
austinvhuang
0fc80fad05
libgemma refactor - review changes
2024-03-10 12:55:08 -04:00
austinvhuang
cc5c24c4f8
remove app.h dependency + fix bazel build
2024-03-08 18:06:43 -05:00
austinvhuang
8c7b2cf61b
add README, license to hello_world
2024-03-08 17:59:54 -05:00
austinvhuang
571a5449c4
update commit hash for gemma lib
2024-03-08 17:33:33 -05:00
austinvhuang
03147effbd
update loader arg names: cache -> compressed_weights, model -> weights
2024-03-08 17:32:36 -05:00
austinvhuang
dfd2fdc1dd
Decouple gemma constructor from loader args, update hello_world example, add convenience version of constructor (no uncompressed weights)
2024-03-08 17:26:03 -05:00
austinvhuang
49e654258d
[WIP] clean up hello_world #includes and CMakeLists.txt
2024-03-07 01:04:25 -05:00
austinvhuang
e781007836
[WIP] Remove InferenceArgs from hello_world example, fix ordering of LoaderArgs validation, revert ReplGemma EOT token behavior
2024-03-06 23:21:13 -05:00
austinvhuang
c378ac2c56
[WIP] hello world example working. TODO: refactor interfaces to decouple arguments
2024-03-03 11:36:48 -05:00