Commit Graph

  • 60eed010ba Internal changes dev Krzysztof Rymski 2026-01-29 04:47:35 -0800
  • ca6d5a88dd build: update CMake paths for io relocation Olamiposi Otesile 2026-01-13 23:55:35 +0100
  • c3c1ed7f00 fix: update header include paths in C++ files Olamiposi Otesile 2026-01-13 23:33:02 +0100
  • 4c56598b74 ci: update bazel build target to gemma_main Olamiposi Otesile 2026-01-13 23:22:13 +0100
  • ec105435bd fix: a global update of all io paths to gemma/io Olamiposi Otesile 2026-01-13 18:55:07 +0100
  • 1c5f712672 update all project references to the new gemma/io path Olamiposi Otesile 2026-01-13 18:50:36 +0100
  • a9ab913196 move io folder inside gemma directory Olamiposi Otesile 2026-01-13 18:34:29 +0100
  • 8d3682d1d3 finalize io path migration Olamiposi Otesile 2026-01-13 17:32:03 +0100
  • a0bb7b5527 Merge branch 'dev' into main Ola Otesile 2026-01-12 06:18:51 -0800
  • b99790450c Restored original filenames. kept BlobReader to BlobFinder class rename Olamiposi Otesile 2026-01-06 22:04:30 +0100
  • 16a7ba2d6e Internal changes Krzysztof Rymski 2026-01-09 06:35:05 -0800
  • 6d43d6ee19 Build fix for Arm SVE (invalid template arg on op) Jan Wassenberg 2026-01-09 02:55:28 -0800
  • 95592a574e Build fix for Arm SVE (explicit namespace qualification) The gemma.cpp Authors 2026-01-08 13:29:15 -0800
  • 42e9cf557d Internal change / remove unused PrintSpeed Jan Wassenberg 2026-01-08 05:25:54 -0800
  • 384c390181 Allow overriding hardcoded max_seq_len by cmdline argument seq_len. Balazs Racz 2026-01-08 04:28:32 -0800
  • aeade052c6 Move AssertClose to test_util, add U16 Jan Wassenberg 2026-01-07 10:32:44 -0800
  • 2ee1fac74c Internal changes Krzysztof Rymski 2026-01-07 01:21:02 -0800
  • 5579abb4e6 Merge remote-tracking branch 'upstream/dev' Olamiposi Otesile 2026-01-02 22:35:58 +0100
  • 733bbddb7a Refactor: Rename BlobReader to BlobFinder Olamiposi Otesile 2025-12-26 13:48:49 +0100
  • 1605925d1e Add int8 quantization stats Jan Wassenberg 2025-12-19 12:42:29 -0800
  • 11aa16a13d Merge pull request #810 from salmanmkc:upgrade-github-actions-node24 Copybara-Service 2025-12-19 05:27:14 -0800
  • 08a0760271 Internal changes Krzysztof Rymski 2025-12-19 03:42:36 -0800
  • b73a9ede8f Internal changes Krzysztof Rymski 2025-12-19 02:45:52 -0800
  • 0ac55f71ed Avoid using Row() for unaligned storage. Balazs Racz 2025-12-18 05:10:21 -0800
  • 6661d3a60c Internal changes Krzysztof Rymski 2025-12-18 01:26:09 -0800
  • 142e6a7e9c No public description Liam Miller-Cushon 2025-12-17 20:10:24 -0800
  • b8a409dbba Use hn::Sub for vector subtraction in flash attention. Phil Culliton 2025-12-17 12:57:01 -0800
  • 596bdfe5af Separate monolithic gemma_lib library into more specific cc_library targets. Balazs Racz 2025-12-17 03:30:34 -0800
  • a4c78d4454 Merge branch 'dev' into upgrade-github-actions-node24 Salman Chishti 2025-12-16 14:47:59 +0000
  • b66aa115ac Upgrade GitHub Actions for Node 24 compatibility Salman Muin Kayser Chishti 2025-12-16 14:26:24 +0000
  • baa69dfb78 Makes the entire runtime_config passed into the activations constructor. Balazs Racz 2025-12-16 01:56:18 -0800
  • 44dfd69b9b Internal changes Krzysztof Rymski 2025-12-15 07:14:04 -0800
  • 0c64987a96 Abort if args are unrecognized, refactor argument passing Jan Wassenberg 2025-12-15 03:18:11 -0800
  • f50550f4ce Warning fixes (sign mismatch), switch default Jan Wassenberg 2025-12-15 02:40:45 -0800
  • 506fb22be7 No public description Martin Stolle 2025-12-12 06:36:40 -0800
  • 338cd8a36e Factors out a new cc_library `:query` from `:gemma-lib`. Moves query-related structs/classes to gemma/query.h. Balazs Racz 2025-12-12 02:53:22 -0800
  • 73c3627b67 Add tensor stats and output Jan Wassenberg 2025-12-11 22:51:50 -0800
  • bfc0dfcfca Enable flags= parsing Martin Stolle 2025-12-11 01:17:34 -0800
  • 78deacc357 Make attention configurable on the command line. Martin Stolle 2025-12-10 09:33:21 -0800
  • 2441ff01bf internal change Martin Stolle 2025-12-10 09:00:41 -0800
  • 64178ace38 Internal changes Krzysztof Rymski 2025-12-10 07:54:38 -0800
  • 9689fc82f9 internal change Martin Stolle 2025-12-09 06:16:33 -0800
  • 64d700cab5 Internal changes Krzysztof Rymski 2025-12-09 05:41:31 -0800
  • a8ff2e4175 Make random numbers in FillRandom not depend on shape. Martin Stolle 2025-12-09 04:56:26 -0800
  • 14a9ecf21d Factor out SumHeads Martin Stolle 2025-12-09 02:22:46 -0800
  • 1014ae9e2a Adding a simple test for GemmaAttention Martin Stolle 2025-12-09 02:12:30 -0800
  • 61dedf73ed Internal changes test_841765739 Krzysztof Rymski 2025-12-08 08:00:00 -0800
  • 5a6895c609 Avoid warning when OS affinity limits us to the second socket Jan Wassenberg 2025-12-08 07:09:59 -0800
  • 60b23bcc9e Merge 2b9245ad93 into b510ba2ab2 copybara-service[bot] 2025-12-08 14:45:36 +0000
  • 2b9245ad93 Avoid warning when OS affinity limits us to the second socket Jan Wassenberg 2025-12-08 05:38:26 -0800
  • 57f2664fa7 internal change test_840724686 Martin Stolle 2025-12-05 07:43:55 -0800
  • b510ba2ab2 Improve clarity of indices II Martin Stolle 2025-12-04 06:32:56 -0800
  • 9348048885 Clean up toPtrs to delegate to toPtr Martin Stolle 2025-12-04 06:21:35 -0800
  • 2b4436beb6 Internal changes Krzysztof Rymski 2025-12-04 02:37:21 -0800
  • d2090fddf3 Improve clarity of indices Martin Stolle 2025-12-03 10:10:47 -0800
  • 6d3e2b6f73 Add missing includes. Nitin Gangahar 2025-12-02 23:22:41 -0800
  • a084d33e41 Fix Gemma3 image: ensure A matrix is packed, preallocate Jan Wassenberg 2025-12-01 11:46:47 -0800
  • 1564dd3111 Fix empty enabled_lps in topology detection Jan Wassenberg 2025-12-01 10:23:14 -0800
  • 6e5e4123f1 Internal changes Krzysztof Rymski 2025-11-28 02:36:36 -0800
  • 3c9e6cf113 Expand debug output for topology Jan Wassenberg 2025-11-28 00:19:05 -0800
  • ccb49bc82f Add ToFloatSlow, move RandomFloat to test_util Jan Wassenberg 2025-11-27 00:14:19 -0800
  • 3bc8da8d7b Merge b959ea1a22 into c153d5255b copybara-service[bot] 2025-11-27 08:04:31 +0000
  • b959ea1a22 Add ToFloatSlow, move RandomFloat to test_util Jan Wassenberg 2025-11-25 20:24:05 -0800
  • b31e8f98e8 Internal changes test_836654012 Krzysztof Rymski 2025-11-25 07:05:19 -0800
  • c153d5255b Internal changes Krzysztof Rymski 2025-11-26 01:05:06 -0800
  • 8696f6dd17 Clarify indices Martin Stolle 2025-11-24 08:27:23 -0800
  • 37a25c9ffe Fix warning (signed vs unsigned) Jan Wassenberg 2025-11-24 00:50:40 -0800
  • 0e5f4cbf1b Implement Continus Batching. Charles Zhao 2025-11-23 23:53:28 -0800
  • 88a03b7ec4 Added access to softmax attention internals to regular attention Martin Stolle 2025-11-21 09:00:23 -0800
  • e8b4aaf0b0 Merge 210ebab346 into 5a500872b8 copybara-service[bot] 2025-11-21 16:50:38 +0000
  • 210ebab346 Added access to softmax attention internals to regular attention Martin Stolle 2025-11-17 08:35:09 -0800
  • d6504d12a2 Internal changes Krzysztof Rymski 2025-11-21 04:08:32 -0800
  • fc70d7cb0e Internal changes Krzysztof Rymski 2025-11-21 03:59:31 -0800
  • 15f503e181 Internal changes Krzysztof Rymski 2025-11-21 03:53:24 -0800
  • be30473dc6 Internal changes Krzysztof Rymski 2025-11-21 04:02:18 -0800
  • 5a500872b8 Internal change Martin Stolle 2025-11-21 01:17:06 -0800
  • 49d420aeaf Add some comments. Martin Stolle 2025-11-19 01:08:38 -0800
  • b8f6be72b1 Improves autodetection of Gemma3-1B. The gemma.cpp Authors 2025-11-17 01:12:07 -0800
  • 7c1656f2fc Fix NibbleCodec for AVX3_{ZEN4,DL,SPR} The gemma.cpp Authors 2025-11-11 11:30:45 -0800
  • 3e18db17f4 Avoid hard-coding kPatchSize. Thanks @Somet2mes for reporting. Fixes #762. Jan Wassenberg 2025-11-07 00:31:59 -0800
  • f8131339a7 Refactor for continous batching. This cl does not change the current behavior of the code. It only extract two functions that will later be called for adding continuous batching. Charles Zhao 2025-11-06 14:19:38 -0800
  • 35e9f9f05f Introduce attention implementation configurability. Martin Stolle 2025-11-06 08:43:03 -0800
  • 091b4567c9 Minor: ParallelismStrategy->Parallelism Jan Wassenberg 2025-11-06 06:55:37 -0800
  • a344a70c59 Change (old) attention behavior to disallow wraparound, enforced via assertion. Shared kU64PerLine constant Jan Wassenberg 2025-11-04 11:52:07 -0800
  • 3a63a12624 Allow prefill only run by allowing max_prompt_size == seq_len Charles Zhao 2025-11-03 03:17:22 -0800
  • ab87807a4c Pre-compress query activations to BF16 before FlashAttention. Phil Culliton 2025-10-31 09:49:07 -0700
  • 5243cc31f1 Internal change. Phil Culliton 2025-10-29 10:42:21 -0700
  • 8a100c1e8d Added access to flash attention internals to TileFlashAttention4 Ray Smith 2025-10-30 06:49:33 -0700
  • ee7d79c0a6 Add Decompress2AndCompressInplace helper Jan Wassenberg 2025-10-30 04:04:08 -0700
  • 006999063c Fix PaliGemma matmul warning Jan Wassenberg 2025-10-29 11:13:20 -0700
  • ecab0cef3a Update README with Gemma 3 support and contributor acknowledgments Phil Culliton 2025-10-29 09:46:23 -0700
  • 036f91f63c Add Gemma 3 270M to gemma_test Phil Culliton 2025-10-29 09:30:46 -0700
  • 116cd6eff6 BF16 mixed-mode flash attention Phil Culliton 2025-10-29 01:47:59 -0700
  • 4bd465ffd3 Also update attention.h to type-erased query_norm_scale Jan Wassenberg 2025-10-28 06:47:56 -0700
  • 3cc0139ebb Fix excessive KC/MC from prior change Jan Wassenberg 2025-10-28 05:32:30 -0700
  • 877abf4da5 Merge 267dbe00cb into 5a05857deb copybara-service[bot] 2025-10-28 04:22:08 +0000
  • 267dbe00cb Fixes to activations and tensor params test_824820179 Phil Culliton 2025-10-27 21:19:09 -0700
  • 5a05857deb [Gemma.cpp] Allows non-owned arguments for attention methods. Biruk Mammo 2025-10-27 10:42:46 -0700
  • 86200ce224 1.01x speedup: improved autotune Jan Wassenberg 2025-10-27 05:34:58 -0700
  • 8198e7104a Batch bench: 4 runs to give autotuning more time Jan Wassenberg 2025-10-24 09:14:07 -0700