Commit Graph

  • 8790458276 Merge a4c78d4454 into baa69dfb78 Salman Chishti 2025-12-16 17:50:05 +0000
  • 72ff4b5b82 Merge 001c356b02 into baa69dfb78 copybara-service[bot] 2025-12-16 17:32:36 +0000
  • 001c356b02 Use hn::Sub for vector subtraction in flash attention. test_845310753 Phil Culliton 2025-12-16 09:31:00 -0800
  • 52dcceec6b Merge 85e2e8ae7f into baa69dfb78 copybara-service[bot] 2025-12-16 15:14:14 +0000
  • 85e2e8ae7f Separate monolithic gemma_lib library into more specific cc_library targets. test_845238905 Balazs Racz 2025-12-16 06:22:24 -0800
  • a4c78d4454 Merge branch 'dev' into upgrade-github-actions-node24 Salman Chishti 2025-12-16 14:47:59 +0000
  • b66aa115ac Upgrade GitHub Actions for Node 24 compatibility Salman Muin Kayser Chishti 2025-12-16 14:26:24 +0000
  • baa69dfb78 Makes the entire runtime_config passed into the activations constructor. dev Balazs Racz 2025-12-16 01:56:18 -0800
  • 44dfd69b9b Internal changes Krzysztof Rymski 2025-12-15 07:14:04 -0800
  • f2380025f2 Merge b31e8f98e8 into 0c64987a96 copybara-service[bot] 2025-12-15 11:25:37 +0000
  • 0c64987a96 Abort if args are unrecognized, refactor argument passing Jan Wassenberg 2025-12-15 03:18:11 -0800
  • f50550f4ce Warning fixes (sign mismatch), switch default Jan Wassenberg 2025-12-15 02:40:45 -0800
  • f08a42f18e Merge c783b82a82 into 506fb22be7 copybara-service[bot] 2025-12-14 14:14:38 +0000
  • 506fb22be7 No public description Martin Stolle 2025-12-12 06:36:40 -0800
  • 338cd8a36e Factors out a new cc_library `:query` from `:gemma-lib`. Moves query-related structs/classes to gemma/query.h. Balazs Racz 2025-12-12 02:53:22 -0800
  • 73c3627b67 Add tensor stats and output Jan Wassenberg 2025-12-11 22:51:50 -0800
  • bfc0dfcfca Enable flags= parsing Martin Stolle 2025-12-11 01:17:34 -0800
  • 78deacc357 Make attention configurable on the command line. Martin Stolle 2025-12-10 09:33:21 -0800
  • 2441ff01bf internal change Martin Stolle 2025-12-10 09:00:41 -0800
  • 64178ace38 Internal changes Krzysztof Rymski 2025-12-10 07:54:38 -0800
  • 9689fc82f9 internal change Martin Stolle 2025-12-09 06:16:33 -0800
  • 64d700cab5 Internal changes Krzysztof Rymski 2025-12-09 05:41:31 -0800
  • a8ff2e4175 Make random numbers in FillRandom not depend on shape. Martin Stolle 2025-12-09 04:56:26 -0800
  • 14a9ecf21d Factor out SumHeads Martin Stolle 2025-12-09 02:22:46 -0800
  • 1014ae9e2a Adding a simple test for GemmaAttention Martin Stolle 2025-12-09 02:12:30 -0800
  • 1a12c4d1a6 Merge 61dedf73ed into 5a6895c609 copybara-service[bot] 2025-12-09 09:55:42 +0000
  • 61dedf73ed Internal changes test_841765739 Krzysztof Rymski 2025-12-08 08:00:00 -0800
  • 5a6895c609 Avoid warning when OS affinity limits us to the second socket Jan Wassenberg 2025-12-08 07:09:59 -0800
  • 60b23bcc9e Merge 2b9245ad93 into b510ba2ab2 copybara-service[bot] 2025-12-08 14:45:36 +0000
  • 2b9245ad93 Avoid warning when OS affinity limits us to the second socket Jan Wassenberg 2025-12-08 05:38:26 -0800
  • 7a505bfb16 Merge 57f2664fa7 into b510ba2ab2 copybara-service[bot] 2025-12-05 15:44:55 +0000
  • 57f2664fa7 internal change test_840724686 Martin Stolle 2025-12-05 07:43:55 -0800
  • b510ba2ab2 Improve clarity of indices II Martin Stolle 2025-12-04 06:32:56 -0800
  • 9348048885 Clean up toPtrs to delegate to toPtr Martin Stolle 2025-12-04 06:21:35 -0800
  • 2b4436beb6 Internal changes Krzysztof Rymski 2025-12-04 02:37:21 -0800
  • d2090fddf3 Improve clarity of indices Martin Stolle 2025-12-03 10:10:47 -0800
  • 6d3e2b6f73 Add missing includes. Nitin Gangahar 2025-12-02 23:22:41 -0800
  • a084d33e41 Fix Gemma3 image: ensure A matrix is packed, preallocate Jan Wassenberg 2025-12-01 11:46:47 -0800
  • 1564dd3111 Fix empty enabled_lps in topology detection Jan Wassenberg 2025-12-01 10:23:14 -0800
  • 6e5e4123f1 Internal changes Krzysztof Rymski 2025-11-28 02:36:36 -0800
  • 3c9e6cf113 Expand debug output for topology Jan Wassenberg 2025-11-28 00:19:05 -0800
  • ccb49bc82f Add ToFloatSlow, move RandomFloat to test_util Jan Wassenberg 2025-11-27 00:14:19 -0800
  • 3bc8da8d7b Merge b959ea1a22 into c153d5255b copybara-service[bot] 2025-11-27 08:04:31 +0000
  • b959ea1a22 Add ToFloatSlow, move RandomFloat to test_util Jan Wassenberg 2025-11-25 20:24:05 -0800
  • b31e8f98e8 Internal changes test_836654012 Krzysztof Rymski 2025-11-25 07:05:19 -0800
  • c153d5255b Internal changes Krzysztof Rymski 2025-11-26 01:05:06 -0800
  • 8696f6dd17 Clarify indices Martin Stolle 2025-11-24 08:27:23 -0800
  • 37a25c9ffe Fix warning (signed vs unsigned) Jan Wassenberg 2025-11-24 00:50:40 -0800
  • 0e5f4cbf1b Implement Continuous Batching. Charles Zhao 2025-11-23 23:53:28 -0800
  • 670281d31e Merge 15f503e181 into 88a03b7ec4 copybara-service[bot] 2025-11-24 12:16:21 +0530
  • 2741800aee Merge fc70d7cb0e into 88a03b7ec4 copybara-service[bot] 2025-11-24 12:16:07 +0530
  • 1faf6513e1 Merge be30473dc6 into 88a03b7ec4 copybara-service[bot] 2025-11-24 12:15:54 +0530
  • 88a03b7ec4 Added access to softmax attention internals to regular attention Martin Stolle 2025-11-21 09:00:23 -0800
  • e8b4aaf0b0 Merge 210ebab346 into 5a500872b8 copybara-service[bot] 2025-11-21 16:50:38 +0000
  • 210ebab346 Added access to softmax attention internals to regular attention Martin Stolle 2025-11-17 08:35:09 -0800
  • d6504d12a2 Internal changes Krzysztof Rymski 2025-11-21 04:08:32 -0800
  • fc70d7cb0e Internal changes test_835159997 Krzysztof Rymski 2025-11-21 03:59:31 -0800
  • 15f503e181 Internal changes test_835158854 Krzysztof Rymski 2025-11-21 03:53:24 -0800
  • be30473dc6 Internal changes test_835160876 Krzysztof Rymski 2025-11-21 04:02:18 -0800
  • 5a500872b8 Internal change Martin Stolle 2025-11-21 01:17:06 -0800
  • 49d420aeaf Add some comments. Martin Stolle 2025-11-19 01:08:38 -0800
  • b8f6be72b1 Improves autodetection of Gemma3-1B. The gemma.cpp Authors 2025-11-17 01:12:07 -0800
  • 7c1656f2fc Fix NibbleCodec for AVX3_{ZEN4,DL,SPR} The gemma.cpp Authors 2025-11-11 11:30:45 -0800
  • 3e18db17f4 Avoid hard-coding kPatchSize. Thanks @Somet2mes for reporting. Fixes #762. Jan Wassenberg 2025-11-07 00:31:59 -0800
  • f8131339a7 Refactor for continuous batching. This cl does not change the current behavior of the code. It only extracts two functions that will later be called for adding continuous batching. Charles Zhao 2025-11-06 14:19:38 -0800
  • 35e9f9f05f Introduce attention implementation configurability. Martin Stolle 2025-11-06 08:43:03 -0800
  • 091b4567c9 Minor: ParallelismStrategy->Parallelism Jan Wassenberg 2025-11-06 06:55:37 -0800
  • a344a70c59 Change (old) attention behavior to disallow wraparound, enforced via assertion. Shared kU64PerLine constant Jan Wassenberg 2025-11-04 11:52:07 -0800
  • 3a63a12624 Allow prefill only run by allowing max_prompt_size == seq_len Charles Zhao 2025-11-03 03:17:22 -0800
  • ab87807a4c Pre-compress query activations to BF16 before FlashAttention. Phil Culliton 2025-10-31 09:49:07 -0700
  • 48f4fd9cca Merge 5243cc31f1 into 8a100c1e8d copybara-service[bot] 2025-10-31 16:46:26 +0000
  • 5243cc31f1 Internal change. test_825613752 Phil Culliton 2025-10-29 10:42:21 -0700
  • 8a100c1e8d Added access to flash attention internals to TileFlashAttention4 Ray Smith 2025-10-30 06:49:33 -0700
  • ee7d79c0a6 Add Decompress2AndCompressInplace helper Jan Wassenberg 2025-10-30 04:04:08 -0700
  • 006999063c Fix PaliGemma matmul warning Jan Wassenberg 2025-10-29 11:13:20 -0700
  • ecab0cef3a Update README with Gemma 3 support and contributor acknowledgments Phil Culliton 2025-10-29 09:46:23 -0700
  • 036f91f63c Add Gemma 3 270M to gemma_test Phil Culliton 2025-10-29 09:30:46 -0700
  • 116cd6eff6 BF16 mixed-mode flash attention Phil Culliton 2025-10-29 01:47:59 -0700
  • 4bd465ffd3 Also update attention.h to type-erased query_norm_scale Jan Wassenberg 2025-10-28 06:47:56 -0700
  • 3cc0139ebb Fix excessive KC/MC from prior change Jan Wassenberg 2025-10-28 05:32:30 -0700
  • 877abf4da5 Merge 267dbe00cb into 5a05857deb copybara-service[bot] 2025-10-28 04:22:08 +0000
  • 267dbe00cb Fixes to activations and tensor params test_824820179 Phil Culliton 2025-10-27 21:19:09 -0700
  • 5a05857deb [Gemma.cpp] Allows non-owned arguments for attention methods. Biruk Mammo 2025-10-27 10:42:46 -0700
  • 86200ce224 1.01x speedup: improved autotune Jan Wassenberg 2025-10-27 05:34:58 -0700
  • 8198e7104a Batch bench: 4 runs to give autotuning more time Jan Wassenberg 2025-10-24 09:14:07 -0700
  • 1bdde1af3c Add config flag for global timescale & rely on config to deduce wrapping Theotime Combes 2025-10-24 06:54:19 -0700
  • a48e614f64 1.02x speedup: improve load balance and simplify parallelFor Jan Wassenberg 2025-10-24 00:17:45 -0700
  • 085a34965a Update README since backprop and Adam optimizer has been deleted. Nitin Gangahar 2025-10-24 00:17:32 -0700
  • 3ed403e287 Major cleanup of profiler zones, add Caller annotation for all pool.Run main Jan Wassenberg 2025-10-23 01:53:50 -0700
  • 54837f42b5 Merge 2d0b2c8129 into 9e8ac7e2f0 copybara-service[bot] 2025-10-22 19:48:20 +0200
  • 9e8ac7e2f0 Use correct offsets in BlobWriter. Nitin Gangahar 2025-10-22 10:28:30 -0700
  • 2d0b2c8129 Internal change. test_822643310 Nitin Gangahar 2025-10-22 10:21:15 -0700
  • 64a82ed645 Merge pull request #735 from Hitesh-ed:gemma.cpp-windows-build-fix Copybara-Service 2025-10-22 06:26:29 -0700
  • 027288b5e4 Merge branch 'dev' into gemma.cpp-windows-build-fix Hitesh K V 2025-10-22 16:53:48 +0530
  • acede9d682 Warning fix (unused var), Windows build fix (missing member variable) Jan Wassenberg 2025-10-21 10:17:06 -0700
  • c55120fc6d Merge branch 'dev' into gemma.cpp-windows-build-fix Hitesh K V 2025-10-16 20:18:09 +0530
  • f59eb2ed72 Remove multi-package support from topology Jan Wassenberg 2025-10-16 04:00:06 -0700
  • cc1d256cff Update CMakePresets.json Hitesh K V 2025-10-16 12:08:29 +0530
  • 9b6ed1a58f gemma_batch_bench: generate more unique prompts Jan Wassenberg 2025-10-15 15:45:27 -0700
  • 503aaddd65 Add 8-bit integer quantization (I8Stream) to Gemma.cpp. Phil Culliton 2025-10-15 09:24:38 -0700