mirror of https://github.com/google/gemma.cpp.git
Commit Graph
-
cc1d256cff
Update CMakePresets.json
Hitesh K V
2025-10-16 12:08:29 +0530 -
9b6ed1a58f
gemma_batch_bench: generate more unique prompts
Jan Wassenberg
2025-10-15 15:45:27 -0700 -
503aaddd65
Add 8-bit integer quantization (I8Stream) to Gemma.cpp.
Phil Culliton
2025-10-15 09:24:38 -0700 -
ee18916abf
Removed the PROFILER_ZONE from the most highly called functions to reduce the overhead.
Ray Smith
2025-10-15 07:09:32 -0700 -
e3e8511e79
Initialization of profiler zones.
Ray Smith
2025-10-15 03:05:30 -0700 -
fb6fa793f4
Added a global (to gemma) zones list so that most PROFILER_ZONE3 call sites avoid the synchronization required for the static const initialization of the zone handle. Improved flash_attention to enable profiling using the new zones.
Ray Smith
2025-10-14 08:30:23 -0700 -
3e9bb7df80
Update README.md
Hitesh K V
2025-10-10 11:33:09 +0530 -
035273c184
tune pool kSpin mode in threading_context
Jan Wassenberg
2025-10-07 08:35:44 -0700 -
9dc802c7aa
Add logging to io.cc on failed write and read.
Nitin Gangahar
2025-10-06 10:25:07 -0700 -
684a0444e9
Reduced parallelism for TransposeQ, making each thread read and write within its own cache lines
Ray Smith
2025-10-02 08:14:37 -0700 -
277f396710
Reduced parallelism for TransposeQ, making each thread read and write within its own cache lines
Ray Smith
2025-10-02 05:00:19 -0700 -
14244664c8
Avoid transposing Q when it isn't needed
Ray Smith
2025-10-02 05:16:03 -0700 -
fe5a39990e
Improve FlashAttention threading:
Jan Wassenberg
2025-10-02 02:36:29 -0700 -
6098a022b3
Increased parallelism for RMSNormAndPositionalEncoding
Ray Smith
2025-10-01 07:10:40 -0700 -
2f6cbde8ff
Added a smaller tile size to flash attention for smaller batch sizes
Ray Smith
2025-09-30 05:48:50 -0700 -
4974f24832
Fixed bug with softcap in single flash attention
Ray Smith
2025-09-30 02:17:18 -0700 -
16536996d1
Remove less useful spammy log lines.
Nitin Gangahar
2025-09-29 02:28:04 -0700 -
667a3f117a
Utilize multiple cores to read weight batches.
Nitin Gangahar
2025-09-26 11:27:56 -0700 -
d15731d201
Used hn::BroadcastLane instead of Set(..., x.raw)
Ray Smith
2025-09-25 09:41:30 -0700 -
4f0c633248
(1) Added QueryResultAndMetrics and BatchQueryModelWithMetrics to also return TimingInfo besides query results.
Charles Zhao
2025-09-23 17:01:56 -0700 -
fac8aac4cb
Internal change
Jan Wassenberg
2025-09-22 05:36:32 -0700 -
501fdf000e
Remove no longer used MatVec
Jan Wassenberg
2025-09-19 09:02:44 -0700 -
b603425bf3
Fix batch inference: dangling reference
Jan Wassenberg
2025-09-16 08:01:21 -0700 -
f3bc1c17da
1.03x speedup: fused FFN
Jan Wassenberg
2025-09-15 10:25:59 -0700 -
59db30e209
Add const restriction for benchmark_helper.cc and paligemma_helper.cc to remove a few unnecessary copies.
Charles Zhao
2025-09-14 16:26:55 -0700 -
c9b8479f7d
Added zero-initialization to att_out. Re-enabled flash attention when HWY_NATIVE_DOT_BF16 is not available.
Ray Smith
2025-09-12 07:47:36 -0700 -
2695aab5d2
Temporarily disable flash pending msan fix
Jan Wassenberg
2025-09-10 07:25:07 -0700 -
ba6131311a
Fix gemma_batch_bench for flash attention
Jan Wassenberg
2025-09-10 05:32:03 -0700 -
9457258330
Refactor MatMul to accept views in the kernel functions
Jan Wassenberg
2025-09-09 22:09:09 -0700 -
f10ac41a20
Added flash attention, with both a single-q function, and a register-tiled function. The register-tiled version achieves a speed-up by a factor of about 9.7 over the previous attention function on an AVX3-enabled machine.
Ray Smith
2025-09-09 08:04:45 -0700 -
24b1760f03
Refactor: move Worker to ThreadingContext, factor out MMDecompress
Jan Wassenberg
2025-09-09 07:55:39 -0700 -
461a9c7d1b
Matmul refactoring towards fusion
Jan Wassenberg
2025-09-09 07:13:03 -0700 -
34ceee6c30
Update MatMul comments, removing mention of partial.
Jan Wassenberg
2025-09-09 05:56:57 -0700 -
a5ab99e4ba
Memory use reduction: smaller/single MMStorage
Jan Wassenberg
2025-09-09 05:32:20 -0700 -
06e5da1e22
Cleanup: split CacheInfo from Allocator, MatMul helper functions
Jan Wassenberg
2025-09-08 02:23:29 -0700 -
6e52a835c6
Faster startup on tsan: use hierarchical parallelism for BF16 conversion
Jan Wassenberg
2025-09-07 22:50:01 -0700 -
cbe24eac51
1.15x speedup: parallel sampling, enabled by new RNG
Jan Wassenberg
2025-09-05 07:23:33 -0700 -
ad7d7a2713
Further adjust dot_test threshold (numerics)
Jan Wassenberg
2025-09-05 05:49:35 -0700 -
2b4c16e243
Remove Griffin support
Jan Wassenberg
2025-09-05 02:34:54 -0700 -
56186193c1
Replace mt19937 with new generator to enable parallel sampling
Jan Wassenberg
2025-09-04 23:48:37 -0700 -
5d1693e806
Internal change
Jan Wassenberg
2025-09-04 10:30:42 -0700 -
afd82376a5
Add AES-CTR RNG for parallel sampling (not yet used)
Jan Wassenberg
2025-09-04 05:58:08 -0700 -
4be4799727
Remove kMaxPackages and per-package-related code
Jan Wassenberg
2025-09-04 03:32:35 -0700 -
7263ab8445
MatMul simplification, threading strategy improvements
Jan Wassenberg
2025-09-03 21:44:39 -0700 -
74ffe079c4
Create separate MMStorage objects per cluster.
Marie White
2025-09-03 09:35:13 -0700 -
c783b82a82
Internal change
Phil Culliton
2025-09-03 08:35:20 -0700 -
b7b3d353db
Simplify MatMul: remove F32 special case (build time)
Jan Wassenberg
2025-09-02 04:28:49 -0700 -
1e3c853e80
Add ParallelFor wrapper function and one new mode
Jan Wassenberg
2025-09-02 01:39:28 -0700 -
3737224132
Add in-cluster parallel policy. Update policy to include cluster_idx.
Marie White
2025-09-02 00:14:05 -0700 -
27cb8e12d9
Handle non-threading parallel policy.
Marie White
2025-09-02 00:02:18 -0700 -
0d2e74d74a
Add MMOptions as an argument to Matmul.
Marie White
2025-09-01 23:46:07 -0700 -
229bd078a1
1.29x speedup: bf16 C1/C2. Extend most ops to any type, expand test coverage.
Jan Wassenberg
2025-09-01 06:32:24 -0700 -
bc0c0bac8b
Add non-threading parallel policy.
Marie White
2025-08-29 08:38:19 -0700 -
00b70f69c5
Include parallelism type in DoMatMul. Also remove package handling.
Marie White
2025-08-29 08:04:05 -0700 -
0ae8646731
Fix remainder handling for Paligemma
Jan Wassenberg
2025-08-29 07:25:14 -0700 -
973e284ed6
Refactor Matmul to use a policy class for parallelization.
Marie White
2025-08-29 05:40:06 -0700 -
6c39a2dea4
1.01x speedup: More bf16 activations to reduce DecompressA.
Jan Wassenberg
2025-08-29 03:18:28 -0700 -
7288891439
Remove F64 partial storage in matmul.
Jan Wassenberg
2025-08-29 00:11:31 -0700 -
31c09cca4c
f32 LoopKC: 1.37x (M=512), 1.19x (M=128) single-K F32,BF16 matmul speedup on SKX
Jan Wassenberg
2025-08-28 08:55:15 -0700 -
98ddc166db
Expand ThreadingContext comments
Jan Wassenberg
2025-08-28 08:31:25 -0700 -
6128e758ff
Change ffw_out from B16 to F32.
Marie White
2025-08-28 00:01:01 -0700 -
85cc51795c
Internal change.
The gemma.cpp Authors
2025-08-26 08:07:23 -0700 -
5411fd846d
Minor: batched NotifyGenerate, fix comment/dep
Jan Wassenberg
2025-08-26 23:32:43 -0700 -
86afd53076
1.04x speedup: Parallelize SoftCap
Jan Wassenberg
2025-08-26 11:54:48 -0700 -
ed2f0bd1b0
Fix pos assertions, refs #665
Jan Wassenberg
2025-08-26 04:50:06 -0700 -
9bf0fe4e37
Internal change
Jan Wassenberg
2025-08-26 04:43:26 -0700 -
d3a5ddf657
Merge pull request #663 from junjihashimoto:feature/api-server
Jan Wassenberg
2025-08-24 11:57:05 +0200 -
73f1140dca
Fix an off-by-one error after StreamAndUpdateEOS() to remove the MSAN warning about reading an uninitialized variable in the kv_cache.
Rhett Stucki
2025-08-20 22:59:24 -0700 -
41321611fd
feature: add API server and client with Google protocol
Junji Hashimoto
2025-08-20 11:05:09 +0900 -
41a86d41a9
Fix preadv error: only enable if we have a handle
Jan Wassenberg
2025-08-15 06:30:07 -0700 -
78573b6718
Internal change. Add deduction for 270M.
Phil Culliton
2025-08-14 08:04:10 -0700 -
d044801c1d
Internal change
Phil Culliton
2025-08-13 09:47:05 -0700 -
71406cf6d0
More profiler interface fixes: hwy:: plus avoid ADD_ZONE
Jan Wassenberg
2025-08-13 03:15:07 -0700 -
faa4102992
(Resubmit) Prepare profiler annotations for new API
Jan Wassenberg
2025-08-13 01:37:53 -0700 -
a2d9133f7d
Prepare profiler annotations for new API
The gemma.cpp Authors
2025-08-11 17:51:09 -0700 -
4cbf63e6f0
Prepare profiler annotations for new API
Jan Wassenberg
2025-08-11 15:34:20 -0700 -
eef564e8f0
Prepare profiler annotations for new API
Jan Wassenberg
2025-08-08 16:50:54 -0700 -
2e9c93a609
Merge pull request #649 from KaranocaVe:main
Copybara-Service
2025-08-08 10:35:57 -0700 -
33fbac0880
Exporter updates/fixes
Jan Wassenberg
2025-08-04 22:35:59 -0700 -
4e062d68f7
Update BlobWriter comments, WriteAll->Finalize
Jan Wassenberg
2025-08-04 10:00:54 -0700 -
701841897b
Default to disabling per-socket parallelization
Jan Wassenberg
2025-08-04 09:48:22 -0700 -
b56b2f05e4
Automated Code Change
Ivo Ristovski List
2025-08-01 13:29:16 -0700 -
eaf05cd04e
Merge 6dd1cd277f into 799c264df3
copybara-service[bot]
2025-08-01 20:11:15 +0000 -
6dd1cd277f
Automated Code Change
The gemma.cpp Authors
2025-07-11 05:32:57 -0700 -
799c264df3
Pre-tune thread pool before matmul
Jan Wassenberg
2025-07-31 08:44:47 -0700 -
32286f0465
Merge branch 'dev' into main
KaranocaVe
2025-07-31 22:40:56 +0800 -
50ee1a3e92
Write SBS progressively.
Charles Zhao
2025-07-31 06:05:02 -0700 -
0ea118ebbe
Update run.cc, CMakeLists and README for incompatible code, dependency changes and argument updates
KaranocaVe
2025-07-31 00:59:16 +0800 -
8715eda512
Improved layer idx parsing
Jan Wassenberg
2025-07-30 05:49:13 -0700 -
d831ddce5b
Fix file mapping: was letting the smart pointer go out of scope
Jan Wassenberg
2025-07-30 04:29:27 -0700 -
2141d4788d
Add IsAppendOnly flag to file and if true, disable parallel writes
Jan Wassenberg
2025-07-30 01:51:08 -0700 -
d22ba2ac96
Update layer index parsing and allow tokenizer override
Jan Wassenberg
2025-07-30 01:21:54 -0700 -
d1638587f0
1.14x batch decode speedup: parallelize RMSNorm ops
Jan Wassenberg
2025-07-30 00:54:55 -0700 -
ac0d751d20
Rename GetModelConfig->Config
Jan Wassenberg
2025-07-29 10:17:14 -0700 -
33fabd4ed1
Internal change.
Jeremiah Harmsen
2025-07-29 08:20:36 -0700 -
e76e29ce11
De-singleton ThreadingContext so callers can pass in their own
Jan Wassenberg
2025-07-22 02:07:58 -0700 -
5474146129
Back to f32 kv_cache, but via typedef
Jan Wassenberg
2025-07-21 07:04:55 -0700 -
56c9196eb6
Add blob_path to config deduction message
Jan Wassenberg
2025-07-11 18:58:16 -0700 -
349c86f2d9
Fix bench_matmul perf regression: A input should be padded
Jan Wassenberg
2025-07-11 07:35:52 -0700 -
4bc44d5678
Minor: ModelWeightsPtrs -> WeightsPtrs
Jan Wassenberg
2025-07-11 06:10:51 -0700