Commit Graph

147 Commits

Author SHA1 Message Date
Jan Wassenberg 1b72c22345 Refactor Gemma ctor and improve pool NUMA support
Gemma receives a MatMulEnv arg, with comment on lifetime
Split threading into topology so the latter can be used in allocator
Add AllocClasses() for non-POD (ThreadPool)
Support binding pool to NUMA node
Update threading_test with latency measurements
Also update Highway version.

PiperOrigin-RevId: 736904748
2025-03-14 10:19:00 -07:00
Jan Wassenberg f9d93e4a42 Matmul rewrite: fp64 sums, hierarchical parallelization, cache-blocking, autotuning
Remove empty matmul_unit_test.
Up to 25 TFLOP/s on 2xZen4 for 512,3072,24576.

PiperOrigin-RevId: 729123576
2025-02-20 08:33:46 -08:00
Apoorv Reddy d854471ae2 Use vectorized TopK using highway VQSelect
PiperOrigin-RevId: 728159153
2025-02-18 05:01:39 -08:00
Apoorv Reddy 0e5b59d24d Implements FusedSoftmaxAndSampleTopK.
This computes softmax on the top-K logits, instead of computing softmax first and then getting top-K probs. So we end up avoiding renormalizing too. Additionally, modify softmax to do temperature scaling, if temp != 1.0

PiperOrigin-RevId: 727702149
2025-02-16 21:30:06 -08:00
Jan Wassenberg 5563d94811 Add fork/join latency benchmark
PiperOrigin-RevId: 725174042
2025-02-10 05:23:44 -08:00
Jan Wassenberg a60b564b88 Infra improvements (2)
ops.h: move CreateInvTimescale to allow calling without depending on gemma
Pass around MatMulEnv instead of pools to avoid re-creating the env
profiler.h can now be used outside SIMD code
allocator: add StepBytes and QuantumSteps
rename worker thread with package/cluster in the name
threading: add Visit* to IndexRange
PiperOrigin-RevId: 718766704
2025-01-23 01:55:19 -08:00
Jan Wassenberg c4398fc72d Infra improvements:
allocator: support mmap, fixed Bind, add padding
bench_matmul: Add PreventElision
BUILD: add ops_test build target
matmul.h: move ConstMat here; dynamic alloc of MatMulEnv
matmul_test: remove benchmarking
replace fprintf with HWY_WARN
threading.cc: support splitting large clusters (disabled); package_idx->pkg_idx, smaller IndexRangePartition
PiperOrigin-RevId: 717512274
2025-01-20 06:22:49 -08:00
Ray Smith 9d40f0117e Added ability to load/save a complete model file, including tokenizer.
PiperOrigin-RevId: 707914366
2024-12-19 07:59:41 -08:00
The gemma.cpp Authors 5bc356f18f Internal change
PiperOrigin-RevId: 707268913
2024-12-17 15:15:57 -08:00
Ray Smith 6254f2e5ca Removed duplicated tensor sizes from weights.h by changing the constructor used for MatPtrT
PiperOrigin-RevId: 705085054
2024-12-11 06:30:28 -08:00
The gemma.cpp Authors 66bb435121 No public description
PiperOrigin-RevId: 704178245
2024-12-09 00:49:36 -08:00
Jan Wassenberg 6a34e9c547 Print cache info and update Highway version for that
PiperOrigin-RevId: 702318451
2024-12-03 06:31:52 -08:00
Jan Wassenberg f74d496879 Threading/infra improvements.
* Add Parallelize*Range helpers and partitioning helpers
* Refactor Pinning class, store original affinity (required to construct another NestedPools after pinning happened)

Compress:
* prevent Compress printing stats in tests
* zero-pad tensors

Matmul:
* add matmul_unit_test (TODO) and bench_matmul
* matmul_test: change norm to row vectors (that is what is added) and include bf16 rounding error
* Prepare for L2/L3 retrieval
PiperOrigin-RevId: 700603811
2024-11-27 01:12:00 -08:00
Stanko Novakovic 109a4d9f85 Add a simple benchmark for batching.
This is a simple Gemma benchmark with a fixed batch size of 32.

PiperOrigin-RevId: 698843573
2024-11-21 10:59:49 -08:00
Ray Smith 73640d2521 Added tensor_index as a single source of truth on tensor shapes/sources and transformations
PiperOrigin-RevId: 697903886
2024-11-19 00:25:39 -08:00
Jan Wassenberg 868b01601f Simpler MatMul interface, vocab types, Tristate for use_spinning
Add Extents2D, Range2D vocab types
Matmul uses ConstMat for inputs and RowPtr for output
Move RowVectorBatch to basics.h
Separate threading.cc
Fix topology string: report cores not LPs, and #HT
Move QStride/IsMHA into LayerConfig
ImageTokens does not require make_unique.
matmul_test: no longer require template args
PiperOrigin-RevId: 692963605
2024-11-04 07:48:29 -08:00
Daniel Keysers 583bd93e9a Factor out addition of ViTConfig to a ModelConfig.
Use ModelConfig values for ImageTokens.
Output timing info for image token generation.
Add a method to copy image data into Image class directly.
Minor changes: pipe ModelTraining to more places.

PiperOrigin-RevId: 690572283
2024-10-28 05:29:33 -07:00
Jan Wassenberg 02ce1e344f Use NestedPools, add NUMA infra
Improved threading.h, fix thread counts for single package/cluster systems
Temporarily forces to a single socket. Prefill 29.28 tps, decode 6.92.

Also fix benchmarks.cc build, update tensor allocator to Allocator

PiperOrigin-RevId: 687307167
2024-10-18 08:11:18 -07:00
Ray Smith 0d68555f87 Eliminated TConfig.
Changed CompressedLayer and CompressedWeights to be constructed with an instance of a LayerConfig and WeightsConfig respectively.
Added CompressedModel to remove ByteStorageT and get rid of most of the type casting, as well as allowing the default destructor to be used and work properly.
Adjusted WeightsWrapper and ForwardLayer etc to match.
The only remaining template arg is the weight type.
This enables all the instantiations to be deleted, apart from one per type.
It also enables (but not yet done) the config to be stored in the blob file instead of having to be specified separately.
Reduces the size of the gemma_lib and weights shared libraries by a factor of 4.3 and 3.2 respectively.

PiperOrigin-RevId: 686870060
2024-10-17 05:04:22 -07:00
The gemma.cpp Authors dfda53e634 Benchmark gemma.cpp with different length inputs.
PiperOrigin-RevId: 684607945
2024-10-10 15:59:26 -07:00
Jan Wassenberg 6ab3ff5bde Minor cleanup, Windows+Bazel build fixes
add app.h comment
compress-inl: remove unused typedef
gemma-inl: add missing HWY_ATTR and cast
separate sum-inl.h and basics.h headers
replace more hwy::bfloat16_t with BF16
update include pragmas
update dot_test thresholds
update Highway version in Bazel for HWY_RCAST_ALIGNED fix
PiperOrigin-RevId: 684464326
2024-10-10 09:05:06 -07:00
Ray Smith 85958f5fd3 Added MatPtr/MatPtrT/MatStorageT/MatStorage as a dynamically-sized replacement for CompressedArray.
Definition of array size is moved to the constructor.
Allocation is separate and parallelized.
All users of weights_raw.h migrated to CompressedWeights and weights_raw.h deleted.
Replaced all previous ForEachTensor functions with a single unified function.

PiperOrigin-RevId: 684451604
2024-10-10 08:22:30 -07:00
Jan Wassenberg 2c28b18eb0 Add NestedPools: one per socket/cluster
Use in dot_test
app.h: add new flags and rename num_threads to max_threads
matmul: Parallelize MatMulSlow and enable spinning, more large/fewer medium test cases
PiperOrigin-RevId: 683216386
2024-10-07 09:40:19 -07:00
Krzysztof Ostrowski b3239bf509 Internal change.
PiperOrigin-RevId: 681530185
2024-10-02 11:33:06 -07:00
Jan Wassenberg 2d14d796e3 1.09x decode speedup for topk=1/temp0: fuse softmax and sample
PiperOrigin-RevId: 680589099
2024-09-30 08:37:41 -07:00
Daniel Keysers f8835fe4a4 Add support for PaliGemma Vision-LM (224x224) to gemma.cpp
See https://arxiv.org/abs/2407.07726 for a description of the model.
Because PaliGemma operates as a prefix-LM on the image+prompt, add support for that.

PiperOrigin-RevId: 677841119
2024-09-23 10:09:38 -07:00
Jan Wassenberg c6c10e0a53 Fix topology display for platforms where it fails (Apple)
PiperOrigin-RevId: 677800053
2024-09-23 08:14:54 -07:00
Jan Wassenberg 8c0a8834c1 Major compression update, arbitrary-len unpack + new Dot
Compression:
* Implement {any packed} x {bf16, f32} 'Load2' and DecompressAndZeroPad
* New compression test for all packed formats, add to GEMMA_TEST_FILES, remove from sfp/nuq_test
* Decompress->DecompressAndZeroPad, use PackedSpan for args with bounds checking
* NUQ: support arbitrary-length enc/dec
* New compression/shared, remove sfp.h and nuq.h
* Move Store2 into Traits and provide Compress2 wrapper
* Remove unused Decompress()-with-pool overload
* Simplify CompressedArrayLen, rename to CompressedArrayElements
* Remove unused DistortionStats b_l1_

Misc:
* Add compensated and Kahan dot, support any length
* Use same Dot function everywhere
* Move exact arithmetic functions into fp_arith
* use FloatPtr and MatPtr typedefs in tests; less stack usage
* Rename args to packed/raw
* Remove Traits::Name, instead TypeName<T>()
* Move kMaxSFP and kClusters/kGroupSize into Sfp/NuqStream
PiperOrigin-RevId: 672868468
2024-09-10 02:22:19 -07:00
Jan Wassenberg c29e9752c7 Refactor/cleanup, remove even_odd
* New compression/shared.h, remove sfp.h
* Remove unused DistortionStats b_l1_
* Move exact arithmetic functions into fp_arith
* Remove even_odd optimization for MatVec (mostly unused)
* use BF16 typedef more widely
* Add kMaxSFP constant

PiperOrigin-RevId: 670996386
2024-09-04 09:25:13 -07:00
Jan Wassenberg 4033ed9e78 Avoid duplication of RMSNorm, support all activation/weight types
Add test for RMSNorm
Rename VectorizedRopeAndMulBy -> RopeAndMulBy

Move test_util to util/

PiperOrigin-RevId: 668332927
2024-08-28 01:26:55 -07:00
Jan Wassenberg 2308514e5a Experiment with compensated dot product.
ULP difference vs exact is 0..1, vs 200-5000 for previous.
Runtime overhead is 2.5-4x for f32 input.

PiperOrigin-RevId: 668084019
2024-08-27 12:05:35 -07:00
Apoorv Reddy c6eb3b6f0d VectorizedRopeAndMulBy.
~8x reduction (tested on few prompts) in Rope.
~3.8% prefill latency improvement.
~2.6% decode latency improvement.

PiperOrigin-RevId: 664650108
2024-08-18 23:17:01 -07:00
Jan Wassenberg 301dc8067a Major MatMul update, 1.9-2.3x speedup on Zen4 via bf16 mul
Supports converting all weight/activation formats to native MulT (bf16/f32)

Also:
- ConstMat/MutableMat for const correctness
- Move RowVectorBatch to allocator.h so it can be used from Matmul
- Add matmul.h so MatMulEnv can be used from Activations
- Remove kMaxThreads, detect from PerClusterPools
- Build fix: -inl.h files must be textual_hdrs, and highway.h should precede -inl.h

```
zen4 new
64, 24576, 3072, add=0, MatTA=bf16, MatTB=sfp:   616.6 GFLOPS.
64, 3072, 24576, add=0, MatTA=bf16, MatTB=sfp:   460.7 GFLOPS.
64, 24576, 3072, add=0, MatTA=f32, MatTB=sfp:    598.6 GFLOPS.
64, 3072, 24576, add=0, MatTA=f32, MatTB=sfp:    435.6 GFLOPS.

zen4 old
64, 24576, 3072, add=0, MatTA=f32, MatTB=sfp:    257.5 GFLOPS.
64, 3072, 24576, add=0, MatTA=f32, MatTB=sfp:    231.9 GFLOPS.
```

PiperOrigin-RevId: 663729812
2024-08-16 07:52:20 -07:00
RangerUFO ea72575e56 Fix build issues when tests are enabled 2024-08-12 18:50:23 +02:00
Jan Wassenberg 2ebbe4076f 1.03-1.08x decode speedup: precompute Rope theta, fuse
Split attention into functions, move into class.
Fuse Rope and MulBy, allow non-in-place version to avoid copy from q to KV.
Sink if() into MaybeLogitsSoftCap.

PiperOrigin-RevId: 661168418
2024-08-09 01:23:24 -07:00
Jan Wassenberg 5e433e774a 1.1x prefill speedup, revamp threading in preparation for hierarchical parallelism.
Limit thread counts to detected. Add max_clusters arg.
Update detection logic to check for smt0 - previously we pinned to some siblings.

PiperOrigin-RevId: 659755311
2024-08-05 18:50:09 -07:00
Phil Culliton 1982a6ba00 Internal change
PiperOrigin-RevId: 657831926
2024-07-30 20:24:54 -07:00
Jan Wassenberg a24eda8d02 Split matmul into matvec; add large matrix benchmark
Rename var names to row/col for more clarity.
Better estimate error tolerance via max abs col sum.

PiperOrigin-RevId: 657601791
2024-07-30 08:29:11 -07:00
Jan Wassenberg aaf51898b6 Major revamp #2 of Prefill: fix token order, parallel for multi-query
- Allocate only the required KV caches and activation batch size
- Add flags for batch sizes
- Const-correct interface: Span of const int.
- Also clean up the KVCache arg to a span.
- Move kPrefillBatchSize into RuntimeConfig and remove related global constants.

PiperOrigin-RevId: 655893197
2024-07-25 03:28:55 -07:00
Jan Wassenberg 85cac13fb1 Split up ops.h into ops/ops-inl and matmul-inl
PiperOrigin-RevId: 654068303
2024-07-19 11:21:48 -07:00
Jan Wassenberg 12016d31c3 Major Prefill/Generate cleanup, 1.3x Prefill speedup
This fixes TTFT, which was not including prefill.

PiperOrigin-RevId: 653690626
2024-07-18 11:16:46 -07:00
Andrey Vlasov 3e92088595 Remove allocation from GEMM_4x4_Tile when decoding compressed weights by implementing
SfpCodec::Dec2F and ComressTraits<T>::Decompress2 for all supported types. It also allows to remove one of the specializations of GEMM_4x4_Tile, handling compressed MatB with one function. As before even when MatA is bf16 it is using 32-bit registers for computations.

Measurements for a 2b-it sfp-encoded model on a  AMD Ryzen Threadripper PRO 3945WX 12-Cores:
baseline:
```
32.6254 prefill tokens / sec
8.91429 tokens / sec
115 milliseconds time to first token
```
this change:
```
54.3045 prefill tokens / sec
16.8191 tokens / sec
56 milliseconds time to first token
```
PiperOrigin-RevId: 651369694
2024-07-11 05:13:39 -07:00
Daniel Keysers cf76f0a401 Update gemma_test to also pass for the v1.1. models.
Make it an error if the model cannot be loaded.

PiperOrigin-RevId: 650232602
2024-07-08 06:45:37 -07:00
Jan Wassenberg cbb67b4ee0 Move benchmark_helper to evals/, weights_raw to compression/.
PiperOrigin-RevId: 650155983
2024-07-08 01:13:23 -07:00
Jan Wassenberg f823371691 Cleanup: move util/compress and convert_weights to compression/
Also remove unused models/, lint convert_weights

PiperOrigin-RevId: 649613088
2024-07-05 04:16:52 -07:00
Jan Wassenberg 118e802b00 Fix gemma_test - moved to evals/.
PiperOrigin-RevId: 649338633
2024-07-04 02:04:05 -07:00
Jan Wassenberg c7c3daa624 7x compile time speedup: shard gemma.cc
Use overloaded functions defined in gemma/instantiations.
Also split out activations.h.

PiperOrigin-RevId: 649053122
2024-07-03 06:35:04 -07:00
Jan Wassenberg 09a7e75ead Prep for sharding gemma.cc: split into kv_cache, tokenizer.
Move activations.h to backprop/ to make space for another activations.h.

PiperOrigin-RevId: 648744500
2024-07-02 09:31:06 -07:00
Jan Wassenberg af8eb2fde3 Declutter gemma/ directory, move binaries to evals/ and util/.
PiperOrigin-RevId: 648400795
2024-07-01 09:51:04 -07:00
Jan Wassenberg e588a7f45d Add config for att/final cap, skip max-subtract. Fixes #278
Also update includes/deps for backprop/.

PiperOrigin-RevId: 648399222
2024-07-01 09:45:26 -07:00
The gemma.cpp Authors da7507e6f0 Add prompt batching to Gemma.cpp.
This CL adds a new function to Gemma that allows for batching of multiple prompts. The function takes a vector of prompts and returns a vector of responses. The prompts are processed in parallel, and the responses are returned in the same order as the prompts.

PiperOrigin-RevId: 648367559
2024-07-01 07:51:31 -07:00
Paul Chang aa57fc3952 Remove unused BUILD dependency
PiperOrigin-RevId: 646519547
2024-06-25 10:12:13 -07:00
Jan Wassenberg 7d0720675f Move raw_weights into separate header, used mainly by compress_weights.
Fix warnings in backprop/* (include)

PiperOrigin-RevId: 643983136
2024-06-17 06:17:02 -07:00
Jan Wassenberg d3c6a45b59 Major duplicated code reduction in test/benchmarks
Helper functions to tokenize/wrap
Move LayersOutputFunc into RuntimeConfig
AcceptFunc passes the probability
Implement StringFromType using the parser, and verify results match

PiperOrigin-RevId: 643255119
2024-06-14 00:16:25 -07:00
Andrey Vlasov bf78a065e1 Make gemma/ops_test `large`.
PiperOrigin-RevId: 642923146
2024-06-13 03:33:46 -07:00
Ray Smith bdf33c7008 Updated benchmarks.cc to recent changes to Gemma API.
PiperOrigin-RevId: 642285902
2024-06-11 08:55:40 -07:00
Daniel Keysers 8ec8eef524 Add internal initialization code to debug_prompt.
PiperOrigin-RevId: 642276350
2024-06-11 08:19:38 -07:00
The gemma.cpp Authors 57d0ea95d0 Add buildcleaner: keep pragma to a dep in ops_test build rule and run build_cleaner.
PiperOrigin-RevId: 642275845
2024-06-11 08:17:47 -07:00
Jan Wassenberg 3e2396f98c Use Loader/AppArgs to construct gemma_test model, simplify AcceptFunc
accept_token: allow default, check if empty when using
allow mixing sample_func and stream_func, call the latter after the former
Also fix missing includes/deps.
PiperOrigin-RevId: 642240012
2024-06-11 05:53:10 -07:00
Jan Wassenberg c7f5e93136 Update benchmark with internal init
PiperOrigin-RevId: 641929308
2024-06-10 09:35:16 -07:00
Jan Wassenberg c1c6714ad4 Internal experiment
PiperOrigin-RevId: 641915024
2024-06-10 08:46:10 -07:00
Jan Wassenberg 95fd7263ae Add missing test deps
PiperOrigin-RevId: 641880024
2024-06-10 06:22:07 -07:00
The gemma.cpp Authors 020db5a67d No public description
PiperOrigin-RevId: 641816837
2024-06-10 01:12:42 -07:00
Jan Wassenberg f9b390b134 Support all weight types in a single binary.
This changes the command line flags, but the default value retains the previous behavior.

Also add a CreateGemma helper to enable extra args without interface changes.

PiperOrigin-RevId: 641266411
2024-06-07 09:04:45 -07:00
Zoltan Szabadka 465998d25a Add support for custom sampling function to runtime config.
With this addition the ComputeCrossEntropy function can be moved
to its own library, because now we can compute it using only the
public API functions from gemma.h
2024-06-07 11:45:07 +00:00
Copybara-Service f7ac7092d6 Merge pull request #212 from szabadka:adam2
PiperOrigin-RevId: 641182573
2024-06-07 02:25:18 -07:00
Jan Wassenberg 57c2cd8b52 Simplifications: remove GemmaInterface and GemmaImpl
Split common and weights into separate lib
Remove common-inl (does not have to be SIMD code), activations.cc
Centralize switch(Model) to avoid duplication
Move CompressWeightsT to compress_weights.cc
Move LoadWeights to weights.cc

PiperOrigin-RevId: 640869202
2024-06-06 05:54:21 -07:00
Zelalem Aweke 9e213b3d96 Use system topology to pin threads across clusters.
PiperOrigin-RevId: 640151974
2024-06-04 07:50:32 -07:00
Zoltan Szabadka be1d58d4fa Fix bazel build 2024-06-04 11:13:19 +00:00
Apoorv Reddy 7f4b85d00b Add MMLU eval to github
PiperOrigin-RevId: 635495178
2024-05-20 10:20:53 -07:00
Apoorv Reddy f1eab987d8 Store tokens/sec in auxiliary struct TimingInfo.
PiperOrigin-RevId: 633108908
2024-05-13 00:04:19 -07:00
Jan Wassenberg 12fb2f05cf Add per-thread even_odd storage for #166.
Also inline ProjQ and ProjKV lambdas,
add missing includes/deps for ops_test.

PiperOrigin-RevId: 629460608
2024-04-30 10:42:23 -07:00
Jan Wassenberg 7a12e29027 Add error-checking for py binding, add missing include+hwasan check
PiperOrigin-RevId: 628453112
2024-04-26 10:59:41 -07:00
Jan Wassenberg e9a0caed87 Further improve IO, enable multiple backends without -D.
Move Path into io.h and use for opening files.
Removes dependency of gemma_lib on args.
Separate Windows codepath instead of emulating POSIX functions.

Plus lint fixes.

PiperOrigin-RevId: 626279004
2024-04-19 00:40:29 -07:00
Jan Wassenberg a8ceb75f43 Improved IO abstraction layer
Move to unique_ptr-like File class.
Move `if OS_WIN` into wrapper functions.
exists -> Exists.

PiperOrigin-RevId: 625923056
2024-04-17 23:15:07 -07:00
Andrey Mikhaylov 4ef3da733a Fixed minor things and added comments. 2024-04-12 15:39:16 +00:00
Andrey Mikhaylov 03284d752e Added layers output functionality to gemma and a binary debug_output to save the outputs to a json file. 2024-04-12 15:39:16 +00:00
Jan Wassenberg a982ec1287 Move code to gemma/ so we can remove error-prone copybara: comments.
Also fix includes and Lint warnings.

PiperOrigin-RevId: 623127487
2024-04-09 04:45:42 -07:00
Jan Wassenberg 4326249d0a Fix includes
PiperOrigin-RevId: 622456877
2024-04-06 09:27:09 -07:00
Jan Wassenberg a3a0f78fda Merge pull request #131 from veluca93:benchmark-and-test
PiperOrigin-RevId: 622452794
2024-04-06 18:06:03 +02:00
Luca Versari 5862d1f995 Add a benchmark and additional tests.
Also add a script to help running sanitizer builds, and do some cleanup.

Co-authored-by: Andrey Mikhaylov <amik@google.com>
Co-authored-by: Eugene Kliuchnikov <eustas@google.com>
Co-authored-by: Sami Boukortt <sboukortt@google.com>
Co-authored-by: Zoltan Szabadka <szabadka@google.com>
2024-04-06 12:54:52 +02:00
Zoltan Szabadka b670d43e4f Add standalone tool to compress weights.
Co-authored-by: Eugene Kliuchnikov <eustas@google.com>
2024-04-03 14:54:08 +00:00
Jan Wassenberg c1d3c3284c Add ops_test to BUILD, rename transformer_ops->ops, fix includes.
Also fix copybara. Refs #105

PiperOrigin-RevId: 619157071
2024-03-26 05:37:31 -07:00
Jan Wassenberg ba86c8d590 Remove obsolete copybara tags, faster bazel builds (debug)
PiperOrigin-RevId: 617576799
2024-03-21 04:19:02 +01:00
Eric Ye 89be4c3de8 No public description
PiperOrigin-RevId: 617315030
2024-03-21 04:18:36 +01:00
Jan Wassenberg 06cea2bcdb Remove obsolete copybara tags, faster bazel builds (debug)
PiperOrigin-RevId: 617576799
2024-03-20 23:37:39 +01:00
Eric Ye ffd02c59ad No public description
PiperOrigin-RevId: 617315030
2024-03-20 23:37:12 +01:00
Jan Wassenberg 7d5364bb80 Remove obsolete copybara tags, faster bazel builds (debug)
PiperOrigin-RevId: 617576799
2024-03-20 11:31:59 -07:00
Copybara-Service 0221956b2e Merge pull request #87 from google:refactor-tidy
PiperOrigin-RevId: 615204427
2024-03-12 16:10:47 -07:00
Austin Huang a9aa63fd2e Fix bazel build attempt #2 (remove //base through automation change).
https://github.com/google/gemma.cpp/actions/runs/8252325981/job/22571549312

PiperOrigin-RevId: 615097246
2024-03-12 10:43:26 -07:00
Austin Huang 9345b0aed5 Fix bazel build failure:
https://github.com/google/gemma.cpp/actions/runs/8252325981/job/22571549312

PiperOrigin-RevId: 615073081
2024-03-12 09:41:03 -07:00
Copybara-Service ccd055e06b Merge pull request #82 from google:examples
PiperOrigin-RevId: 615066980
2024-03-12 09:24:24 -07:00
Jan Wassenberg bb9b023502 Support Bazel builds. Fixes #16
Also fix nuq/sfp-inl: warning, cast, and disable SCALAR

PiperOrigin-RevId: 612704056
2024-03-04 22:07:25 -08:00
Dan Zheng 1243be71c4 Copybara import of the project:
--
e0179bad83 by Dan Zheng <danielzheng@google.com>:

Rename BUILD to BUILD.bazel.

This fixes an error on macOS due to `build` and `BUILD` having conflicting names.

--
74b27074e1 by Dan Zheng <danielzheng@google.com>:

Enable macos-latest in GitHub Actions CI.

--
c08de58e6a by Dan Zheng <danielzheng@google.com>:

Fix concurrency key in GitHub Actions.

Use matrix configuration in concurrency key.

COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gemma.cpp/pull/36 from dan-zheng:rename-build-bzl b4b978f02b
PiperOrigin-RevId: 610156681
2024-02-25 04:34:15 -08:00
Silvio Traversaro 696597383c Copybara import of the project:
--
19694e1f2e by Silvio Traversaro <silvio@traversaro.it>:

Do not pass explicitly -O2 flag to compiler in Release build

COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gemma.cpp/pull/3 from traversaro:patch-1 19694e1f2e
PiperOrigin-RevId: 610096914
2024-02-24 20:41:33 -08:00
Dan Zheng 3af439621e
Rename BUILD to BUILD.bazel. (#36)
* Rename BUILD to BUILD.bazel.

This fixes an error on macOS due to `build` and `BUILD` having conflicting names.

* Enable macos-latest in GitHub Actions CI.

* Fix concurrency key in GitHub Actions.

Use matrix configuration in concurrency key.
2024-02-24 14:52:59 -08:00
Jan Wassenberg e7a6f44ba5 Sync infra
PiperOrigin-RevId: 610035407
2024-02-24 13:00:00 -08:00
Renamed from BUILD (Browse further)