Jan Wassenberg 8d0882b966		2025-05-06 04:44:21 -07:00

Huge refactor of weight handling and model loading.

Weight handling:
- new ModelStore2 supports both pre-2025 multi-file and single-file formats
- simpler ForEachTensor with TensorArgs
- tensors are constructed with their full suffixed name

I/O:
- support mmap and stride
- Simplified SbsWriter, single insert(); add SbsReader

Misc:
- kMockTokenizer: allow creating with unavailable tokenizer
- configs.h: Simpler enum validity checks via kSentinel
- matmul.h: remove unused enable_bind (now in allocator.h)
- tensor_info: single TensorInfoRegistry class, rename from tensor_index.h

Frontends:
- Replace Allocate/CreateGemma with ctor(LoaderArgs, MatMulEnv&)
- Deduce model/weight type, remove --model and parsing
- Replace most common.h includes with configs.h
- Remove --compressed_weights, use --weights instead
- Remove ModelInfo, replaced by ModelConfig.

Backprop:
- Reduce max loss, remove backward_scalar_test (timeout)
- Update thresholds because new RandInit changes rng eval order and thus numerics

PiperOrigin-RevId: 755317484
build	[WIP] decouple GemmaImpl from CLI args	2024-03-06 15:06:41 -05:00
BUILD.bazel	Major refactor of allocator/args:	2025-04-10 01:29:54 -07:00
CMakeLists.txt	Refactor Gemma ctor and improve pool NUMA support	2025-03-14 10:19:00 -07:00
README.md	Major refactor of allocator/args:	2025-04-10 01:29:54 -07:00
run.cc	Huge refactor of weight handling and model loading.	2025-05-06 04:44:21 -07:00

README.md

Hello World Example

This is a minimal template project for using gemma.cpp as a library. Instead of providing an interactive interface, it sets up the model state and generates text for a single hard-coded prompt.

The build steps are similar to those for the main gemma executable. For now, only cmake/make builds are available (PRs welcome for other build options).

First, use cmake to configure the project, starting from the hello_world example directory (gemma.cpp/examples/hello_world):

cmake -B build

This sets up a build configuration in gemma.cpp/examples/hello_world/build. Note that this fetches libgemma from a git commit hash on GitHub. Alternatively, to build against the local version of gemma.cpp, use:

cmake -B build -DBUILD_MODE=local

Make sure you delete the contents of the build directory before changing configurations.

Then use make to build the project:

cd build
make hello_world

As with the top-level gemma.cpp project, you can use the make command's -j flag to run parallel jobs for faster builds.

After the build, the gemma.cpp/examples/hello_world/build directory should contain a hello_world executable. Run it from that directory with the same three model arguments as gemma.cpp, which specify the tokenizer, the compressed weights file, and the model type. For example:

./hello_world --tokenizer tokenizer.spm --weights 2b-it-sfp.sbs --model 2b-it

Note: the most recent commit to run.cc (8d0882b966) lists "Deduce model/weight type, remove --model and parsing", so builds at or after that revision may not accept the --model flag. If the flag is rejected, omit it.

This should print a greeting to the terminal:

"Hello, world! It's a pleasure to greet you all. May your day be filled with joy, peace, and all the things that make your heart soar.

For a demonstration of constrained decoding, add the --reject flag followed by a list of token IDs (note that it must be the last flag, since it consumes every subsequent argument). For example, to reject variations of the word "greeting", run:

./hello_world [...] --reject 32338 42360 78107 106837 132832 143859 154230 190205
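
The flag-parsing rule described above (everything after --reject is treated as a token ID) can be sketched in C++ as follows. This is a minimal illustration, not the code in run.cc: the helper names ParseRejectTokens and AcceptToken are hypothetical, and it assumes the generation loop consults a per-token predicate before emitting each token.

```cpp
#include <cstdlib>
#include <cstring>
#include <set>

// Hypothetical helper (not taken from run.cc): collects every argument that
// follows "--reject" as a token ID. Because the inner loop runs to the end of
// argv, --reject swallows all later arguments and therefore must come last.
// Note: std::atoi maps non-numeric input to 0; real code may want stricter
// validation.
std::set<int> ParseRejectTokens(int argc, const char* const* argv) {
  std::set<int> reject_tokens;
  for (int i = 1; i < argc; ++i) {
    if (std::strcmp(argv[i], "--reject") == 0) {
      while (++i < argc) {
        reject_tokens.insert(std::atoi(argv[i]));
      }
    }
  }
  return reject_tokens;
}

// Hypothetical predicate: returns true if `token` may be emitted, i.e. it is
// not in the rejected set. A decoder that supports constrained decoding would
// be expected to skip or resample any token for which this returns false.
bool AcceptToken(const std::set<int>& reject_tokens, int token) {
  return reject_tokens.count(token) == 0;
}
```

With the arguments from the example above, ParseRejectTokens would return a set of eight IDs, and AcceptToken would return false for 32338 and true for any ID not listed. A std::set keeps the lookup simple; for long reject lists a std::unordered_set or a bitmap over the vocabulary might be a better fit.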