mirror of https://github.com/google/gemma.cpp.git
whitespace cleanup
parent d37f9c3604
commit 060c8862dd
@@ -73,7 +73,7 @@ The implementation code is roughly split into 4 layers, from high to low level:
 
 Besides these layers, supporting utilities are:
 
-- `compression/` - model compression operations. The 8-bit switched floating
+- `compression/` - model compression operations. The 8-bit switched floating
 point model conversion is here.
 - `util/` - command line argument handling and any other utilities.
 
@@ -85,17 +85,17 @@ before finalizing PR for submission.
 
 ## Compile-Time Flags (Advanced)
 
-There are several compile-time flags to be aware of (note these may or may not
+There are several compile-time flags to be aware of (note these may or may not
 be exposed to the build system):
 
-- `GEMMA_WEIGHT_T` : Sets the level of compression for weights (surfaced as
-WEIGHT_TYPE in CMakeLists.txt). Currently this should be set to `SfpStream`
-(default, if no flag is specified) for 8-bit SFP, or `hwy::bfloat16_t` to
+- `GEMMA_WEIGHT_T` : Sets the level of compression for weights (surfaced as
+WEIGHT_TYPE in CMakeLists.txt). Currently this should be set to `SfpStream`
+(default, if no flag is specified) for 8-bit SFP, or `hwy::bfloat16_t` to
 enable for higher-fidelity (but slower) bfloat16 support. This is defined in
 `gemma.h`.
 - `GEMMA_MAX_SEQ_LEN` : Sets maximum sequence length to preallocate for the KV
 Cache. The default is 4096 tokens but can be overridden. This is not exposed
-through `CMakeLists.txt` yet.
+through `CMakeLists.txt` yet.
 
 In the medium term both of these will likely be deprecated in favor of handling
 options at runtime - allowing for multiple weight compression schemes in a single
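The compile-time flags described in the hunk above follow the common pattern of a preprocessor macro with a fallback default. The following is a minimal sketch of that pattern, not the actual contents of `gemma.h`. The macro names `GEMMA_WEIGHT_T` and `GEMMA_MAX_SEQ_LEN` and the 4096-token default come from the text. The `SfpStream` struct here is a hypothetical stand-in for the real type, and `WeightT`, `kMaxSeqLen`, and `KVCacheSketch` are illustrative names that are assumptions of this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for the 8-bit SFP weight type; the real definition
// lives in gemma.cpp's compression code and is not reproduced here.
struct SfpStream {
  std::uint8_t byte;
};

// Fall back to the documented defaults when the build passes no -D flag.
#ifndef GEMMA_WEIGHT_T
#define GEMMA_WEIGHT_T SfpStream
#endif

#ifndef GEMMA_MAX_SEQ_LEN
#define GEMMA_MAX_SEQ_LEN 4096
#endif

// Illustrative aliases (assumed names) so the rest of the code never touches
// the macros directly.
using WeightT = GEMMA_WEIGHT_T;
constexpr std::size_t kMaxSeqLen = GEMMA_MAX_SEQ_LEN;

// Hypothetical cache that preallocates storage for every token position up
// front, which is why the maximum sequence length must be known at compile
// time in this sketch.
struct KVCacheSketch {
  explicit KVCacheSketch(std::size_t dims_per_token)
      : storage(kMaxSeqLen * dims_per_token, 0.0f) {}
  std::vector<float> storage;
};
```

Under this pattern, overriding a default is a matter of passing a definition to the compiler, for example `-DGEMMA_MAX_SEQ_LEN=8192`. Because the values are fixed at build time, a single binary can only carry one weight type, which is the limitation the text's planned move to runtime options would lift.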