mirror of https://github.com/google/gemma.cpp.git
whitespace cleanup
parent d37f9c3604
commit 060c8862dd
@@ -73,7 +73,7 @@ The implementation code is roughly split into 4 layers, from high to low level:

Besides these layers, supporting utilities are:

- `compression/` - model compression operations. The 8-bit switched floating
  point model conversion is here.
- `util/` - command line argument handling and any other utilities.

@@ -85,17 +85,17 @@ before finalizing PR for submission.

## Compile-Time Flags (Advanced)

There are several compile-time flags to be aware of (note these may or may not
be exposed to the build system):

- `GEMMA_WEIGHT_T` : Sets the level of compression for weights (surfaced as
  `WEIGHT_TYPE` in `CMakeLists.txt`). Currently this should be set to `SfpStream`
  (default, if no flag is specified) for 8-bit SFP, or `hwy::bfloat16_t` to
  enable higher-fidelity (but slower) bfloat16 support. This is defined in
  `gemma.h`.
- `GEMMA_MAX_SEQ_LEN` : Sets maximum sequence length to preallocate for the KV
  Cache. The default is 4096 tokens but can be overridden. This is not exposed
  through `CMakeLists.txt` yet.

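As an illustration of how flags like the two above typically take effect, here is a minimal sketch of the usual `#ifndef` default pattern. It is not taken from `gemma.h`, and the real header may be organized differently. The types `SfpStreamStandIn` and `BFloat16StandIn` and the helper `KVCacheElements` are hypothetical stand-ins introduced only so the snippet compiles on its own.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-ins for the real weight types (`SfpStream` and
// `hwy::bfloat16_t`), so that this sketch has no external dependencies.
struct SfpStreamStandIn { std::uint8_t byte; };   // 8-bit storage
struct BFloat16StandIn { std::uint16_t bits; };   // 16-bit storage

// Use the 8-bit type unless the build passes -DGEMMA_WEIGHT_T=<type>.
#ifndef GEMMA_WEIGHT_T
#define GEMMA_WEIGHT_T SfpStreamStandIn
#endif

// Use 4096 tokens unless the build passes -DGEMMA_MAX_SEQ_LEN=<n>.
#ifndef GEMMA_MAX_SEQ_LEN
#define GEMMA_MAX_SEQ_LEN 4096
#endif

using WeightT = GEMMA_WEIGHT_T;
constexpr std::size_t kMaxSeqLen = GEMMA_MAX_SEQ_LEN;

// Hypothetical helper: element count to preallocate for a KV cache that
// stores one key vector and one value vector per token and per layer.
constexpr std::size_t KVCacheElements(std::size_t layers, std::size_t kv_dim) {
  return kMaxSeqLen * layers * kv_dim * 2;  // factor 2: keys and values
}
```

With this pattern, compiling with a definition such as `-DGEMMA_MAX_SEQ_LEN=8192` on the compiler command line replaces the default without touching the source, which is why a flag can work even when the build system does not surface it.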
In the medium term both of these will likely be deprecated in favor of handling
options at runtime - allowing for multiple weight compression schemes in a single