mirror of https://github.com/google/gemma.cpp.git
Merge pull request #74 from osanseviero:patch-1
PiperOrigin-RevId: 612937722
commit c8b9675898
README.md: 17 changed lines
@@ -65,15 +65,26 @@ winget install --id Kitware.CMake
 winget install --id Microsoft.VisualStudio.2022.BuildTools --force --override "--passive --wait --add Microsoft.VisualStudio.Workload.VCTools;installRecommended --add Microsoft.VisualStudio.Component.VC.Llvm.Clang --add Microsoft.VisualStudio.Component.VC.Llvm.ClangToolset"
 ```
 
-### Step 1: Obtain model weights and tokenizer from Kaggle
+### Step 1: Obtain model weights and tokenizer from Kaggle or Hugging Face Hub
 
 Visit [the Gemma model page on
-Kaggle](https://www.kaggle.com/models/google/gemma) and select `Model Variations
+Kaggle](https://www.kaggle.com/models/google/gemma/frameworks/gemmaCpp) and select `Model Variations
 |> Gemma C++`. On this tab, the `Variation` dropdown includes the options below.
 Note bfloat16 weights are higher fidelity, while 8-bit switched floating point
 weights enable faster inference. In general, we recommend starting with the
 `-sfp` checkpoints.
 
+Alternatively, visit the [gemma.cpp](https://huggingface.co/models?other=gemma.cpp)
+models on the Hugging Face Hub. First go to the model repository of the model of interest
+(see recommendations below). Then, click the `Files and versions` tab and download the
+model and tokenizer files. For programmatic downloading, if you have `huggingface_hub`
+installed, you can also download by running:
+
+```
+huggingface-cli login # Just the first time
+huggingface-cli download google/gemma-2b-sfp-cpp --local-dir build/
+```
+
 2B instruction-tuned (`it`) and pre-trained (`pt`) models:
 
 | Model name | Description |
@@ -98,6 +109,8 @@ weights enable faster inference. In general, we recommend starting with the
 
 ### Step 2: Extract Files
 
+If you downloaded the models from Hugging Face, skip to step 3.
+
 After filling out the consent form, the download should proceed to retrieve a
 tar archive file `archive.tar.gz`. Extract files from `archive.tar.gz` (this can
 take a few minutes):
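
The `huggingface-cli` lines added in the first hunk can also be scripted. A minimal sketch of the same download driven from Python, assuming the `huggingface_hub` package (which provides the `snapshot_download` helper) is installed and that `google/gemma-2b-sfp-cpp` from the diff is the target repository:

```
# Sketch: fetch the whole model repo into build/, mirroring the
# `huggingface-cli download` command above (assumes huggingface_hub is installed
# and `huggingface-cli login` has already been run for gated repos).
python -c "from huggingface_hub import snapshot_download; snapshot_download(repo_id='google/gemma-2b-sfp-cpp', local_dir='build/')"
```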
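
The second hunk ends at the extraction sentence, but the step itself is a standard tar invocation. A minimal sketch, assuming a POSIX `tar` and the `archive.tar.gz` filename used above:

```
# Sketch: unpack the Kaggle archive (this can take a few minutes); the exact
# weight and tokenizer file names inside depend on the variation downloaded.
tar -xzf archive.tar.gz
```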