From 4154f5a91006bf4bec0cf9ae913dfb73249e00ae Mon Sep 17 00:00:00 2001
From: Jan Wassenberg
Date: Tue, 6 Aug 2024 01:43:49 -0700
Subject: [PATCH] Document Gemma 2 model names

PiperOrigin-RevId: 659858832
---
 README.md | 28 +++++++++++++++++-----------
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/README.md b/README.md
index 4425cc4..f988ed1 100644
--- a/README.md
+++ b/README.md
@@ -77,18 +77,21 @@ winget install --id Microsoft.VisualStudio.2022.BuildTools --force --override "-
 
 ### Step 1: Obtain model weights and tokenizer from Kaggle or Hugging Face Hub
 
-Visit [the Gemma model page on
-Kaggle](https://www.kaggle.com/models/google/gemma/frameworks/gemmaCpp) and select `Model Variations
-|> Gemma C++`. On this tab, the `Variation` dropdown includes the options below.
-Note bfloat16 weights are higher fidelity, while 8-bit switched floating point
-weights enable faster inference. In general, we recommend starting with the
-`-sfp` checkpoints.
+Visit the
+[Kaggle page for Gemma](https://www.kaggle.com/models/google/gemma/frameworks/gemmaCpp),
+or [Gemma-2](https://www.kaggle.com/models/google/gemma-2/gemmaCpp), and select
+`Model Variations |> Gemma C++`.
 
-Alternatively, visit the [gemma.cpp](https://huggingface.co/models?other=gemma.cpp)
-models on the Hugging Face Hub. First go the the model repository of the model of interest
-(see recommendations below). Then, click the `Files and versions` tab and download the
-model and tokenizer files. For programmatic downloading, if you have `huggingface_hub`
-installed, you can also download by running:
+On this tab, the `Variation` dropdown includes the options below. Note bfloat16
+weights are higher fidelity, while 8-bit switched floating point weights enable
+faster inference. In general, we recommend starting with the `-sfp` checkpoints.
+
+Alternatively, visit the
+[gemma.cpp](https://huggingface.co/models?other=gemma.cpp) models on the Hugging
+Face Hub. First go to the model repository of the model of interest (see
+recommendations below). Then, click the `Files and versions` tab and download
+the model and tokenizer files. For programmatic downloading, if you have
+`huggingface_hub` installed, you can also download by running:
 
 ```
 huggingface-cli login # Just the first time
@@ -117,6 +120,9 @@ huggingface-cli download google/gemma-2b-sfp-cpp --local-dir build/
 > **Important**: We strongly recommend starting off with the `2b-it-sfp` model to
 > get up and running.
 
+Gemma 2 models are named `gemma2-2b-it` for 2B, and `9b-it` or `27b-it` for the
+larger sizes. See the `kModelFlags` definition in `common.cc`.
+
 ### Step 2: Extract Files
 
 If you downloaded the models from Hugging Face, skip to step 3.
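The patch documents the Gemma 2 model names accepted by gemma.cpp. As a rough sketch of how a reader might use them (the names come from the README text in the patch; treat the exact flag values and download repo as assumptions to verify against the `kModelFlags` definition in `common.cc` of your gemma.cpp checkout):

```shell
# Gemma 2 model names per the README text above (verify against kModelFlags
# in common.cc of your gemma.cpp checkout):
#   gemma2-2b-it  - Gemma 2 2B, instruction-tuned
#   9b-it         - Gemma 2 9B, instruction-tuned
#   27b-it        - Gemma 2 27B, instruction-tuned
MODEL=gemma2-2b-it

# Programmatic download, as shown earlier in the README (requires the
# huggingface_hub package; repo name below is from the patch context):
#   huggingface-cli login                                              # first time only
#   huggingface-cli download google/gemma-2b-sfp-cpp --local-dir build/

echo "Pass --model ${MODEL} to the gemma binary, with --tokenizer and --weights."
```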