Its FFN size is 5460, which is not convenient. The offending tensors are kept in F16, which makes the final model 5.01 bpw.

| | | |
|---|---|---|
| __init__.py | ||
| constants.py | ||
| gguf.py | ||
| gguf_reader.py | ||
| gguf_writer.py | ||
| lazy.py | ||
| py.typed | ||
| quants.py | ||
| tensor_mapping.py | ||
| vocab.py | ||
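
One plausible reading of "not convenient" in the note about the FFN size is that 5460 is not a whole multiple of the block size a block-quantized format works in, so those tensors cannot be split into complete blocks and are left in F16. The sketch below only checks that arithmetic. The block sizes of 32 and 256 elements are assumptions chosen for illustration (the text does not name them), and the helper names are hypothetical, not part of this package.

```python
# Assumed block sizes (elements per block); the source text does not state them.
CANDIDATE_BLOCK_SIZES = (32, 256)


def leftover_elements(row_size: int, block_size: int) -> int:
    """Number of elements left over after filling whole blocks."""
    return row_size % block_size


def fits_block_size(row_size: int, block_size: int) -> bool:
    """True if a row of `row_size` elements splits into whole blocks."""
    return leftover_elements(row_size, block_size) == 0


if __name__ == "__main__":
    ffn_size = 5460  # value quoted in the text
    for block_size in CANDIDATE_BLOCK_SIZES:
        print(
            f"block size {block_size}: "
            f"fits={fits_block_size(ffn_size, block_size)}, "
            f"leftover={leftover_elements(ffn_size, block_size)}"
        )
```

Under these assumed block sizes, 5460 leaves a remainder in both cases (20 for 32, 84 for 256), which is consistent with the tensors being kept unquantized.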