* mimov2: convert ok * rename mimov2 --> mimo2 * fix conversion * runnable not incorrect * use sink * add_sliding_window_pattern * add swa and per-layer n_head_kv * correct params * somewhat working * correct gating func * nits * mimo2: wire RMS eps + MoE bias + converter guards * add co-author Co-authored-by: Aaryan-Kapoor <Aaryan-Kapoor@users.noreply.github.com> * use add_rope_freq_base_swa --------- Co-authored-by: Aaryan Kapoor <aaryankapoor2006@gmail.com> Co-authored-by: Aaryan-Kapoor <Aaryan-Kapoor@users.noreply.github.com> |
||
|---|---|---|
| .. | ||
| scripts | ||
| __init__.py | ||
| constants.py | ||
| gguf.py | ||
| gguf_reader.py | ||
| gguf_writer.py | ||
| lazy.py | ||
| metadata.py | ||
| py.typed | ||
| quants.py | ||
| tensor_mapping.py | ||
| utility.py | ||
| vocab.py | ||