convert : use n_groups instead of hardcoded values in reshape (#18929)

* convert : use n_groups instead of hardcoded values in reshape

This commit modifies the conversion script for NemotronHModel to use
the 'n_groups' hyperparameter, and allow Python to calculate the the
last dimension, using -1, when reshaping the 'mixer.norm.weight' tensor.

* use self.n_group instead of self.hparams["n_groups"]
This commit is contained in:
Daniel Bevenius 2026-01-20 06:55:24 +01:00 committed by GitHub
parent 6df686bee6
commit 7dee9ff59a
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
1 changed files with 1 additions and 1 deletions

View File

@ -9212,7 +9212,7 @@ class NemotronHModel(GraniteHybridModel):
return [(mapped_name, reshaped_data)]
if name.endswith("mixer.norm.weight"):
reshaped_data = data_torch.reshape(8, 512)
reshaped_data = data_torch.reshape(self.n_group, -1)
mapped_name = self.map_tensor_name(name)
return [(mapped_name, reshaped_data)]