Skip to content

Conversation

@Aznix07
Copy link

@Aznix07 Aznix07 commented Nov 4, 2025

What does this PR do?

Fixes a KeyError: 'rope_parameters_factor' in the GPT-OSS weight conversion script that was preventing users from converting GPT-OSS model weights to HuggingFace format.

Fixes #42003

Problem

After PR #39847 standardized RoPE parameter handling across models, the GPT-OSS conversion script was still trying to access individual parameter keys that no longer exist in the original config:

# Old code - these keys don't exist anymore
"beta_fast": float(original_config.pop("rope_ntk_beta")), 
"beta_slow": float(original_config.pop("rope_ntk_alpha")),
"factor": float(original_config.pop("rope_parameters_factor")),

Reproduction

huggingface-cli download openai/gpt-oss-20b
python src/transformers/models/gpt_oss/convert_gpt_oss_weights_to_hf.py \
    --input_dir path/to/original \
    --output_dir ./output

# KeyError: 'rope_parameters_factor'

Solution

Added proper handling for both config formats:

# Handle both old and new config formats for rope_parameters
    if "rope_parameters" in original_config:
        # New format: rope_parameters already exists as a dict
        rope_parameters = original_config.pop("rope_parameters")
        # Ensure rope_type is set
        if "rope_type" not in rope_parameters:
            rope_parameters["rope_type"] = "yarn"
    else:
        # Old format: construct rope_parameters from individual keys
        # Fallback to default values if rop_parameters is missing
        rope_parameters = {
            # "beta_fast": float(original_config.pop("rope_ntk_beta")),
            # "beta_slow": float(original_config.pop("rope_ntk_alpha")),
            # "factor": float(original_config.pop("rope_parameters_factor")),
            "rope_type": "yarn",
            "truncate": False,
            "original_max_position_embeddings": 4096,
            "beta_fast": 32.0,
            "beta_slow": 1.0,
            "factor": 32.0,
        }

Testing

✅ Verified conversion script no longer throws KeyError
✅ Tested fallback logic with both config formats
✅ Default values match configuration_gpt_oss.py

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the [contributor guideline, Pull Request section?
  • Was this discussed/approved via a Github issue or the [forum]? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? (N/A - bug fix in conversion script)
  • Did you write any new necessary tests? (Added manual verification test)

Who can review?

@zucchini-nlp

@github-actions
Copy link
Contributor

github-actions bot commented Nov 4, 2025

[For maintainers] Suggested jobs to run (before merge)

run-slow: gpt_oss

Copy link
Member

@zucchini-nlp zucchini-nlp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am quite lost here since I didn't work with GPT-OSS. To make sure, the original_config is an HF config object iiuc and we need to make sure it is Yarn if no params are saved in config?

Comment on lines +172 to +174
# "beta_fast": float(original_config.pop("rope_ntk_beta")),
# "beta_slow": float(original_config.pop("rope_ntk_alpha")),
# "factor": float(original_config.pop("rope_parameters_factor")),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we hardcode the values safely? I think it was better to obtain from the config

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

KeyError: 'rope_parameters_factor' when using convert_gpt_oss_weights_to_hf.py for openai/gpt-oss-20b

2 participants