BUG: fix _fix_chat_template for ChatML templates missing add_generation_prompt (#4150) #4426

Open

kimimgo wants to merge 3 commits into unslothai:main from kimimgo:fix/chat-template-chatml-4150

Conversation

kimimgo commented Mar 18, 2026

Fixes #4150

Problem

ChatML-style templates (Hermes-3, Magnum-v2, etc.) saved after LoRA training may end
with {% endfor %} and have no {% if add_generation_prompt %} block. This causes
fix_chat_template() to raise:

RuntimeError: Unsloth: The tokenizer `...` does not have a
{% if add_generation_prompt %} for generation purposes.

Root Cause

_fix_chat_template() only patches templates where content ({{ ... }}) follows
{% endfor %}. When the template simply ends after {% endfor %} (empty suffix),
the function returns the template unchanged, and the caller raises RuntimeError.

Fix

Add an elif branch in _fix_chat_template() that detects ChatML templates
(containing <|im_start|>) with an empty suffix after endfor, and appends
the standard ChatML generation prompt block:

{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}

This is a general fix (not model-name-based bypass) that handles any ChatML-style
template missing the generation prompt block.
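The effect of the appended block can be sanity-checked with a plain Jinja2 render. This is a minimal sketch, assuming `jinja2` is installed; the template below is a simplified stand-in for a real Hermes-style ChatML template, not the exact shipped one:

```python
from jinja2 import Template

# A ChatML-style template that simply ends after {% endfor %} -- the case
# the fix targets. Without the patch, add_generation_prompt has no effect.
broken = (
    "{% for message in messages %}"
    "{{ '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>\n' }}"
    "{% endfor %}"
)

# Appending the standard ChatML generation prompt block (as the fix does)
# makes generation-time prompts end with the assistant header.
fixed = broken + "{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}"

msgs = [{"role": "user", "content": "Hi"}]
out = Template(fixed).render(messages=msgs, add_generation_prompt=True)
print(out.endswith("<|im_start|>assistant\n"))  # True
```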

…on_prompt (unslothai#4150)

ChatML-style templates (Hermes, Magnum, etc.) saved after LoRA
training may end with {% endfor %} and have no
{% if add_generation_prompt %} block. The existing _fix_chat_template
only handled the case where content ({{ ... }}) followed endfor,
not the case where the template simply ended.

Add an elif branch that detects ChatML templates (containing
<|im_start|>) with an empty suffix after endfor, and appends the
standard ChatML generation prompt block.

Fixes unslothai#4150
@gemini-code-assist (Contributor)

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical runtime error in the tokenizer utility that prevented certain ChatML-style templates from being correctly processed after LoRA training. By enhancing the template fixing logic, it ensures that all valid ChatML templates, even those missing the generation prompt block, are properly formatted, thereby improving the robustness and compatibility of the tokenizer.

Highlights

  • Bug Fix: Resolved a RuntimeError that occurred when _fix_chat_template() processed ChatML-style templates (e.g., Hermes-3, Magnum-v2) that ended with {% endfor %} but lacked an {% if add_generation_prompt %} block.
  • Root Cause Addressed: The previous implementation of _fix_chat_template() failed to patch templates where content (like {{ ... }}) did not follow {% endfor %} and the suffix was empty, leading to the template being returned unchanged and subsequent errors.
  • Template Patching Logic: Introduced an elif branch in _fix_chat_template() to specifically detect ChatML templates (identified by <|im_start|>) with an empty suffix after endfor. For these cases, the standard ChatML generation prompt block {% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %} is now appended.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request addresses a bug in _fix_chat_template() that caused RuntimeError for ChatML-style templates missing the add_generation_prompt block. The fix adds an elif branch to handle templates with an empty suffix after endfor, appending the standard ChatML generation prompt block. The review focuses on the correctness and maintainability of the added code, ensuring it effectively resolves the issue and integrates well with the existing logic.

Comment on lines +675 to 686

```python
elif after_endfor.strip() == "" and "<|im_start|>" in chat_template:
    # GH#4150: ChatML-style templates (Hermes, etc.) that end with
    # {% endfor %} and have no add_generation_prompt block.
    # Append the standard ChatML generation prompt.
    generation_block = (
        "{%" + dash + " if add_generation_prompt %}"
        "{{ '<|im_start|>assistant\n' }}"
        "{%" + dash + " endif %}"
    )
    chat_template = chat_template[: where + len(chosen_end)] + generation_block

return chat_template
```
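For context, the branch above can be read as part of a small standalone sketch. This is hypothetical: the names `where`, `chosen_end`, `dash`, and `after_endfor` mirror the diff, but the surrounding control flow here is reconstructed for illustration, not the actual unsloth source:

```python
import re

def append_chatml_generation_block(chat_template: str) -> str:
    # Find the last {% endfor %} (optionally with whitespace-control dashes).
    last = None
    for m in re.finditer(r"\{%-?\s*endfor\s*-?%\}", chat_template):
        last = m
    if last is None:
        return chat_template
    where, chosen_end = last.start(), last.group(0)
    dash = "-" if chosen_end.startswith("{%-") else ""
    after_endfor = chat_template[where + len(chosen_end):]
    # The case from the PR: nothing after endfor, and a ChatML marker present.
    if after_endfor.strip() == "" and "<|im_start|>" in chat_template:
        generation_block = (
            "{%" + dash + " if add_generation_prompt %}"
            "{{ '<|im_start|>assistant\n' }}"
            "{%" + dash + " endif %}"
        )
        chat_template = chat_template[: where + len(chosen_end)] + generation_block
    return chat_template
```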

Severity: medium

This elif block appends a generation prompt to ChatML templates that have an empty suffix after endfor. It's important to ensure that this addition doesn't inadvertently affect other types of templates or introduce unintended side effects. Consider adding a more specific check to ensure this logic only applies to the intended ChatML templates to avoid potential issues in the future.

It is also important to ensure that the appended generation_block is correctly formatted and compatible with all ChatML-style templates.

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 48119ee39c

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +679 to +681

```python
    generation_block = (
        "{%" + dash + " if add_generation_prompt %}"
        "{{ '<|im_start|>assistant\n' }}"
```

P1: Don't hard-code <|im_start|>assistant\n for all ChatML fixes

This new branch assumes every <|im_start|> template should generate with '<|im_start|>assistant\n', but we already ship supported variants that use a different assistant prefix. For example, unsloth/chat_templates.py:614-626 defines Phi-4 with '<|im_start|>assistant<|im_sep|>'; if a saved Phi-4 tokenizer loses its add_generation_prompt block and hits _fix_chat_template(), this patch rewrites it to the wrong protocol and produces malformed prompts at inference time. The fix needs to preserve the model-specific assistant prefix instead of treating all ChatML-style templates as Hermes-style newline prompts.


…t prefix

Address review feedback: Phi-4 uses <|im_sep|> instead of newline
after role. Extract the separator from the template pattern
(message['role'] + 'SEPARATOR') to generate the correct
assistant prefix for each ChatML variant.
kimimgo force-pushed the fix/chat-template-chatml-4150 branch from 48119ee to 56764e6 on March 19, 2026 at 10:50
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e84d889eef


Comment on lines +682 to +683

```python
sep_match = re.search(r"message\['role'\]\s*\+\s*'([^']*)'", chat_template)
separator = sep_match.group(1) if sep_match else "\\n"
```

P1: Handle hard-coded ChatML role branches when inferring separator

Fresh evidence: unsloth/chat_templates.py:616-625 defines phi4_template with separate system/user/assistant branches, so the new regex here never matches. In that supported case, a saved Phi-4 tokenizer that is missing its trailing add_generation_prompt block still enters this branch, falls back to "\\n", and gets patched to '<|im_start|>assistant\n' instead of the required '<|im_start|>assistant<|im_sep|>', so inference prompts remain malformed for Phi-4-style tokenizers.
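The failure mode Codex describes is easy to reproduce in isolation. The template strings below are hypothetical, chosen to mimic a combined-role ChatML variant versus a hard-coded-role (Phi-4-style) variant; they are not the exact shipped templates:

```python
import re

# The separator-inference regex from the PR's second commit.
pattern = r"message\['role'\]\s*\+\s*'([^']*)'"

# Combined-role style: role and separator are concatenated, so the regex
# can capture the separator that follows message['role'].
combined = "{{ '<|im_start|>' + message['role'] + '<|im_sep|>' + message['content'] }}"

# Hard-coded-role style: each role has its own literal branch, so the
# pattern `message['role'] + '...'` never appears and the regex fails.
branched = (
    "{% if message['role'] == 'user' %}{{ '<|im_start|>user<|im_sep|>' }}"
    "{% elif message['role'] == 'assistant' %}{{ '<|im_start|>assistant<|im_sep|>' }}"
    "{% endif %}"
)

m1 = re.search(pattern, combined)
m2 = re.search(pattern, branched)
print(m1.group(1))  # <|im_sep|>
print(m2 is None)   # True -- falls back to "\n", the wrong separator here
```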



Development

Successfully merging this pull request may close these issues.

[Bug] Hermes Lora chat test bug: does not have a {% if add_generation_prompt %} for generation purposes
