BUG: fix _fix_chat_template for ChatML templates missing add_generation_prompt (#4150) #4426
kimimgo wants to merge 3 commits into unslothai:main
Conversation
…on_prompt (unslothai#4150)

ChatML-style templates (Hermes, Magnum, etc.) saved after LoRA training may end with {% endfor %} and have no {% if add_generation_prompt %} block. The existing _fix_chat_template only handled the case where content ({{ ... }}) followed endfor, not the case where the template simply ended. Add an elif branch that detects ChatML templates (containing <|im_start|>) with an empty suffix after endfor, and appends the standard ChatML generation prompt block.

Fixes unslothai#4150
Summary of Changes (Gemini Code Assist): This pull request addresses a runtime error in the tokenizer utility that prevented certain ChatML-style templates from being correctly processed after LoRA training. By enhancing the template-fixing logic, it ensures that valid ChatML templates, even those missing the generation prompt block, are properly formatted, improving the robustness and compatibility of the tokenizer.
Code Review
This pull request addresses a bug in _fix_chat_template() that caused RuntimeError for ChatML-style templates missing the add_generation_prompt block. The fix adds an elif branch to handle templates with an empty suffix after endfor, appending the standard ChatML generation prompt block. The review focuses on the correctness and maintainability of the added code, ensuring it effectively resolves the issue and integrates well with the existing logic.
elif after_endfor.strip() == "" and "<|im_start|>" in chat_template:
    # GH#4150: ChatML-style templates (Hermes, etc.) that end with
    # {% endfor %} and have no add_generation_prompt block.
    # Append the standard ChatML generation prompt.
    generation_block = (
        "{%" + dash + " if add_generation_prompt %}"
        "{{ '<|im_start|>assistant\n' }}"
        "{%" + dash + " endif %}"
    )
    chat_template = chat_template[: where + len(chosen_end)] + generation_block

return chat_template
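To make the effect of this branch concrete, here is a hypothetical standalone sketch (the real _fix_chat_template() derives where, chosen_end, and dash from earlier pattern matching; the function name and simplifications below are assumptions) applied to a Hermes-style template that ends right after its message loop:

```python
# Hypothetical standalone sketch of the elif branch above; the real
# _fix_chat_template() tracks `where`, `chosen_end`, and `dash` from
# earlier matching, which is simplified here.
def append_generation_prompt(chat_template: str, dash: str = "") -> str:
    chosen_end = "{% endfor %}"
    where = chat_template.rfind(chosen_end)
    if where == -1:
        return chat_template
    after_endfor = chat_template[where + len(chosen_end):]
    if after_endfor.strip() == "" and "<|im_start|>" in chat_template:
        generation_block = (
            "{%" + dash + " if add_generation_prompt %}"
            "{{ '<|im_start|>assistant\n' }}"
            "{%" + dash + " endif %}"
        )
        return chat_template[: where + len(chosen_end)] + generation_block
    return chat_template

# A Hermes-style template that ends right after the loop:
template = (
    "{% for message in messages %}"
    "{{ '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>\n' }}"
    "{% endfor %}"
)
fixed = append_generation_prompt(template)
```

A template whose suffix after {% endfor %} is non-empty falls through unchanged, matching the existing behavior.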
This elif block appends a generation prompt to ChatML templates that have an empty suffix after endfor. It's important to ensure that this addition doesn't inadvertently affect other types of templates or introduce unintended side effects. Consider adding a more specific check to ensure this logic only applies to the intended ChatML templates to avoid potential issues in the future.
It is also important to ensure that the appended generation_block is correctly formatted and compatible with all ChatML-style templates.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 48119ee39c
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you:
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
unsloth/tokenizer_utils.py
Outdated
generation_block = (
    "{%" + dash + " if add_generation_prompt %}"
    "{{ '<|im_start|>assistant\n' }}"
Don't hard-code <|im_start|>assistant\n for all ChatML fixes
This new branch assumes every <|im_start|> template should generate with '<|im_start|>assistant\n', but we already ship supported variants that use a different assistant prefix. For example, unsloth/chat_templates.py:614-626 defines Phi-4 with '<|im_start|>assistant<|im_sep|>'; if a saved Phi-4 tokenizer loses its add_generation_prompt block and hits _fix_chat_template(), this patch rewrites it to the wrong protocol and produces malformed prompts at inference time. The fix needs to preserve the model-specific assistant prefix instead of treating all ChatML-style templates as Hermes-style newline prompts.
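The mismatch can be illustrated with a small sketch (the Phi-4 prefix is the one the comment quotes from unsloth/chat_templates.py; the template string and variable names here are simplified hypotheticals):

```python
# A simplified Phi-4-style ChatML template: role and content are joined
# with <|im_sep|>, not a newline.
phi4_like_template = (
    "{% for message in messages %}"
    "{{ '<|im_start|>' + message['role'] + '<|im_sep|>' + message['content'] + '<|im_end|>' }}"
    "{% endfor %}"
)

# The generation block the original patch would append, hard-coding the
# Hermes-style newline prefix.
hard_coded_block = "{{ '<|im_start|>assistant\n' }}"

# The prefix this template actually needs for the assistant turn:
required_prefix = "<|im_start|>assistant<|im_sep|>"

# The hard-coded block lacks the variant's required prefix, so the
# patched template would emit a malformed prompt at inference time.
mismatch = required_prefix not in hard_coded_block
```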
…t prefix

Address review feedback: Phi-4 uses <|im_sep|> instead of a newline after the role. Extract the separator from the template pattern (message['role'] + 'SEPARATOR') to generate the correct assistant prefix for each ChatML variant.
Force-pushed 48119ee to 56764e6 (Compare)
for more information, see https://pre-commit.ci
💡 Codex Review
Reviewed commit: e84d889eef
sep_match = re.search(r"message\['role'\]\s*\+\s*'([^']*)'", chat_template)
separator = sep_match.group(1) if sep_match else "\\n"
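A quick sketch of how this separator inference behaves (the regex is the one from the diff; the template strings and helper name are hypothetical). It matches Hermes-style templates that build each turn as message['role'] + '<sep>', but not templates with hard-coded per-role literals, where it silently falls back:

```python
import re

def infer_separator(chat_template: str) -> str:
    # Regex from the diff: capture the literal appended right after
    # message['role'] (e.g. '\n' for Hermes-style templates).
    sep_match = re.search(r"message\['role'\]\s*\+\s*'([^']*)'", chat_template)
    return sep_match.group(1) if sep_match else "\\n"

# Hermes-style: the role is concatenated with a separator, so the regex
# captures the real newline.
hermes = "{{ '<|im_start|>' + message['role'] + '\n' + message['content'] }}"

# Phi-4-style per the review: the assistant branch is a hard-coded
# literal, so there is nothing for the regex to match and the fallback
# (a literal backslash-n) is used, which is wrong for this variant.
phi4 = "{{ '<|im_start|>assistant<|im_sep|>' + message['content'] }}"

sep_hermes = infer_separator(hermes)
sep_phi4 = infer_separator(phi4)
```

This is exactly the gap the comment below describes: on a Phi-4-style template the fallback wins and the wrong prefix is generated.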
Handle hard-coded ChatML role branches when inferring separator
Fresh evidence: unsloth/chat_templates.py:616-625 defines phi4_template with separate system/user/assistant branches, so the new regex here never matches. In that supported case, a saved Phi-4 tokenizer that is missing its trailing add_generation_prompt block still enters this branch, falls back to "\\n", and gets patched to '<|im_start|>assistant\n' instead of the required '<|im_start|>assistant<|im_sep|>', so inference prompts remain malformed for Phi-4-style tokenizers.
Fixes #4150

Problem

ChatML-style templates (Hermes-3, Magnum-v2, etc.) saved after LoRA training may end with {% endfor %} and have no {% if add_generation_prompt %} block. This causes fix_chat_template() to raise a RuntimeError.

Root Cause

_fix_chat_template() only patches templates where content ({{ ... }}) follows {% endfor %}. When the template simply ends after {% endfor %} (empty suffix), the function returns the template unchanged, and the caller raises the RuntimeError.

Fix

Add an elif branch in _fix_chat_template() that detects ChatML templates (containing <|im_start|>) with an empty suffix after endfor, and appends the standard ChatML generation prompt block.

This is a general fix (not a model-name-based bypass) that handles any ChatML-style template missing the generation prompt block.
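Combining the two pieces of the PR, here is a hypothetical standalone sketch of the final separator-aware patch (simplified; the real code lives in unsloth/tokenizer_utils.py and carries more state, and the function name below is an assumption):

```python
import re

def build_generation_block(chat_template: str, dash: str = "") -> str:
    # Infer this ChatML variant's role separator ('\n' for Hermes-style
    # templates, '<|im_sep|>' when the template builds turns that way);
    # fall back to a newline if no pattern is found.
    sep_match = re.search(r"message\['role'\]\s*\+\s*'([^']*)'", chat_template)
    separator = sep_match.group(1) if sep_match else "\n"
    return (
        "{%" + dash + " if add_generation_prompt %}"
        "{{ '<|im_start|>assistant" + separator + "' }}"
        "{%" + dash + " endif %}"
    )

hermes_template = (
    "{% for message in messages %}"
    "{{ '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>\n' }}"
    "{% endfor %}"
)
block = build_generation_block(hermes_template)
```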