
[Codex][CLI] Gate image inputs by model modalities #10271

Merged
ccy-oai merged 8 commits into main from ccy/model-input-modalities-config
Feb 3, 2026

Conversation

Collaborator

@ccy-oai ccy-oai commented Jan 30, 2026

Summary
  • Add input_modalities to model metadata so clients can determine supported input types.
  • Gate image paste/attach in TUI when the selected model does not support images.
  • Block submits that include images for unsupported models and show a clear warning.
  • Propagate modality metadata through app-server protocol/model-list responses.
  • Update related tests/fixtures.
Rationale
  • Models support different input modalities.
  • Clients need an explicit capability signal to prevent unsupported requests.
  • Backward-compatible defaults preserve existing behavior when modality metadata is absent.
Scope
  • codex-rs/protocol, codex-rs/core, codex-rs/tui
  • codex-rs/app-server-protocol, codex-rs/app-server
  • Generated app-server types / schema fixtures
Trade-offs
  • Default behavior assumes text + image when field is absent for compatibility.
  • Server-side validation remains the source of truth.
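The compatibility default described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `effective_input_modalities` is a hypothetical helper name, and the `Option` parameter stands in for a catalog entry that may or may not carry `input_modalities`.

```rust
/// Input types a model can accept (the two values discussed in this PR).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InputModality {
    Text,
    Image,
}

/// Hypothetical helper (not from the PR): resolve the modalities to use for a
/// model entry. `declared` is `None` when the entry omits `input_modalities`.
pub fn effective_input_modalities(declared: Option<Vec<InputModality>>) -> Vec<InputModality> {
    // An absent field falls back to text + image, which matches the behavior
    // clients had before the field existed.
    declared.unwrap_or_else(|| vec![InputModality::Text, InputModality::Image])
}
```

Under this sketch, a text-only model has to declare `[Text]` explicitly; leaving the field out keeps image input enabled.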
Follow-up
  • Non-TUI clients should consume input_modalities to disable unsupported attachments.
  • Model catalogs should explicitly set input_modalities for text-only models.
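For a non-TUI client, consuming `input_modalities` might look something like the sketch below. The names `SubmitCheck` and `check_submit` are assumptions made for illustration and do not appear in this PR; the server would still validate the request either way.

```rust
/// Input types a model can accept (the two values discussed in this PR).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InputModality {
    Text,
    Image,
}

/// Hypothetical result of a client-side pre-submit check.
#[derive(Debug, PartialEq, Eq)]
pub enum SubmitCheck {
    /// The draft can be sent as-is.
    Allowed,
    /// The draft carries images but the selected model does not accept them.
    BlockedUnsupportedImages { image_count: usize },
}

/// Hypothetical helper: decide whether a draft with `image_count` attached
/// images may be submitted to a model advertising `input_modalities`.
pub fn check_submit(input_modalities: &[InputModality], image_count: usize) -> SubmitCheck {
    let supports_images = input_modalities.contains(&InputModality::Image);
    if image_count > 0 && !supports_images {
        SubmitCheck::BlockedUnsupportedImages { image_count }
    } else {
        SubmitCheck::Allowed
    }
}
```

A client could use the same `supports_images` condition to disable its attach button, and reserve the blocked result for showing a warning on submit.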
Testing
  • cargo fmt --all
  • cargo test -p codex-tui
  • env -u GITHUB_APP_KEY cargo test -p codex-core --lib
  • just write-app-server-schema
  • cargo run -p codex-cli --bin codex -- app-server generate-ts --out app-server-types
  • test against local backend

@etraut-openai etraut-openai added the oai PRs contributed by OpenAI employees label Jan 30, 2026
@wiltzius-openai wiltzius-openai left a comment

lgtm but someone with more client familiarity should take a look too

Collaborator Author

ccy-oai commented Feb 2, 2026

Going to merge in an hour or so to keep things moving and unblock app/extension.
@aibrahim-oai @pakrym

@ccy-oai ccy-oai marked this pull request as ready for review February 2, 2026 18:11
Contributor

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 82210d2a10


Comment on lines 3164 to 3167
if !local_images.is_empty() && !self.current_model_supports_images() {
    let local_image_paths = local_images.iter().map(|img| img.path.clone()).collect();
    self.bottom_pane
        .set_composer_text(text, text_elements, local_image_paths);
Contributor

[P2] Preserve mention paths when blocking image submits

When a submit is blocked for unsupported images, this path restores the draft via set_composer_text, which clears mention_paths in the composer (set_text_content resets self.mention_paths). If the draft includes tool/skill mentions (e.g., $file inserted via the mention popup), those linked paths are lost; after the user removes images and re‑submits, collect_tool_mentions will see the mention names but no linked paths, so the tool references silently drop. This only occurs when the model rejects images, so the regression is tied to this new gating behavior.


Collaborator

Clarifying “mention linkage”: when a user picks an item from the $-mention popup, we store both (1) the visible token in the draft text (for example $foo) and (2) an internal mapping mention_paths[name] -> canonical target (skill file path or app://... connector id).

In this blocked-image path we restore the draft via set_composer_text, which currently clears mention_paths. That means the user still sees $foo after the warning, but we’ve lost the canonical target they selected.

On retry (after removing images), mention parsing falls back to name-only heuristics. Unambiguous cases may still work, but ambiguous/colliding names can resolve to the wrong target or be dropped. So this is user-visible as “I picked a mention, but after the image warning + retry it didn’t reference what I selected.”

Also, this affects both skill mentions and connector/app mentions (not just skills), because both rely on the same linkage map.
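The linkage problem described above can be modeled with a small sketch. The `Composer` type and its methods here are hypothetical stand-ins rather than the actual TUI composer API; the assumption is only that replacing the draft text resets the mention map, so a blocked-submit restore has to put the map back afterwards.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for the composer state.
#[derive(Debug, Default)]
pub struct Composer {
    text: String,
    /// Mention name (without the `$`) -> canonical target, such as a skill
    /// file path or an `app://...` connector id.
    mention_paths: HashMap<String, String>,
}

impl Composer {
    /// Models the behavior described above: replacing the text clears the
    /// mention linkage.
    pub fn set_text(&mut self, text: &str) {
        self.text = text.to_string();
        self.mention_paths.clear();
    }

    /// Restore a draft after a blocked submit: set the text first, then
    /// re-attach the linkage captured before the submit attempt.
    pub fn restore_blocked_draft(&mut self, text: &str, mention_paths: HashMap<String, String>) {
        self.set_text(text);
        self.mention_paths = mention_paths;
    }

    pub fn text(&self) -> &str {
        &self.text
    }

    /// Canonical target the user picked for `name`, if the linkage survived.
    pub fn mention_target(&self, name: &str) -> Option<&str> {
        self.mention_paths.get(name).map(String::as_str)
    }
}
```

With this shape, a retry after removing the images can still resolve `$foo` to the target the user originally selected instead of falling back to name-only matching.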

@joshka-oai
Collaborator

Added some docs in "docs: clarify model input modality contracts"

@joshka-oai joshka-oai force-pushed the ccy/model-input-modalities-config branch from ee38e70 to addb4a4 Compare February 2, 2026 19:54
/// failures do not hard-block user input in the UI.
fn current_model_supports_images(&self) -> bool {
    let model = self.current_model();
    self.models_manager
Collaborator

I think this info should be on ModelPreset for the TUI and on the model info that app-server sends to the client.

#[strum(serialize_all = "lowercase")]
pub enum InputModality {
    /// Plain text turns and tool payloads.
    #[default]
Collaborator

default shouldn't be text. Let's not have a default

Collaborator Author

should be Image+Text as default unless model_cache says differently

pub fn default_input_modalities() -> Vec<InputModality> {
    vec![InputModality::Text, InputModality::Image]
}

Collaborator

yes

Collaborator

@aibrahim-oai aibrahim-oai left a comment

Let's not have a default and always have both text and image as values. Also, add the information to ModelPreset so we don't need to list models again.

Collaborator Author

ccy-oai commented Feb 2, 2026

Let's not have a default and always have the two text and image as values. Also, add the information on ModelPreset so we don't need to list models again

Could we get there in two stages? I'd like to avoid opening up surface area for backward-compatibility regressions this close to a release cycle. How does this sound?

  1. Ship what we have now: default to [Image, Text]. This maintains backward compatibility with older clients and is equivalent to the present state.
  2. After the launches, make sure the Statsig dynamic configs all backfill [Image, Text], and then remove this default if needed for a cleaner config.

@ccy-oai ccy-oai force-pushed the ccy/model-input-modalities-config branch from baafcf0 to 921ee7c Compare February 3, 2026 01:46
ccy-oai and others added 8 commits February 2, 2026 17:47
Document protocol compatibility defaults for input_modalities.
Clarify TUI image attach/paste gating invariants and warnings.
Add model metadata update guidance for contributors.
Restore mention_paths when submit is blocked for unsupported images so
$-mentions keep their canonical targets after users remove images and
retry.

Add a chatwidget regression test covering blocked-image draft restore
with image placeholders and mention links.
@ccy-oai ccy-oai force-pushed the ccy/model-input-modalities-config branch from 921ee7c to 431331e Compare February 3, 2026 01:51
@ccy-oai ccy-oai merged commit 7e07ec8 into main Feb 3, 2026
32 checks passed
@ccy-oai ccy-oai deleted the ccy/model-input-modalities-config branch February 3, 2026 02:56
@github-actions github-actions bot locked and limited conversation to collaborators Feb 3, 2026

Labels

oai PRs contributed by OpenAI employees


5 participants