ollama: default to Responses API for built-ins #8798
etraut-openai merged 8 commits into openai:main
Conversation
This is an alternative PR solving the same problem as <openai#8227>. In this PR, when Ollama is used via `--oss` (or via `model_provider = "ollama"`), we default it to the Responses format. At runtime we do an Ollama version check, and if the version is older than the one where Responses support was added to Ollama, we print a warning. Because there's no way to configure the wire API for a built-in provider, we temporarily add a new `oss_provider`/`model_provider` called `"ollama-chat"` that forces the chat format. Once the `"chat"` format is fully removed (see <openai#7782>), `ollama-chat` can be removed as well.
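A rough, self-contained sketch of the behavior described above, assuming simplified names: `WireApi`, `choose_wire_api`, and the version threshold are illustrative stand-ins, not identifiers or values from this PR.

```rust
// Illustrative sketch only; the real PR wires this into Codex's provider
// configuration. All names and the version threshold below are assumptions.

#[derive(Debug, PartialEq)]
enum WireApi {
    Responses,
    Chat,
}

/// Placeholder for the first Ollama version assumed to support the Responses API.
const MIN_RESPONSES_VERSION: (u32, u32, u32) = (0, 5, 0);

fn parse_version(v: &str) -> Option<(u32, u32, u32)> {
    let mut parts = v.trim().splitn(3, '.').map(|p| p.parse::<u32>().ok());
    Some((parts.next()??, parts.next()??, parts.next()??))
}

/// Default to the Responses wire API, warning when the detected Ollama
/// version predates Responses support; `ollama-chat` forces the chat format.
fn choose_wire_api(provider: &str, detected_version: &str) -> WireApi {
    if provider == "ollama-chat" {
        return WireApi::Chat;
    }
    if let Some(v) = parse_version(detected_version) {
        if v < MIN_RESPONSES_VERSION {
            eprintln!(
                "warning: Ollama {detected_version} predates Responses support; \
                 upgrade Ollama or use the temporary `ollama-chat` provider"
            );
        }
    }
    WireApi::Responses
}

fn main() {
    assert_eq!(choose_wire_api("ollama", "0.4.2"), WireApi::Responses); // warns
    assert_eq!(choose_wire_api("ollama", "0.6.0"), WireApi::Responses); // no warning
    assert_eq!(choose_wire_api("ollama-chat", "0.6.0"), WireApi::Chat);
}
```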
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 188273a3bb
```rust
/// Returns a deprecation notice if Ollama doesn't support the responses wire API.
pub async fn ollama_chat_deprecation_notice(
    config: &Config,
) -> io::Result<Option<DeprecationNoticeEvent>> {
```
This function returns `Result`, but it always returns the `Ok` variant, never `Err`. If `detect_wire_api` returns `Err`, should that flow through? If only `Ok` is possible, the `Result` wrapper seems unnecessary.
Just pushed a new commit that makes it flow through and logs a warning. I don't think it should propagate as far as preventing startup (because the call itself is just a best-effort explanation for users on older versions).
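A minimal sketch of the shape being described, propagating the probe error with `?` and only logging it at the call site. The real function is async and takes a `&Config`; the helper body and field names below are simplified assumptions, not the PR's actual code.

```rust
use std::io;

// Simplified stand-ins for the PR's types; only the error-handling shape matters.
struct DeprecationNoticeEvent {
    message: String,
}

/// Pretend probe; the real code would query the local Ollama server's version.
fn detect_wire_api() -> io::Result<bool> {
    Ok(true)
}

/// Propagate probe errors to the caller with `?` instead of swallowing them here.
fn ollama_chat_deprecation_notice() -> io::Result<Option<DeprecationNoticeEvent>> {
    let supports_responses = detect_wire_api()?;
    Ok(if supports_responses {
        None
    } else {
        Some(DeprecationNoticeEvent {
            message: "this Ollama version predates Responses support".to_string(),
        })
    })
}

fn main() {
    // At the call site the error is only logged: the notice is best-effort,
    // so a failed version check should not prevent startup.
    match ollama_chat_deprecation_notice() {
        Ok(Some(notice)) => eprintln!("deprecation: {}", notice.message),
        Ok(None) => {}
        Err(err) => eprintln!("warning: could not check Ollama wire API support: {err}"),
    }
}
```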
@drifkin, there are build failures that need to be addressed.

@drifkin, sorry, it looks like the build failures were due to a bad commit. I've updated your branch with the fix.

Merged main to resolve conflicts.
bolinfest left a comment:
Nice job: thanks for your patience on this!
I think our Bazel builds are not set up to run for external contributors right now. (PSA @etraut-openai, @zbarsky-openai and I are looking into it...)