Conversation

@bolinfest
Collaborator

@bolinfest bolinfest commented Dec 10, 2025

When I originally introduced accept_elicitation_for_prompt_rule() in #7617, it worked for me locally because I had run codex-rs/exec-server/tests/suite/bash once myself, which had the side-effect of installing the corresponding DotSlash artifact.

In CI, I added explicit logic to do this in .github/workflows/rust-ci.yml, which meant the test also passed there. That logic belonged in the test itself, so that it would also work locally for devs who had not yet installed the DotSlash artifact for codex-rs/exec-server/tests/suite/bash. This PR updates the test to do this (and deletes the setup logic from rust-ci.yml), creating a new DOTSLASH_CACHE in a temp directory so that each test handles this independently.
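For readers who want to see the shape of the per-test cache setup, here is a minimal sketch, not the code from this PR. It assumes `dotslash` is on `PATH` and that `dotslash -- fetch <file>` is the fetch invocation. The helper names `unique_cache_dir` and `dotslash_fetch_command` are hypothetical, and the sketch uses only the standard library, whereas a real test would more likely use a temp-directory crate.

```rust
use std::path::{Path, PathBuf};
use std::process::Command;
use std::sync::atomic::{AtomicU32, Ordering};

// Counter that keeps directory names distinct within one test process.
static NEXT_ID: AtomicU32 = AtomicU32::new(0);

/// Hypothetical helper: returns a cache path that no other test in this
/// process will receive. The directory itself is not created here.
fn unique_cache_dir() -> PathBuf {
    let id = NEXT_ID.fetch_add(1, Ordering::Relaxed);
    std::env::temp_dir().join(format!("dotslash-cache-{}-{}", std::process::id(), id))
}

/// Hypothetical helper: builds (but does not run) a command that asks
/// DotSlash to fetch the artifact for `dotslash_file` into `cache_dir`.
/// Assumes the `dotslash -- fetch <file>` invocation.
fn dotslash_fetch_command(dotslash_file: &Path, cache_dir: &Path) -> Command {
    let mut cmd = Command::new("dotslash");
    cmd.arg("--").arg("fetch").arg(dotslash_file);
    // Point DotSlash at the per-test cache instead of the user's default cache.
    cmd.env("DOTSLASH_CACHE", cache_dir);
    cmd
}
```

A test would call `dotslash_fetch_command(...).status()` before launching the server and pass the same DOTSLASH_CACHE value to any child process that needs to resolve the artifact.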

While here, I also added a check to ensure that the codex binary has been built before the test runs, because it must be symlinked as codex-linux-sandbox on Linux for the integration test to work on that platform.
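The check-then-symlink step could look roughly like the sketch below. It is an illustration under assumptions, not the code in this PR: the helper name `link_built_binary` is hypothetical, and the caller decides where the built binary lives and which directory receives the link.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: fails fast if `binary` has not been built, then
/// exposes it under the name `alias` inside `dir` via a symlink.
#[cfg(unix)]
fn link_built_binary(binary: &Path, dir: &Path, alias: &str) -> io::Result<PathBuf> {
    if !binary.is_file() {
        return Err(io::Error::new(
            io::ErrorKind::NotFound,
            format!("{} not found; build it before running the test", binary.display()),
        ));
    }
    let link = dir.join(alias);
    // Remove a stale link left by an earlier run; a missing link is fine.
    match fs::remove_file(&link) {
        Ok(()) => {}
        Err(e) if e.kind() == io::ErrorKind::NotFound => {}
        Err(e) => return Err(e),
    }
    std::os::unix::fs::symlink(binary, &link)?;
    Ok(link)
}
```

Returning an error with a "build it first" message makes a missing binary show up as a clear test precondition failure instead of an obscure sandbox error later on.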

@bolinfest bolinfest force-pushed the pr7832 branch 3 times, most recently from cd2b7e7 to 7aa1069 Compare December 10, 2025 19:42
@bolinfest bolinfest changed the title fix: fix accept_elicitation_for_prompt_rule() test fix: ensure accept_elicitation_for_prompt_rule() test passes locally Dec 10, 2025
@etraut-openai
Collaborator

@codex review

@etraut-openai
Collaborator

@bolinfest, looks like the test is failing in CI. It's also still failing for me locally. Here's the output I'm seeing.

test suite::accept_elicitation::accept_elicitation_for_prompt_rule ... FAILED

failures:

---- suite::accept_elicitation::accept_elicitation_for_prompt_rule stdout ----
Error: Mcp error: -32603: sandbox error: sandbox denied exec error, exit code: 1, stdout: , stderr: /opt/homebrew/Library/Homebrew/help.sh: line 38: cannot create temp file for here document: Operation not permitted
/opt/homebrew/Library/Homebrew/help.sh: line 38: cannot create temp file for here document: Operation not permitted
Execution denied: User declined execution

@chatgpt-codex-connector
Contributor

Codex Review: Didn't find any major issues. 👍


@bolinfest
Collaborator Author

@etraut-openai not all the current test failures in the latest CI run are due to the test: I am looking into the flakiness because I expect an rmcp server to handle messages serially, but it's possible it is not.

As for your local test failure, I'm curious whether you can run any zsh command in a read-only sandbox. For example, what does this do on your laptop:

codex debug seatbelt -- zsh -lc 'echo hello'

@bolinfest
Collaborator Author

@etraut-openai Codex found it. Apparently rmcp does a tokio::spawn() for each incoming message and notification:

https://github.com/modelcontextprotocol/rust-sdk/blob/f20ed202af7ff11ffbc1d6a9260139c657205018/crates/rmcp/src/service.rs#L763-L778

https://github.com/modelcontextprotocol/rust-sdk/blob/f20ed202af7ff11ffbc1d6a9260139c657205018/crates/rmcp/src/service.rs#L810-L815

This explains the flakiness: we send a notification followed by a request assuming the notification must finish processing before the request, but apparently this is not true.

I think this PR is still important to move the dotslash fetch logic into the test, but admittedly there's still:

  • the issue you're seeing
  • the need for me to address the flakiness (I will probably have to change the notification to a request and wait for the response before making the tool call.)
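To make the ordering problem concrete, here is a small standard-library sketch of the two patterns. It is only a model of the behavior described above and does not use rmcp or tokio: each "handler" runs on its own thread, standing in for a spawned task, and `ServerState`, `spawn_setup`, and `spawn_tool_call` are hypothetical names.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc;
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical server state that the setup message must initialize.
#[derive(Default)]
struct ServerState {
    configured: AtomicBool,
}

/// Setup handler on its own thread: does slow work, flips the flag, and
/// sends an ack if the caller supplied a channel (the "response").
fn spawn_setup(state: Arc<ServerState>, ack: Option<mpsc::Sender<()>>) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(20));
        state.configured.store(true, Ordering::SeqCst);
        if let Some(tx) = ack {
            let _ = tx.send(());
        }
    })
}

/// Tool-call handler on its own thread: reports whether setup was visible.
fn spawn_tool_call(state: Arc<ServerState>) -> thread::JoinHandle<bool> {
    thread::spawn(move || state.configured.load(Ordering::SeqCst))
}

/// Fire-and-forget setup followed immediately by the tool call.
/// Nothing orders the two handlers, so the result may be false.
fn notification_then_request() -> bool {
    let state = Arc::new(ServerState::default());
    let setup = spawn_setup(Arc::clone(&state), None);
    let seen = spawn_tool_call(Arc::clone(&state)).join().unwrap();
    setup.join().unwrap();
    seen
}

/// Setup sent as a request: the client waits for the ack before the tool
/// call, so the tool-call handler always observes the configured state.
fn request_then_request() -> bool {
    let state = Arc::new(ServerState::default());
    let (tx, rx) = mpsc::channel();
    let setup = spawn_setup(Arc::clone(&state), Some(tx));
    rx.recv().expect("setup handler dropped the ack channel");
    let seen = spawn_tool_call(Arc::clone(&state)).join().unwrap();
    setup.join().unwrap();
    seen
}
```

In this model `request_then_request()` always returns true, while `notification_then_request()` depends on scheduling, which is the same kind of flakiness the test showed.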

@bolinfest
Collaborator Author

@etraut-openai Hmm...

codex debug seatbelt --log-denials -- zsh -lc 'source /opt/homebrew/Library/Homebrew/help.sh'

fails with:

/opt/homebrew/Library/Homebrew/help.sh:12: can't create temp file for here document: operation not permitted

=== Sandbox denials ===
(zsh) sysctl-read security.mac.lockdown_mode_state
(zsh) sysctl-read kern.bootargs
(zsh) file-write-data /dev/dtracehelper
(zsh) file-write-data /dev/ttys033
(zsh) mach-lookup com.apple.system.notification_center
(zsh) mach-lookup com.apple.logd
(path_helper) sysctl-read security.mac.lockdown_mode_state
(path_helper) sysctl-read kern.bootargs
(path_helper) file-write-data /dev/dtracehelper
(bash) sysctl-read security.mac.lockdown_mode_state
(bash) sysctl-read kern.bootargs
(bash) file-write-data /dev/dtracehelper
(bash) file-write-data /dev/tty
(bash) file-write-data /dev/ttys033
(env) sysctl-read security.mac.lockdown_mode_state
(env) sysctl-read kern.bootargs
(env) file-write-data /dev/dtracehelper
(sw_vers) sysctl-read security.mac.lockdown_mode_state
(sw_vers) sysctl-read kern.bootargs
(sw_vers) file-write-data /dev/dtracehelper
(sw_vers) mach-lookup com.apple.system.notification_center
(sw_vers) mach-lookup com.apple.logd
(lsof) sysctl-read security.mac.lockdown_mode_state
(lsof) sysctl-read kern.bootargs
(lsof) file-write-data /dev/dtracehelper
(sed) sysctl-read security.mac.lockdown_mode_state
(sed) sysctl-read kern.bootargs
(sed) file-write-data /dev/dtracehelper
(basename) sysctl-read security.mac.lockdown_mode_state
(basename) sysctl-read kern.bootargs
(basename) file-write-data /dev/dtracehelper
(zsh) file-write-create /private/tmp/zshFSlml1

@bolinfest bolinfest merged commit 87f5b69 into main Dec 10, 2025
67 of 73 checks passed
@bolinfest bolinfest deleted the pr7832 branch December 10, 2025 23:17
@github-actions github-actions bot locked and limited conversation to collaborators Dec 10, 2025
