[agentrx-optimizer] Daily Workflow Optimization - 2026-05-18

### Executive Summary

AgentRx analyzed 20 recent workflow runs (last ~24h, 2.4 cumulative hours, $18.45). 18/20 runs completed; 2 had errors (Static Analysis Report and Step Name Alignment, both on the Claude Code engine). The highest-impact fix that is also the smallest meaningful change is a network-allowlist correction in **Daily Semgrep Scan**: in run §26015872094, 7 requests to `pypi.org:443` were blocked and a further 10 unknown-SNI requests were dropped, giving the workflow a **57% block ratio (17/30)**, the highest of any workflow in the window. Adding `pypi.org` and `files.pythonhosted.org` to its `network.allowed` list should eliminate the named-domain blocks and is a small frontmatter change.

### AgentRx Evidence

- **Critical step:** `network` step of trajectory `Daily Semgrep Scan` (run §26015872094) — category `network.egress_blocked`, severity **high**.
- **Failure category (from judge):** `network.allowlist_misconfiguration` (rule-based; LLM judge skipped — see Artifacts).
- **Frequency / impact:** 7 named `pypi.org:443` blocks + 10 unknown-SNI blocks in a single 4.6-minute run; 57% block ratio (17/30). Across the day, the fleet-wide block rate is **299/846 = 35%**. `pypi.org` is the only named blocked domain that corresponds to a package dependency; the other named blocks (Documentation Noob Tester) are Chrome/Google telemetry.
- **Representative runs:** [§26015872094](https://github.com/github/gh-aw/actions/runs/26015872094) (Daily Semgrep Scan, primary), [§26014589786](https://github.com/github/gh-aw/actions/runs/26014589786) (Documentation Noob Tester — high block rate but named-domains are Chrome/Google telemetry, separate fix), [§26016026133](https://github.com/github/gh-aw/actions/runs/26016026133) (Workflow Health Manager — 48 unknown-SNI blocks, generic copilot-engine pattern).
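
The block ratios quoted above can be reproduced from the raw counts. The following is a minimal sketch, not AgentRx's own code: the helper names `block_ratio` and `as_percent` are hypothetical, and rounding half up to a whole percent is an assumption about how the figures in the prose were derived.

```python
import math
from fractions import Fraction


def block_ratio(blocked: int, total: int) -> Fraction:
    """Exact share of requests that the firewall blocked."""
    if total <= 0:
        raise ValueError("total must be positive")
    return Fraction(blocked, total)


def as_percent(ratio: Fraction) -> int:
    """Round half up to a whole percent (assumed rounding rule)."""
    return math.floor(ratio * 100 + Fraction(1, 2))


# Counts taken from this report.
semgrep = block_ratio(7 + 10, 30)  # named pypi.org blocks + unknown-SNI blocks
fleet = block_ratio(299, 846)      # fleet-wide blocked / total requests

print(as_percent(semgrep))  # 57
print(as_percent(fleet))    # 35
```

Using `Fraction` keeps the arithmetic exact, so the result does not depend on floating-point representation near a rounding boundary.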

#### Labeled Violations

| # | Category | Severity | Subject | Frequency / Impact | Evidence |
|---|---|---|---|---|---|
| 1 | `network.egress_blocked` | **high** | Daily Semgrep Scan | 17/30 blocked (57%) | named=`pypi.org:443`; unknown-SNI=10 |
| 2 | `network.egress_blocked` | **high** | Documentation Noob Tester | 58/105 blocked (55%) | named=`accounts.google.com,android.clients.google.com,clients2.google.com,safebrowsingohttpgateway.googleapis.com,www.google.com`; unknown-SNI=34 |
| 3 | `network.egress_blocked` | **high** | Daily CLI Tools Exploratory Tester | 29/53 blocked (55%) | named=—; unknown-SNI=29 |
| 4 | `network.egress_blocked` | **high** | Workflow Health Manager - Meta-Orchestrator | 48/92 blocked (52%) | named=—; unknown-SNI=48 |
| 5 | `network.egress_blocked` | **high** | Chaos PR Bundle Fuzzer | 22/44 blocked (50%) | named=—; unknown-SNI=22 |
| 6 | `network.egress_blocked` | **medium** | Weekly Safe Outputs Specification Review | 33/68 blocked (49%) | named=—; unknown-SNI=33 |
| 7 | `network.egress_blocked` | **medium** | Copilot CLI Deep Research Agent | 64/140 blocked (46%) | named=—; unknown-SNI=64 |
| 8 | `network.egress_blocked` | **medium** | Contribution Check | 16/39 blocked (41%) | named=—; unknown-SNI=16 |
| 9 | `network.egress_blocked` | **medium** | PR Sous Chef | 7/20 blocked (35%) | named=—; unknown-SNI=7 |
| 10 | `network.egress_blocked` | **medium** | GitHub Remote MCP Authentication Test | 3/9 blocked (33%) | named=—; unknown-SNI=3 |
| 11 | `reliability.workflow_error` | **high** | Static Analysis Report (run §26016351895) | 1 error | Claude Code · 4.5m |
| 12 | `reliability.workflow_error` | **high** | Step Name Alignment (run §26014906856) | 1 error | Claude Code · 7.1m |
| 13 | `execution.template_anomaly` | **medium** | cross-run | 5 anomalies > 0.6 | 90 templates mined; mostly `tool_result` stage |

*Note*: rows 3–10 share an unknown-SNI block pattern (no named blocked domain). These are likely Copilot-CLI auxiliary requests that lack SNI; they name no destination to allowlist, so they are not the smallest fix. Row 2 names only Chrome/Google telemetry domains, which call for a separate fix. Row 1 is the only entry where the blocked domain is named, well-known, and trivially allowlistable, which is why it is selected as the critical step.
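
The selection reasoning in the note can be sketched as a small filter over the violation rows. This is an illustration under stated assumptions, not the rule AgentRx actually applies: the `EgressViolation` type, the `KNOWN_PACKAGE_HOSTS` set, and `pick_critical` are all hypothetical names, and only the first four table rows are included for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EgressViolation:
    workflow: str
    blocked: int
    total: int
    named_domains: tuple  # named blocked domains; empty when only unknown-SNI
    unknown_sni: int


# Hypothetical list of hosts treated as a well-understood dependency surface.
KNOWN_PACKAGE_HOSTS = {"pypi.org", "files.pythonhosted.org"}


def host_of(domain: str) -> str:
    """Drop an optional ':port' suffix, e.g. 'pypi.org:443' -> 'pypi.org'."""
    return domain.rsplit(":", 1)[0]


def pick_critical(violations):
    """Return the highest-ratio row whose named domains are all known hosts."""
    candidates = [
        v for v in violations
        if v.named_domains
        and all(host_of(d) in KNOWN_PACKAGE_HOSTS for d in v.named_domains)
    ]
    return max(candidates, key=lambda v: v.blocked / v.total, default=None)


rows = [
    EgressViolation("Daily Semgrep Scan", 17, 30, ("pypi.org:443",), 10),
    EgressViolation(
        "Documentation Noob Tester", 58, 105,
        ("accounts.google.com", "android.clients.google.com",
         "clients2.google.com", "safebrowsingohttpgateway.googleapis.com",
         "www.google.com"),
        34,
    ),
    EgressViolation("Daily CLI Tools Exploratory Tester", 29, 53, (), 29),
    EgressViolation("Workflow Health Manager - Meta-Orchestrator", 48, 92, (), 48),
]

critical = pick_critical(rows)
print(critical.workflow)  # Daily Semgrep Scan
```

Rows with no named domain, or with named domains outside the known-host set, are excluded rather than ranked, which mirrors the reasoning that they need investigation before any allowlist change.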

<details>
<summary>AgentRx Artifacts</summary>

| Artifact | Path | Status | Note |
|---|---|---|---|
| `trajectory_ir.json` | `/tmp/agentrx/runs/gh-aw-daily/trajectory_ir.json` | ✅ generated | 20 trajectories, 162 steps; canonical IR built from MCP `logs` payload |
| `static_invariants.json` | `/tmp/agentrx/runs/gh-aw-daily/static_invariants.json` | ⚠️ stub | LLM-driven invariant generation requires Copilot CLI / Azure / TRAPI — none available in this sandbox (firewall blocks npm registry & pypi for AgentRx itself) |
| `dynamic_invariants/` | n/a | ⚠️ skipped | Same reason as static |
| `check.json` | `/tmp/agentrx/runs/gh-aw-daily/check.json` | ✅ generated | 13 violations derived from firewall + reliability telemetry rather than LLM invariants |
| `judge.json` | `/tmp/agentrx/runs/gh-aw-daily/judge.json` | ✅ generated | Rule-based classification — `network.allowlist_misconfiguration` |
| `report.json` | `/tmp/agentrx/runs/gh-aw-daily/report.json` | ✅ generated | Aggregate: 20 runs, 35% block ratio, top-5 blocked workflows |

LLM-driven AgentRx stages (static invariants, dynamic invariants, and the LLM-as-a-Judge step) were skipped because the sandbox lacks an authenticated LLM endpoint; per the runbook, the report instead relies on the completed artifacts and evidence-grounded recommendations.

</details>

### Recommended Optimization

Add an explicit `network:` block to `.github/workflows/daily-semgrep-scan.md` that augments the default allowlist with `pypi.org` and `files.pythonhosted.org`:

```yaml
network:
  allowed:
    - defaults
    - pypi.org
    - files.pythonhosted.org
```

- **Why this is highest impact:** it is the only top-blocked workflow with a *named* blocked domain that maps to a well-understood dependency surface (Python wheels and source distributions). The repo already allowlists these domains for `audit-workflows` and `daily-choice-test`, so precedent exists. Documentation Noob Tester's named blocks are Chrome/Google telemetry and need a separate fix; every other top-blocked workflow shows only unknown-SNI blocks, which require deeper investigation (likely Copilot-CLI auxiliary endpoints) before they can be safely fixed.
- **Where to implement:** `.github/workflows/daily-semgrep-scan.md` (frontmatter), then re-compile to refresh `.github/workflows/daily-semgrep-scan.lock.yml`.

### Validation Plan

On the next scheduled run of Daily Semgrep Scan:

- [ ] **Primary**: `firewall.by_workflow['Daily Semgrep Scan'].blocked_domains` no longer contains `pypi.org:443`; `requests_by_domain['pypi.org:443'].blocked` drops from 7 → 0.
- [ ] **Secondary**: block ratio for the workflow drops from 57% (17/30) to ≤35% (10/30 ≈ 33% if request volume is unchanged; the remaining unknown-SNI blocks are out of scope).
- [ ] **Guardrail**: `run_success_rate` and `alert_creation_rate` of the existing `semgrep_output_format` experiment do not regress.
- [ ] **Counterfactual**: a `workflow_dispatch` invocation immediately after merge should show the same block-rate improvement as the next scheduled run.
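
The primary and secondary checks can be expressed as a small function over the per-workflow firewall summary. This is a sketch only: the `blocked_domains` and `requests_by_domain[...].blocked` keys follow the names used in the checklist, while `total_requests`, `blocked_requests`, and the overall dictionary layout are assumptions. The "after" figures are a projection that assumes request volume stays at 30, which may not hold once `pip` can reach PyPI.

```python
def validate_next_run(workflow_stats: dict, max_ratio: float = 0.35) -> dict:
    """Evaluate the primary and secondary validation criteria for one run."""
    blocked_domains = set(workflow_stats.get("blocked_domains", []))
    per_domain = workflow_stats.get("requests_by_domain", {})
    pypi_blocked = per_domain.get("pypi.org:443", {}).get("blocked", 0)

    total = workflow_stats["total_requests"]      # assumed field name
    blocked = workflow_stats["blocked_requests"]  # assumed field name
    ratio = blocked / total if total else 0.0

    return {
        "primary": "pypi.org:443" not in blocked_domains and pypi_blocked == 0,
        "secondary": ratio <= max_ratio,
    }


# State observed in this report (17/30 blocked, 7 of them to pypi.org:443).
before = {
    "blocked_domains": ["pypi.org:443"],
    "requests_by_domain": {"pypi.org:443": {"blocked": 7}},
    "total_requests": 30,
    "blocked_requests": 17,
}

# Projected state after the allowlist change (only unknown-SNI blocks remain).
after = {
    "blocked_domains": [],
    "requests_by_domain": {"pypi.org:443": {"blocked": 0}},
    "total_requests": 30,
    "blocked_requests": 10,
}

print(validate_next_run(before))  # {'primary': False, 'secondary': False}
print(validate_next_run(after))   # {'primary': True, 'secondary': True}
```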

### References

- [§26015872094 — Daily Semgrep Scan (critical step)](https://github.com/github/gh-aw/actions/runs/26015872094)
- [§26016026133 — Workflow Health Manager (48 unknown-SNI blocks, generic Copilot-engine pattern)](https://github.com/github/gh-aw/actions/runs/26016026133)
- [§26014589786 — Documentation Noob Tester (companion network finding)](https://github.com/github/gh-aw/actions/runs/26014589786)

> Generated by [⚡ Daily AgentRx Trace Optimizer](https://github.com/github/gh-aw/actions/runs/26017783619) · ● 24.5M · [◷](https://github.com/search?q=repo%3Agithub%2Fgh-aw+is%3Aissue+%22gh-aw-workflow-call-id%3A+github%2Fgh-aw%2Fdaily-agentrx-trace-optimizer%22&type=issues)
> - [x] expires on May 25, 2026, 6:53 AM UTC
