[agentrx-optimizer] Daily Workflow Optimization - 2026-05-18 #32971

@github-actions

Executive Summary

AgentRx analyzed 20 recent workflow runs (last ~24h, 2.4 cumulative hours, $18.45). 18/20 runs completed; 2 had errors (Static Analysis Report and Step Name Alignment, both on the Claude Code engine). The highest-impact fix that is also the smallest meaningful change is a network-allowlist correction in Daily Semgrep Scan. In the analyzed run (§26015872094), 7 requests to pypi.org:443 were blocked and a further 10 unknown-SNI requests were dropped, giving the workflow a 57% block ratio (17/30), the highest of any workflow in the window. Adding pypi.org and files.pythonhosted.org to its network.allowed list eliminates the named-domain blocks and is a small, frontmatter-only change.

AgentRx Evidence

  • Critical step: network step of trajectory Daily Semgrep Scan (run §26015872094) — category network.egress_blocked, severity high.
  • Failure category (from judge): network.allowlist_misconfiguration (rule-based; LLM judge skipped — see Artifacts).
  • Frequency / impact: 7 named pypi.org:443 blocks + 10 unknown-SNI blocks in a single 4.6-minute run; 57% block ratio. Across the day, the fleet-wide block rate is 299/846 = 35%; pypi.org is the only commonly named missing domain among scheduled workflows.
  • Representative runs: §26015872094 (Daily Semgrep Scan, primary), §26014589786 (Documentation Noob Tester — high block rate but named-domains are Chrome/Google telemetry, separate fix), §26016026133 (Workflow Health Manager — 48 unknown-SNI blocks, generic copilot-engine pattern).
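
The block ratios quoted above can be reproduced from the raw counts. The following is a minimal sketch of that arithmetic; the function name `block_ratio` is illustrative and not part of AgentRx.

```python
def block_ratio(blocked: int, total: int) -> float:
    """Return the fraction of outbound requests that the firewall blocked."""
    if total <= 0:
        raise ValueError("total must be positive")
    return blocked / total


# Daily Semgrep Scan: 7 named pypi.org:443 blocks + 10 unknown-SNI blocks
# out of 30 requests in the run.
semgrep_ratio = block_ratio(7 + 10, 30)

# Fleet-wide figure for the day, as quoted in the report.
fleet_ratio = block_ratio(299, 846)

print(f"Daily Semgrep Scan: {semgrep_ratio:.1%}")  # 56.7%, reported as 57%
print(f"Fleet-wide:         {fleet_ratio:.1%}")    # 35.3%, reported as 35%
```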

Labeled Violations

| # | Category | Severity | Subject | Frequency / Impact | Evidence |
|---|----------|----------|---------|--------------------|----------|
| 1 | network.egress_blocked | high | Daily Semgrep Scan | 17/30 blocked (57%) | named=pypi.org:443; unknown-SNI=10 |
| 2 | network.egress_blocked | high | Documentation Noob Tester | 58/105 blocked (55%) | named=accounts.google.com, android.clients.google.com, clients2.google.com, safebrowsingohttpgateway.googleapis.com, www.google.com; unknown-SNI=34 |
| 3 | network.egress_blocked | high | Daily CLI Tools Exploratory Tester | 29/53 blocked (55%) | named=—; unknown-SNI=29 |
| 4 | network.egress_blocked | high | Workflow Health Manager - Meta-Orchestrator | 48/92 blocked (52%) | named=—; unknown-SNI=48 |
| 5 | network.egress_blocked | high | Chaos PR Bundle Fuzzer | 22/44 blocked (50%) | named=—; unknown-SNI=22 |
| 6 | network.egress_blocked | medium | Weekly Safe Outputs Specification Review | 33/68 blocked (49%) | named=—; unknown-SNI=33 |
| 7 | network.egress_blocked | medium | Copilot CLI Deep Research Agent | 64/140 blocked (46%) | named=—; unknown-SNI=64 |
| 8 | network.egress_blocked | medium | Contribution Check | 16/39 blocked (41%) | named=—; unknown-SNI=16 |
| 9 | network.egress_blocked | medium | PR Sous Chef | 7/20 blocked (35%) | named=—; unknown-SNI=7 |
| 10 | network.egress_blocked | medium | GitHub Remote MCP Authentication Test | 3/9 blocked (33%) | named=—; unknown-SNI=3 |
| 11 | reliability.workflow_error | high | Static Analysis Report (run §26016351895) | 1 error | Claude Code · 4.5m |
| 12 | reliability.workflow_error | high | Step Name Alignment (run §26014906856) | 1 error | Claude Code · 7.1m |
| 13 | execution.template_anomaly | medium | cross-run | 5 anomalies > 0.6 | 90 templates mined; mostly tool_result stage |

Note: rows 3–10 share an unknown-SNI block pattern (no named blocked domain). Those are likely Copilot-CLI auxiliary requests that lack SNI; they are silent, and resolving them is not the smallest fix. Row 1 is the only entry where the blocked domain is named, well-known, and trivially allowlistable, which is why it is selected as the critical step.
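
The selection rule described in the note can be sketched as a small filter. This is an illustration under stated assumptions, not AgentRx's actual judge logic: the `Violation` type, the `KNOWN_PACKAGE_HOSTS` set, and `pick_critical` are hypothetical names, and only three of the table rows are reproduced.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Violation:
    subject: str
    blocked: int
    total: int
    named_domains: tuple[str, ...]  # empty when only unknown-SNI blocks exist


# Assumption: hosts treated as well-known and safe to allowlist.
KNOWN_PACKAGE_HOSTS = {"pypi.org", "files.pythonhosted.org"}


def host(domain: str) -> str:
    """Strip an optional ':port' suffix, e.g. 'pypi.org:443' -> 'pypi.org'."""
    return domain.rsplit(":", 1)[0]


def pick_critical(violations: list[Violation]) -> Violation | None:
    """Pick the highest-ratio row whose named domains are all well-known hosts."""
    candidates = [
        v for v in violations
        if v.named_domains
        and all(host(d) in KNOWN_PACKAGE_HOSTS for d in v.named_domains)
    ]
    return max(candidates, key=lambda v: v.blocked / v.total, default=None)


rows = [
    Violation("Daily Semgrep Scan", 17, 30, ("pypi.org:443",)),
    Violation("Documentation Noob Tester", 58, 105, (
        "accounts.google.com", "android.clients.google.com",
        "clients2.google.com", "safebrowsingohttpgateway.googleapis.com",
        "www.google.com",
    )),
    Violation("Daily CLI Tools Exploratory Tester", 29, 53, ()),
]

critical = pick_critical(rows)
print(critical.subject if critical else "no trivially allowlistable row")
```

Rows with only unknown-SNI blocks, or with named domains outside the known set, fall through the filter, which mirrors why they are deferred for deeper investigation.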

AgentRx Artifacts

| Artifact | Path | Status | Note |
|----------|------|--------|------|
| trajectory_ir.json | /tmp/agentrx/runs/gh-aw-daily/trajectory_ir.json | ✅ generated | 20 trajectories, 162 steps; canonical IR built from MCP logs payload |
| static_invariants.json | /tmp/agentrx/runs/gh-aw-daily/static_invariants.json | ⚠️ stub | LLM-driven invariant generation requires Copilot CLI / Azure / TRAPI; none available in this sandbox (firewall blocks npm registry & pypi for AgentRx itself) |
| dynamic_invariants/ | n/a | ⚠️ skipped | Same reason as static |
| check.json | /tmp/agentrx/runs/gh-aw-daily/check.json | ✅ generated | 13 violations derived from firewall + reliability telemetry rather than LLM invariants |
| judge.json | /tmp/agentrx/runs/gh-aw-daily/judge.json | ✅ generated | Rule-based classification: network.allowlist_misconfiguration |
| report.json | /tmp/agentrx/runs/gh-aw-daily/report.json | ✅ generated | Aggregate: 20 runs, 35% block ratio, top-5 blocked workflows |

The LLM-driven AgentRx stages (static invariants, dynamic invariants, and the LLM-as-a-Judge step) were skipped because the sandbox lacks an authenticated LLM endpoint; per the runbook, the artifacts that could be completed and evidence-grounded recommendations were produced instead.

Recommended Optimization

Add an explicit network: block to .github/workflows/daily-semgrep-scan.md that augments the default allowlist with pypi.org and files.pythonhosted.org:

```yaml
network:
  allowed:
    - defaults
    - pypi.org
    - files.pythonhosted.org
```

  • Why this is highest impact: it is the only top-blocked workflow with a named blocked domain that maps to a well-understood dependency surface (Python wheels / source dists). The repo already allowlists these for audit-workflows and daily-choice-test, so precedent exists. Of the other top-blocked workflows, Documentation Noob Tester names only Chrome/Google telemetry domains (a separate fix), and the rest show only unknown-SNI blocks, which require deeper investigation (likely Copilot-CLI auxiliary endpoints) before they can be safely fixed.
  • Where to implement: .github/workflows/daily-semgrep-scan.md (frontmatter), then re-compile to refresh .github/workflows/daily-semgrep-scan.lock.yml.

Validation Plan

On the next scheduled run of Daily Semgrep Scan:

  • Primary: firewall.by_workflow['Daily Semgrep Scan'].blocked_domains no longer contains pypi.org:443; requests_by_domain['pypi.org:443'].blocked drops from 7 → 0.
  • Secondary: block ratio for the workflow drops from 57% (17/30) to ≤35% (the remaining unknown-SNI blocks are out of scope).
  • Guardrail: run_success_rate and alert_creation_rate of the existing semgrep_output_format experiment do not regress.
  • Counterfactual: a workflow_dispatch invocation immediately after merge should show the same block-rate improvement as the next scheduled run.
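
The primary and secondary checks above could be automated along these lines. This is a sketch only: the dictionary layout (including nesting `requests_by_domain` under each workflow and the `blocked` / `total` counters) is an assumption made for illustration, and `validate_run` is a hypothetical helper, not an AgentRx API.

```python
def validate_run(
    firewall: dict,
    workflow: str = "Daily Semgrep Scan",
    domain: str = "pypi.org:443",
    max_ratio: float = 0.35,
) -> dict:
    """Evaluate the primary and secondary validation checks for one run."""
    wf = firewall["by_workflow"][workflow]
    named_blocks = wf["requests_by_domain"].get(domain, {}).get("blocked", 0)
    ratio = wf["blocked"] / wf["total"]
    return {
        # Primary: the named domain is no longer blocked at all.
        "primary": domain not in wf["blocked_domains"] and named_blocks == 0,
        # Secondary: the overall block ratio is at or below the target.
        "secondary": ratio <= max_ratio,
    }


# Counts from the analyzed run (before the fix).
before = {"by_workflow": {"Daily Semgrep Scan": {
    "blocked": 17, "total": 30,
    "blocked_domains": ["pypi.org:443"],
    "requests_by_domain": {"pypi.org:443": {"blocked": 7}},
}}}

# Hypothetical next run, assuming unchanged request volume and that only the
# 10 unknown-SNI blocks remain.
after = {"by_workflow": {"Daily Semgrep Scan": {
    "blocked": 10, "total": 30,
    "blocked_domains": [],
    "requests_by_domain": {"pypi.org:443": {"blocked": 0}},
}}}

print(validate_run(before))  # {'primary': False, 'secondary': False}
print(validate_run(after))   # {'primary': True, 'secondary': True}
```

The guardrail and counterfactual checks depend on experiment metrics and a manual workflow_dispatch run, so they are left out of this sketch.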

Generated by ⚡ Daily AgentRx Trace Optimizer · ● 24.5M

  • expires on May 25, 2026, 6:53 AM UTC
