Scripts for testing and manual verification

Tool Reliability Scoring (v1.3)

Run from the project root (where devsper.toml or pyproject.toml lives).

Quick test (CLI only)

./scripts/test_tool_scoring_cli.sh
  • Runs the unit tests, then exercises devsper tools, devsper tools --poor, devsper doctor, and devsper analytics.

Full test (all CLI commands)

./scripts/test_tool_scoring_full.sh
  • Runs the unit tests plus: devsper tools, devsper tools --category research, devsper tools --poor, devsper doctor, and devsper analytics.

Optional: seed the DB for a populated table

uv run python scripts/seed_tool_scores.py
uv run devsper tools
uv run devsper analytics

Python smoke test

uv run python scripts/test_tool_scoring_smoke.py
  • Uses a temporary DB: records results, then checks the composite score and labels, the selector blend, reset, and prune. Does not invoke the CLI.

One-liners (copy-paste)

# Unit tests only
uv run python -m pytest tests/test_tool_scoring.py -v

# CLI: list tools (scores if any)
uv run devsper tools

# CLI: only poor tools
uv run devsper tools --poor

# CLI: by category
uv run devsper tools --category research

# Doctor (includes scoring DB info)
uv run devsper doctor

# Analytics (includes tool report when scores exist)
uv run devsper analytics

# Reset one tool (replace TOOL_NAME)
uv run devsper tools reset TOOL_NAME

# Bypass scoring in selection (env)
DEVSPER_DISABLE_TOOL_SCORING=1 uv run devsper run "list files"
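
To run the read-only one-liners above in one pass, they can be chained in a small wrapper. The following is a minimal sketch, not a script shipped in scripts/: the names run_step, run_checks, and the DRY_RUN variable are assumptions made for illustration, and only commands already listed in this README are used. It defaults to a dry run that prints each command without executing it.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of the repository): chains the read-only
# commands from this README and stops at the first failing step.
set -eu

# DRY_RUN=1 (the default in this sketch) only prints each command;
# set DRY_RUN=0 to actually execute them from the project root.
DRY_RUN="${DRY_RUN:-1}"

# Print the command, then run it unless this is a dry run.
run_step() {
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then
    "$@"
  fi
}

# The checks, in the same order as the one-liners above.
run_checks() {
  run_step uv run python -m pytest tests/test_tool_scoring.py -v
  run_step uv run devsper tools
  run_step uv run devsper tools --poor
  run_step uv run devsper tools --category research
  run_step uv run devsper doctor
  run_step uv run devsper analytics
}

run_checks
```

With DRY_RUN=0 the wrapper executes each step and, because of set -e, exits on the first command that returns a non-zero status. The reset and scoring-bypass commands are left out on purpose, since they change state or selection behavior.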