Hi maintainers, I’d like to propose a small docs/examples contribution for RAG evaluation.
Problem
Promptfoo already supports RAG evaluation patterns and assertions, but users new to RAG eval often struggle with a practical question:
“My RAG system is failing — which failure mode is this, and what Promptfoo test should I write for it?”
Proposed contribution
I’d like to add a compact RAG failure-mode checklist that maps common RAG failures to concrete Promptfoo eval scenarios, suggested assertions, and debugging hints.
Initial scope:
- Missing retrieved context
- Irrelevant retrieved context
- Retrieved context contains the answer but the model ignores it
- Answer overclaims beyond the provided context
- Fabricated citation/source
- Metadata/source not preserved
- Conflicting context not surfaced
- Refusal despite sufficient context
For each failure mode, I’d include:
- what it looks like
- why it matters
- suggested Promptfoo assertion(s)
- minimal YAML test case
- short debugging/fix hint
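To make the "minimal YAML test case" item concrete, here is a rough sketch of what two checklist entries might look like. The provider, prompt wording, test data, and threshold are all placeholders I made up for illustration, and the mapping from failure mode to assertion is my suggestion, not an established convention. It assumes the default `query` and `context` variable names that `context-faithfulness` reads.

```yaml
# Hypothetical promptfooconfig.yaml sketch; provider, prompt, and test data are placeholders.
description: RAG failure modes (illustrative)

prompts:
  - |
    Answer the question using only the context below.
    Context: {{context}}
    Question: {{query}}

providers:
  - openai:gpt-4o-mini # assumption: any configured provider would work here

tests:
  # Failure mode: retrieved context contains the answer but the model ignores it
  - description: Answer present in the context should appear in the output
    vars:
      query: How long is the refund window?
      context: Refunds are accepted within 30 days of purchase.
    assert:
      - type: icontains
        value: 30 days

  # Failure mode: answer overclaims beyond the provided context
  - description: Output should stay grounded in the supplied context
    vars:
      query: Does the refund policy cover international orders?
      context: Refunds are accepted within 30 days of purchase.
    assert:
      - type: context-faithfulness
        threshold: 0.8 # placeholder; would need tuning per dataset
```

The deterministic `icontains` check is cheap and catches the "ignored context" case directly, while the model-graded `context-faithfulness` check needs a grading provider and is better suited to overclaiming. The final guide would pair each entry like this with its debugging hint.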
Suggested format
I can keep this as a docs/examples-only contribution.
Possible locations:
- site/docs/guides/rag-failure-modes.md, or
- examples/rag-failure-modes/README.md + promptfooconfig.yaml
I’m happy to align with the maintainers’ preferred location and naming.
Non-goals
- No core Promptfoo changes
- No new assertion types
- No new RAG framework
- No changes to existing evaluation semantics
The goal is only to make existing Promptfoo RAG evaluation capabilities easier to apply.