Replies: 1 comment
Turns out the issue was a missing/incorrect configuration for the AzureOpenAI model in use. It failed silently, and therefore no turns were generated.
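For context, DeepEval is typically pointed at an Azure OpenAI deployment through its CLI. The sketch below is an assumption based on my recollection of the DeepEval docs (flag names should be verified with `deepeval set-azure-openai --help`); a wrong deployment name or API version can fail silently in the way described above:

```shell
# Hedged sketch: configure DeepEval to use an Azure OpenAI deployment.
# Flag names are assumptions from the DeepEval docs, not verified here.
deepeval set-azure-openai \
  --openai-endpoint "https://<resource>.openai.azure.com/" \
  --openai-api-key "<key>" \
  --deployment-name "<deployment>" \
  --openai-api-version "<api-version>"
```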
To test a multi-agent system (built with openai-agents-python), I wanted to try DeepEval for qualitative evaluation/testing.
Unfortunately, the ConversationSimulator doesn't seem to generate or propagate the generated turns for the test case as expected. It does work roughly 20% of the time, though, which makes me suspect a timing issue.
I tried to stay as close as possible to the DeepEval documentation (Conv Simulator, CI/CD, E2E), but it seems partially outdated, as many argument names have changed in the meantime.
I'm using DeepEval 3.8.8.
Error:

```
TypeError: 'turns' must not be empty
```

Code (function definition for `model_callback`):

```python
async def model_callback(input: str, thread_id: str) -> Turn:
```

Adding `turns` to the function definition didn't change anything either. I'm calling the test with

```
deepeval test run tests/agent-tests/test_knowledge.py
```

and yes, I also tried the non-pytest way with `evaluate()` - same result. Disabling async didn't help either. Am I doing something wrong, or is there a bug? Thanks!
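For reference, the async callback pattern described above can be sketched as below. Note that `Turn` here is a minimal stand-in dataclass, not DeepEval's actual `Turn` class, and the echo reply is a placeholder; in a real test you would import `Turn` from DeepEval and invoke the multi-agent system inside the callback:

```python
# Minimal sketch of an async model callback, assuming a Turn(role, content)
# shape. This is a self-contained illustration, not DeepEval's real API.
import asyncio
from dataclasses import dataclass


@dataclass
class Turn:
    """Stand-in for DeepEval's Turn class (assumed shape)."""
    role: str      # "user" or "assistant"
    content: str   # the message text


async def model_callback(input: str, thread_id: str) -> Turn:
    # A real implementation would call the agent system here, keyed by
    # thread_id so multi-turn context is preserved across invocations.
    reply = f"[{thread_id}] placeholder reply to: {input}"
    return Turn(role="assistant", content=reply)


if __name__ == "__main__":
    turn = asyncio.run(model_callback("hello", "thread-1"))
    print(turn.role, "|", turn.content)
```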