add LlamaIndex zero-code OTLP example #134

Open
mesutoezdil wants to merge 4 commits into agentevals-dev:main from mesutoezdil:feature/llama-index-zero-code-example

@mesutoezdil
Contributor

@mesutoezdil mesutoezdil commented May 4, 2026

Closes #94

Adds a LlamaIndex zero-code OTLP example.

pip install -r examples/zero-code-examples/llama-index/requirements.txt
agentevals serve --dev
python examples/zero-code-examples/llama-index/run.py

Tested with Qwen3-32B on an OpenAI-compatible endpoint. All three queries ran and traces flushed.

@mesutoezdil mesutoezdil force-pushed the feature/llama-index-zero-code-example branch 5 times, most recently from 1b870bb to 606b84f Compare May 12, 2026 18:03
Contributor

@krisztianfekete krisztianfekete left a comment

Thanks for the PR!
Have you checked LlamaIndex best practices around OTel instrumentation, e.g. https://developers.llamaindex.ai/python/framework-api-reference/observability/otel/?

The goal with these examples is to leverage each framework's best practices and prove that agentevals can work meaningfully with them. If there are gaps, we should fall back to an approach similar to the one we have here, and document this in detail in the main docstring of the example.

Can you please see if you can use the official approach, and refactor the PR to follow it? I also added comments at various places where we'd like to follow existing patterns/conventions across all examples, so please also address those when you revisit the pull request.

load_dotenv(override=True)


def roll_die(sides: int) -> int:
Contributor

Please add docstrings to tool calls.
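
A minimal sketch of what documented tool functions could look like. Only the `roll_die` signature appears in the diff above; the `check_prime` signature and both function bodies are assumptions, since the PR's implementations are not shown in this conversation. Tool wrappers commonly surface the docstring to the model as the tool description, so it is worth stating arguments and return values explicitly.

```python
import random


def roll_die(sides: int) -> int:
    """Roll a die with the given number of sides.

    Args:
        sides: Number of sides on the die; must be at least 1.

    Returns:
        A random integer between 1 and ``sides``, inclusive.
    """
    if sides < 1:
        raise ValueError("sides must be at least 1")
    return random.randint(1, sides)


def check_prime(number: int) -> bool:
    """Check whether a number is prime.

    Args:
        number: The integer to test (signature assumed for illustration).

    Returns:
        True if ``number`` is prime, False otherwise.
    """
    if number < 2:
        return False
    divisor = 2
    # Trial division up to the square root is enough for small inputs.
    while divisor * divisor <= number:
        if number % divisor == 0:
            return False
        divisor += 1
    return True
```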

result = await agent.run(q)
print(f" {result.response.content}")

tp.force_flush()
Contributor

Please follow this pattern to guarantee flushes:

try:
    for i, query in enumerate(test_queries, 1):
        print(f"\n[{i}/{len(test_queries)}] User: {query}")
        result = agent.run_sync(query, message_history=message_history)
        print(f" Agent: {result.output}")
        # Pass the full message history forward for multi-turn conversation.
        message_history = result.all_messages()
finally:
    print()
    tracer_provider.force_flush()
    print("All traces flushed to OTLP receiver.")
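
The snippet above comes from a synchronous example, while the code under review awaits `agent.run(...)`. A rough sketch of how the same try/finally pattern might carry over to an async loop follows. `FakeTracerProvider`, `FakeAgent`, and `run_queries` are hypothetical stand-ins introduced only so the sketch runs on its own; they are not part of the PR, LlamaIndex, or OpenTelemetry.

```python
import asyncio


class FakeTracerProvider:
    """Hypothetical stand-in for a tracer provider with force_flush()."""

    def __init__(self) -> None:
        self.flushed = False

    def force_flush(self) -> bool:
        self.flushed = True
        return True


class FakeAgent:
    """Hypothetical stand-in for an agent with an async run() method."""

    async def run(self, query: str) -> str:
        if "fail" in query:
            raise RuntimeError("simulated agent failure")
        return f"echo: {query}"


async def run_queries(agent, tracer_provider, test_queries) -> None:
    try:
        for i, query in enumerate(test_queries, 1):
            print(f"\n[{i}/{len(test_queries)}] User: {query}")
            result = await agent.run(query)
            print(f" Agent: {result}")
    finally:
        # Runs even when a query raises, so buffered spans are not lost.
        print()
        tracer_provider.force_flush()
        print("All traces flushed to OTLP receiver.")


if __name__ == "__main__":
    asyncio.run(run_queries(FakeAgent(), FakeTracerProvider(), ["Roll a die"]))
```

The point of the `finally` block is that the flush happens whether the loop completes or an exception propagates out of it.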

Comment on lines +41 to +46
resource = Resource.create()
tp = TracerProvider(resource=resource)
tp.add_span_processor(BatchSpanProcessor(OTLPSpanExporter(), schedule_delay_millis=1000))
trace.set_tracer_provider(tp)
lp = LoggerProvider(resource=resource)
lp.add_log_record_processor(BatchLogRecordProcessor(OTLPLogExporter(), schedule_delay_millis=1000))
Contributor

Please use longer, more descriptive variable names, like the other examples.

print("OPENAI_API_KEY not set.")
return

os.environ["OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT"] = "true"
Contributor

Please use setdefault
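
For illustration, a small sketch of the difference: direct assignment overwrites whatever the user exported, while `setdefault` only fills in the value when the variable is unset. The helper name `enable_content_capture_by_default` is hypothetical and exists only to make the behavior easy to exercise; in the example itself a single `os.environ.setdefault(...)` call would likely be enough.

```python
import os

CAPTURE_FLAG = "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT"


def enable_content_capture_by_default(environ=os.environ) -> str:
    """Set the capture flag to "true" unless it is already defined.

    Returns the value that ends up in the environment, so a value
    chosen by the user (shell export or .env file) is preserved.
    """
    return environ.setdefault(CAPTURE_FLAG, "true")


if __name__ == "__main__":
    print(enable_content_capture_by_default())
```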

agent = FunctionAgent(
    tools=[FunctionTool.from_defaults(fn=roll_die), FunctionTool.from_defaults(fn=check_prime)],
    llm=llm,
    system_prompt="Use roll_die for dice rolls. Use check_prime to check if a number is prime.",
Contributor

Please use the same system prompt we have in all other examples. When a user runs multiple examples to compare frameworks, you don't want subtle differences like this that can affect results.

@krisztianfekete krisztianfekete added the Changes requested (Waiting for submitter to make changes to their PR) label May 13, 2026
@mesutoezdil mesutoezdil force-pushed the feature/llama-index-zero-code-example branch from 606b84f to 83f5f46 Compare May 13, 2026 19:25
@mesutoezdil
Contributor Author

mesutoezdil commented May 13, 2026

@krisztianfekete I corrected them. I've checked the link and switched to LlamaIndexOpenTelemetry from llama-index-observability-otel, which is the official approach.

It handles the tracer provider setup internally so the code is simpler now.

One thing to note: this integration is span-based only, with no log export.
I documented that in the docstring and also addressed all the inline comments.

Uses official LlamaIndexOpenTelemetry integration. Adds e2e tests.
@mesutoezdil mesutoezdil force-pushed the feature/llama-index-zero-code-example branch from 83f5f46 to f456c3a Compare May 13, 2026 19:36

Labels

Changes requested: Waiting for submitter to make changes to their PR

Development

Successfully merging this pull request may close these issues.

Add LlamaIndex zero code example

2 participants