
@kabir
Collaborator

@kabir kabir commented Feb 9, 2026

Addresses intermittent test failure where subscribeToTask created child queues
but EventConsumer polling loops hadn't started yet when the agent executed,
causing emitted events to be lost.

Root Cause:

  • awaitStreamingSubscription() only guarantees transport-level subscription
    (Flow.Subscriber.onSubscribe() called)
  • EventConsumer polling starts asynchronously on eventConsumerExecutor thread
  • Agent execution could begin before EventConsumer was actively polling
  • Events emitted before polling started were lost in the queue
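
As an illustration only (this is not code from the PR), the ordering problem above can be reproduced with a toy model. Everything here is hypothetical: ToyChildQueue simply drops events while no consumer is polling, and a latch stands in for the scheduling delay on the consumer executor. The sketch assumes that simplification captures the essence of the race.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Toy model only: these types are hypothetical, not the project's classes.
final class PollingRaceSketch {

    /** Stand-in for a child queue that only delivers events while a consumer is polling. */
    static final class ToyChildQueue {
        private final List<String> delivered = new ArrayList<>();
        private boolean polling;

        synchronized void startPolling() { polling = true; }

        /** Returns false when the event is dropped because no consumer is polling yet. */
        synchronized boolean emit(String event) {
            if (!polling) {
                return false;
            }
            delivered.add(event);
            return true;
        }

        synchronized List<String> delivered() { return new ArrayList<>(delivered); }
    }

    /** Subscribes, emits one event, and returns what the consumer received. */
    static List<String> run(boolean waitForPolling) {
        ToyChildQueue queue = new ToyChildQueue();
        CountDownLatch releaseConsumer = new CountDownLatch(1); // models executor scheduling delay
        CountDownLatch pollingStarted = new CountDownLatch(1);
        ExecutorService consumerExecutor = Executors.newSingleThreadExecutor();
        try {
            // "Subscribing" returns at once; the polling loop starts later on another thread.
            consumerExecutor.execute(() -> {
                awaitLatch(releaseConsumer);
                queue.startPolling();
                pollingStarted.countDown();
            });
            if (waitForPolling) {
                releaseConsumer.countDown();
                awaitLatch(pollingStarted); // the synchronization the test was missing
            }
            queue.emit("event-1"); // the agent emits its event
            releaseConsumer.countDown();
            awaitLatch(pollingStarted);
            return queue.delivered();
        } finally {
            consumerExecutor.shutdown();
        }
    }

    private static void awaitLatch(CountDownLatch latch) {
        try {
            if (!latch.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for latch");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("without wait: " + run(false));
        System.out.println("with wait:    " + run(true));
    }
}
```

In this model run(false) returns an empty list because the event is emitted before polling starts, while run(true) returns a list containing event-1.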

Solution:

  • Added awaitChildQueueCountStable() to TestUtilsBean
  • Waits for child queue count to match expected value for 3 consecutive
    checks (150ms total), ensuring EventConsumer is actively polling
  • Follows the pattern from commit 18d2abf, which fixed a similar race condition
  • Exposed REST endpoints in all transports (JSON-RPC, gRPC, REST)
  • Added client helper in AbstractA2AServerTest
  • Applied stability check after subscribeToTask() in tests
  • Increased timeouts from 10s to 15s for CI stability
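
For readers who want the shape of the stability check, here is a minimal sketch, not the actual TestUtilsBean code. The class name, the IntSupplier parameter, and the 50 ms interval are assumptions (the interval is inferred from "3 consecutive checks, 150ms total"); the real method may obtain the count and report failure differently.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.IntSupplier;

// Hypothetical sketch of a "count must stay at the expected value" wait loop.
final class StableCountSketch {

    static final int REQUIRED_CONSECUTIVE_MATCHES = 3;
    static final long CHECK_INTERVAL_MS = 50; // assumption, not taken from the PR

    /**
     * Polls currentCount until it equals expectedCount on three consecutive checks,
     * or until timeoutMs elapses. Returns true only when the streak is reached.
     */
    static boolean awaitCountStable(IntSupplier currentCount, int expectedCount, long timeoutMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        int consecutiveMatches = 0;
        while (System.nanoTime() - deadline < 0) {
            if (currentCount.getAsInt() == expectedCount) {
                consecutiveMatches++;
                if (consecutiveMatches >= REQUIRED_CONSECUTIVE_MATCHES) {
                    return true;
                }
            } else {
                consecutiveMatches = 0; // any mismatch restarts the streak
            }
            try {
                Thread.sleep(CHECK_INTERVAL_MS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false; // treat interruption as "not stable"
            }
        }
        return false;
    }
}
```

Requiring a streak rather than a single match is what filters out a count that is briefly correct while queues are still being created or torn down.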

Files Modified:

  • TestUtilsBean.java: Core synchronization logic with comprehensive Javadoc
  • A2ATestRoutes.java (JSON-RPC, REST): REST endpoints for stability checks
  • A2ATestResource.java (gRPC): REST endpoints for stability checks
  • AbstractA2AServerTest.java: Client helper and extensive inline comments

@gemini-code-assist
Contributor

Summary of Changes

Hello @kabir, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves a race condition within the testNonBlockingWithMultipleMessages integration test. It introduces new utility methods and corresponding test endpoints to allow tests to explicitly wait for the internal queue poller to start and for child queue counts to stabilize. This ensures that subsequent test actions, such as sending messages, occur only when the system is in a predictable and ready state, thereby preventing intermittent test failures.

Highlights

  • Race Condition Fix: Addressed a race condition in the testNonBlockingWithMultipleMessages test by introducing explicit waiting mechanisms to ensure the child queue poller is stable before proceeding with test actions.
  • New Test Utility Methods: Introduced awaitQueuePollerStart and awaitChildQueueCountStable methods in TestUtilsBean to provide better control and synchronization for test scenarios involving message queues.
  • New Test Endpoints: Exposed new test-specific REST/JSON-RPC endpoints in A2ATestResource and A2ATestRoutes to allow tests to remotely invoke the awaitChildQueueCountStable utility.
  • Child Queue Stability Logic: Implemented a robust stability check in awaitChildQueueCountStable that requires the child queue count to match the expected value for three consecutive checks (over 150ms) before considering the queue stable.
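
The summary does not say how awaitQueuePollerStart is implemented. One common way to wait for a background loop to begin is a per-task latch, sketched below. The class, the markPollerStarted hook, and the timeout parameter are all hypothetical and are not taken from the PR.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: a latch per task id signals that the poller loop has begun.
final class PollerStartSketch {

    private final ConcurrentHashMap<String, CountDownLatch> started = new ConcurrentHashMap<>();

    private CountDownLatch latchFor(String taskId) {
        return started.computeIfAbsent(taskId, id -> new CountDownLatch(1));
    }

    /** Would be called by the poller thread as the first step of its loop. */
    void markPollerStarted(String taskId) {
        latchFor(taskId).countDown();
    }

    /** Blocks until the poller for taskId has started, or the timeout elapses. */
    boolean awaitPollerStart(String taskId, long timeoutMs) {
        try {
            return latchFor(taskId).await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Because the latch is created on first access from either side, the waiter and the poller can arrive in any order without missing the signal.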

Changelog

  • reference/grpc/src/test/java/io/a2a/server/grpc/quarkus/A2ATestResource.java
    • Added new POST endpoints /queue/awaitPollerStart/{taskId} and /queue/awaitChildCountStable/{taskId}/{expectedCount}/{timeoutMs} for test utilities.
  • reference/jsonrpc/src/test/java/io/a2a/server/apps/quarkus/A2ATestRoutes.java
    • Added a new POST route /test/queue/awaitChildCountStable/:taskId/:expectedCount/:timeoutMs for test utility.
  • reference/rest/src/test/java/io/a2a/server/rest/quarkus/A2ATestRoutes.java
    • Added a new POST route /test/queue/awaitChildCountStable/:taskId/:expectedCount/:timeoutMs for test utility.
  • tests/server-common/src/test/java/io/a2a/server/apps/common/AbstractA2AServerTest.java
    • Modified testNonBlockingWithMultipleMessages to include a call to awaitChildQueueCountStable to ensure queue stability.
    • Added a private helper method awaitChildQueueCountStable to invoke the new test endpoint.
  • tests/server-common/src/test/java/io/a2a/server/apps/common/TestUtilsBean.java
    • Added awaitQueuePollerStart method to wait for a queue poller to start.
    • Added awaitChildQueueCountStable method to check and wait for the stability of child queue counts, requiring three consecutive matches.
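
A test client could call the JSON-RPC/REST route listed above roughly as follows. This is a sketch under assumptions: the class and method names are hypothetical, the task id is assumed to be URL-safe, and the response format (HTTP 200 with a body of "true" or "false") is a guess, since the changelog does not describe it.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical client helper for the test-only stability route.
final class StabilityEndpointClientSketch {

    /** Builds the route from the changelog; taskId is assumed to need no escaping. */
    static URI stabilityUri(String baseUrl, String taskId, int expectedCount, long timeoutMs) {
        return URI.create(baseUrl + "/test/queue/awaitChildCountStable/"
                + taskId + "/" + expectedCount + "/" + timeoutMs);
    }

    /** The route is registered for POST and takes all arguments from the path. */
    static HttpRequest stabilityRequest(URI uri) {
        return HttpRequest.newBuilder(uri)
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    /** Assumption: the server answers 200 with "true" when the count is stable. */
    static boolean awaitChildQueueCountStable(HttpClient client, String baseUrl, String taskId,
                                              int expectedCount, long timeoutMs)
            throws IOException, InterruptedException {
        HttpRequest request = stabilityRequest(stabilityUri(baseUrl, taskId, expectedCount, timeoutMs));
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode() == 200 && Boolean.parseBoolean(response.body().trim());
    }
}
```

A test would call this right after subscribeToTask() and fail fast if it returns false, so a missing poller surfaces as a clear setup error rather than as lost events later.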

Activity

  • No human activity has been recorded on this pull request yet.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request fixes a race condition in testNonBlockingWithMultipleMessages by introducing a new test utility awaitChildQueueCountStable. This utility waits for the queue state to stabilize before proceeding, which makes the test more robust. The changes are well-implemented across the gRPC, JSON-RPC, and REST test harnesses. I've added a few suggestions to improve the implementation of the new test utilities for better robustness and readability.

@ehsavoie
Collaborator

It would be nice to have a bit more Javadoc and comment in the commit message :)

@kabir kabir force-pushed the racecondition branch 2 times, most recently from 9cdc294 to 3a1bf2e on February 10, 2026 13:22
@kabir
Collaborator Author

kabir commented Feb 10, 2026

@ehsavoie done


@ehsavoie ehsavoie merged commit ce6453c into a2aproject:main Feb 10, 2026
9 checks passed