Conversation
rivudhk
commented
Mar 20, 2026
- Set start_from: "earliest" for existing variants to prevent the consumer from missing messages
- Add a delay after pipeline starts for variants without pre-created topics
- Use UUID suffixes for view and topic names to prevent schema/topic collisions
- Remove default 'num_partitions' and 'replication_factor' in create_kafka_topic (previously set to 1)
- Increase 'timeout_s' to 3600 seconds for 'wait_for_rows'
- Rename variable 'futures' in 'create_kafka_topics' to 'tpcs' for consistency with 'delete_kafka_topics'
- Remove PYTEST_EXTRA_ARGS in YAML workflows to enable the Kafka tests
- Add variable RUN_ID: 1 in YAML workflow for Python workload tests for the Kafka test
- Set start_from: "earliest" for existing variants to prevent the consumer from missing messages - Add a delay after pipeline starts for variants without pre-created topics - Use UUID suffixes for view and topic names to prevent schema/topic collisions - Remove default 'num_partitions' and 'replication_factor' in create_kafka_topic (previously set to 1) - Increase 'timeout_s' to 3600 seconds for 'wait_for_rows' - Rename variable 'futures' in 'create_kafka_topics' to 'tpcs' for consistency with' delete_kafka_topics' - Remove PYTEST_EXTRA_ARGS in YAML workflows to enable the Kafka tests - Add variable RUN_ID: 1 in YAML workflow for Python workload tests for the Kafka test Signed-off-by: rivudhk <rivudhkr@gmail.com>
mythical-fred
left a comment
There was a problem hiding this comment.
Commit subject ends in a colon ("as follows:") making it a fragment rather than a standalone summary. Please rewrite, e.g.: "py: fix Kafka Avro tests: topic naming, start_from, timeouts".
| futures = get_kafka_admin().create_topics([new_topic]) | ||
| for t, f in futures.items(): | ||
| tpcs = get_kafka_admin().create_topics([new_topic]) | ||
| for topic, tpcs in tpcs.items(): |
There was a problem hiding this comment.
Variable shadowing: tpcs (the dict from create_topics) is immediately reused as the loop variable in for topic, tpcs in tpcs.items(). Python evaluates .items() before entering the loop so it works, but it is confusing — and it mirrors the same pattern in delete_kafka_topics. Suggest using fut for the loop variable:
for topic, fut in tpcs.items():
try:
fut.result()|
|
||
| # For variants without pre-created topics, wait for the output connector to create them | ||
| if not v.create_topic: | ||
| time.sleep(3) |
There was a problem hiding this comment.
Unconditional time.sleep(3) to wait for auto-topic creation is a flakiness risk — three seconds may not be enough under load. Is there a way to poll until the topic is visible (e.g. via AdminClient.list_topics()) instead of sleeping?
gz
left a comment
There was a problem hiding this comment.
is there code that cleans up the topics
-need to clean up topics if test is successful
-need to clean up old topics that have been used by old failed tests after a few days
| - name: Run platform tests | ||
| if: ${{ vars.CI_DRY_RUN != 'true' }} | ||
| run: uv run --locked pytest -n ${{ vars.PYTEST_WORKERS }} tests/platform --timeout=1500 -vv ${{ vars.PYTEST_EXTRA }} | ||
| run: uv run --locked pytest -n ${{ vars.PYTEST_WORKERS }} tests/platform --timeout=1500 -vv |
There was a problem hiding this comment.
lets keep the variable in yaml for the future but we can remove the test exclusion from
the variable
| - name: Python workload tests | ||
| if: ${{ vars.CI_DRY_RUN != 'true' && !contains(vars.CI_SKIP_JOBS, 'runtime-workload') }} | ||
| run: uv run --locked pytest -n ${{ vars.PYTEST_WORKERS }} tests/workloads --timeout=3600 -vv ${{ vars.PYTEST_EXTRA }} | ||
| run: uv run --locked pytest -n ${{ vars.PYTEST_WORKERS }} tests/workloads --timeout=3600 -vv |
|
|
||
| self.topic1 = f"my_topic_avro{suffix}" | ||
| self.topic2 = f"my_topic_avro2{suffix}" | ||
| self.topic1 = f"my_topic_avro_{self.id}_{suffix}" |
There was a problem hiding this comment.
this start with the name of the test instead of my_... will make it easier to attribute the topic to a test in the kafka instance
| self.topic1 = f"my_topic_avro{suffix}" | ||
| self.topic2 = f"my_topic_avro2{suffix}" | ||
| self.topic1 = f"my_topic_avro_{self.id}_{suffix}" | ||
| self.topic2 = f"my_topic_avro2_{self.id}_{suffix}" |
There was a problem hiding this comment.
It is manually set in the TEST_CONFIGS array within the TestKafkaAvro class.
|
|
||
| # For variants without pre-created topics, wait for the output connector to create them | ||
| if not v.create_topic: | ||
| time.sleep(3) |