Skip to content

Conversation

@plamut
Copy link
Contributor

@plamut plamut commented Apr 14, 2021

Fixes #372.

If a test is run in a suite with other system tests, the messages are not always published in batch sizes as desired, which can affect how ACKs are handled on the backend (the server requires all messages published in a single batch to be ACK-ed in order
to accept the ACKs).

If a publisher client instance is shared between the tests, the batching can apparently be affected, thus we create a new client
instance before each test. Since these tests are slow system tests, the overhead should not be significant.

Steps to verify

Apply the following patch to the code, disable capturing test output (-s option to the py.test command).

diff --git google/cloud/pubsub_v1/publisher/_batch/thread.py google/cloud/pubsub_v1/publisher/_batch/thread.py
index 36dd3b9..c3aacdf 100644
--- google/cloud/pubsub_v1/publisher/_batch/thread.py
+++ google/cloud/pubsub_v1/publisher/_batch/thread.py
@@ -211,6 +211,7 @@ class Batch(base.Batch):
         commit_thread = threading.Thread(
             name="Thread-CommitBatchPublisher", target=self._commit
         )
+        print(f"batch starting commit for {len(self._messages)} message(s)")
         commit_thread.start()

     def _commit(self):

Run system tests in --verbose mode and check the output.

Actual result (before the PR fix)
The system test test_streaming_pull_blocking_shutdown fails with a high probability.

Expected result

The system test test_streaming_pull_blocking_shutdown consistently passes.

In addition, wherever the _publish_messages(..., batch_sizes=[...]) helper is used in the tests, the output size output should match the values passed through batch_sizes. For example, in test_streaming_pull_max_messages(), batch_sizes is set to (7, 4, 8, 2, 10, 1, 3, 8, 6, 1), and the test output matches that:

...
tests/system.py::TestStreamingPull::test_streaming_pull_max_messages batch starting commit for 7 message(s)
batch starting commit for 4 message(s)
batch starting commit for 8 message(s)
batch starting commit for 2 message(s)
batch starting commit for 10 message(s)
batch starting commit for 1 message(s)
batch starting commit for 3 message(s)
batch starting commit for 8 message(s)
batch starting commit for 6 message(s)
batch starting commit for 1 message(s)
...

PR checklist:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

If a test is run in a suite with other system tests, the messages
are not always published in batch sizes as desired, which can
affect how ACKs are handled on the backend (the server requires
all messages published in a single batch to be ACK-ed in order
to accept the ACKs).

If a publisher client instance is shared between the tests, the
batching can apparently be affected, thus we create a new client
instance before each test.

Since these tests are slow system tests, the overhead should not
be significant.
@plamut plamut requested a review from a team as a code owner April 14, 2021 13:12
@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label Apr 14, 2021
@product-auto-label product-auto-label bot added the api: pubsub Issues related to the googleapis/python-pubsub API. label Apr 14, 2021
@plamut plamut requested review from jimfulton and pradn April 14, 2021 13:31
@plamut
Copy link
Contributor Author

plamut commented Apr 14, 2021

@jimfulton If you have time and because you are now somewhat familiar with these tests, you too can check this.

Although you said that this particular test was not failing for you locally?

@jimfulton
Copy link
Contributor

Right, I haven't seen this failure.

Copy link
Contributor

@jimfulton jimfulton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the rationale for the fix is that subscriber client state was leaking between tests and causing spurious failures.

@plamut plamut merged commit 13a6686 into googleapis:master Apr 15, 2021
@plamut plamut deleted the iss-372 branch April 15, 2021 14:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api: pubsub Issues related to the googleapis/python-pubsub API. cla: yes This human has signed the Contributor License Agreement.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

The test tests.system.test_streaming_pull_blocking_shutdown is flaky

2 participants