Is your feature request related to a problem? Please describe.
google.cloud.pubsub_v1.publisher._batch.thread.Batch enforces max bytes on the sum of PubsubMessage.ByteSize() for each message in the batch, but the PublishRequest created by Batch.client.publish is larger than that sum. As a result, I need to specify a non-default max bytes to guarantee that batches produce valid requests.
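For illustration only (not part of the original report), the gap can be seen by comparing the serialized sizes directly. This sketch assumes the protobuf message classes in google.cloud.pubsub_v1.types used in the diff below; the topic path and payloads are made up:

from google.cloud.pubsub_v1 import types

topic = "projects/example-project/topics/example-topic"  # hypothetical topic path
messages = [types.PubsubMessage(data=b"x" * 100) for _ in range(10)]

# What Batch currently tracks: the sum of the individual message sizes.
sum_of_message_sizes = sum(message.ByteSize() for message in messages)

# What actually goes on the wire: the full PublishRequest, which also carries
# the topic field and a tag/length prefix for each repeated `messages` entry.
request_size = types.PublishRequest(topic=topic, messages=messages).ByteSize()

assert request_size > sum_of_message_sizes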
Describe the solution you'd like
Enforce max bytes on the size of the PublishRequest created by Batch.client.publish:
diff --git a/pubsub/google/cloud/pubsub_v1/publisher/_batch/thread.py b/pubsub/google/cloud/pubsub_v1/publisher/_batch/thread.py
index f187024b7c..67fc04841a 100644
--- a/pubsub/google/cloud/pubsub_v1/publisher/_batch/thread.py
+++ b/pubsub/google/cloud/pubsub_v1/publisher/_batch/thread.py
@@ -76,7 +76,7 @@ class Batch(base.Batch):
         # any writes to them use the "state lock" to remain atomic.
         self._futures = []
         self._messages = []
-        self._size = 0
+        self._size = types.PublishRequest(topic=topic, messages=[]).ByteSize()
         self._status = base.BatchStatus.ACCEPTING_MESSAGES
 
         # If max latency is specified, start a thread to monitor the batch and
@@ -281,7 +281,7 @@
         if not self.will_accept(message):
             return future
 
-        new_size = self._size + message.ByteSize()
+        new_size = self._size + types.PublishRequest(messages=[message]).ByteSize()
         new_count = len(self._messages) + 1
         overflow = (
             new_size > self.settings.max_bytes
Describe alternatives you've considered
I can lower max bytes to leave sufficient overhead for the rest of the request, but it would be better if I didn't need to, and enforcing the limit on the actual request size may also result in fewer batches.
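As a rough sketch of that workaround (not from the report; the 10 MB publish request limit and the headroom value are assumptions for illustration), the publisher can be constructed with a reduced max_bytes:

from google.cloud import pubsub_v1

# Assumed values: Pub/Sub's documented 10 MB publish request limit, minus an
# arbitrary margin for the topic field and per-message framing overhead.
REQUEST_LIMIT = 10 * 1000 * 1000
OVERHEAD_MARGIN = 16 * 1024

batch_settings = pubsub_v1.types.BatchSettings(
    max_bytes=REQUEST_LIMIT - OVERHEAD_MARGIN,
)
publisher = pubsub_v1.PublisherClient(batch_settings=batch_settings)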
Additional context
See also #7107, which suggests adding an option to enforce this setting even when the batch is empty.