Commit 72c7a27
feat(storage): Enable full object checksum PR 1/3 : parse finalize_time and server crc32c in async object stream (#17261)
### 1. Overview of the Solution
This solution implements end-to-end full-object checksum validation in
`AsyncMultiRangeDownloader` for the asynchronous Google Cloud Storage
Python client library. As asynchronous multiplexed downloads of
non-contiguous ranges are performed concurrently over a single
bidirectional gRPC connection, this feature automatically and
incrementally calculates a rolling checksum as bytes arrive and
validates it against the server's authoritative object checksum once the
download completes.
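The rolling-checksum idea described above can be sketched as follows. This is a minimal illustration, not the library's implementation: it uses a pure-Python, table-driven CRC32C so that it stays self-contained, and the names `crc32c_update`, `download_and_verify`, and `ChecksumMismatch` are hypothetical (the real client raises its own `DataCorruption` error). It also assumes the chunks arrive in byte order, which is what a full-object read requires.

```python
def _build_crc32c_table():
    """Precompute the 256-entry lookup table for CRC32C (Castagnoli)."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            # 0x82F63B78 is the bit-reflected Castagnoli polynomial.
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
        table.append(crc)
    return table


_CRC32C_TABLE = _build_crc32c_table()


def crc32c_update(crc, chunk):
    """Fold `chunk` into a running CRC32C value (start from 0)."""
    crc ^= 0xFFFFFFFF
    for byte in chunk:
        crc = _CRC32C_TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


class ChecksumMismatch(Exception):
    """Hypothetical stand-in for the library's corruption error."""


def download_and_verify(chunks, server_crc32c):
    """Consume chunks in arrival order, then compare with the server value."""
    rolling = 0
    received = bytearray()
    for chunk in chunks:
        received.extend(chunk)
        # Update the rolling value incrementally as each chunk arrives.
        rolling = crc32c_update(rolling, chunk)
    if rolling != server_crc32c:
        raise ChecksumMismatch(
            f"expected {server_crc32c:#010x}, got {rolling:#010x}"
        )
    return bytes(received)
```

Because the running value is carried from chunk to chunk, the result is the same as hashing the whole object at once, so no second pass over the data is needed once the download completes.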
The technical approach consists of three coordinated layers:
* **`_AsyncReadObjectStream` (Stream Ingestion)**: Safely extracts the
authoritative server checksum (`full_obj_server_crc32c`) and
finalization status (`is_finalized`) from the object metadata received
in the first data payload response of the stream.
* **`_ReadResumptionStrategy` & `_DownloadState` (Verification Logic)**:
Computes an isolated, persistent rolling checksum in the individual
`_DownloadState` object to ensure calculations do not bleed across
concurrent multiplexed ranges. The rolling hash is updated only *after*
a buffer write succeeds, so a retry reconnect cannot corrupt the
checksum state. If the final value does not match the server checksum, a
`DataCorruption` exception is raised on completion.
* **`AsyncMultiRangeDownloader` (Orchestration & Cleanup)**: Detects
candidate full-object ranges (e.g., `(0, 0)` or `(0, persisted_size)`),
propagates checksum settings to the resumption strategy, and guarantees
robust cleanup (closing the stream immediately and unregistering IDs) if
data corruption or write errors occur.
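The ingestion and orchestration decisions in the list above might look roughly like the sketch below. Everything here is an assumption made for illustration: `FakeObjectMetadata` is a hypothetical stand-in for the metadata carried on the first response, the helper names are invented, and the range tuple is assumed to mean `(offset, length)` with a length of `0` meaning "read to the end of the object".

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional, Tuple


@dataclass
class FakeObjectMetadata:
    """Hypothetical stand-in for the object metadata on the first response."""

    finalize_time: Optional[datetime] = None
    crc32c: Optional[int] = None


def parse_checksum_info(
    metadata: Optional[FakeObjectMetadata],
) -> Tuple[bool, Optional[int]]:
    """Return (is_finalized, server_crc32c) from the first response."""
    if metadata is None:
        # No metadata on this response: nothing to validate against.
        return False, None
    is_finalized = metadata.finalize_time is not None
    # Assumption: an unfinalized object has no authoritative checksum yet,
    # so the server value is skipped rather than trusted.
    server_crc32c = metadata.crc32c if is_finalized else None
    return is_finalized, server_crc32c


def is_full_object_range(offset: int, length: int, persisted_size: int) -> bool:
    """True for ranges that cover the whole object, e.g. (0, 0)."""
    return offset == 0 and length in (0, persisted_size)


finalized = FakeObjectMetadata(
    finalize_time=datetime(2024, 1, 1, tzinfo=timezone.utc), crc32c=0xE3069283
)
print(parse_checksum_info(finalized))
print(is_full_object_range(0, 0, 1024))
```

Under these assumptions, full-object validation would only be attempted when both checks pass: the object is finalized with a server checksum, and the requested range covers the whole object.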
### 2. What This PR Specifically Does
This PR implements **Step 1: Stream Metadata Ingestion** of the
solution:
* Modifies `_AsyncReadObjectStream` to safely parse GCS object metadata
from the first data payload of the response.
* Populates `is_finalized`, `full_obj_server_crc32c`, and
`object_metadata` attributes in `_AsyncReadObjectStream.open()`.
* Adds an autouse pytest event loop fixture in `tests/unit/conftest.py`
to resolve compatibility issues with `pytest-asyncio` under Python
3.11+.
* Adds unit tests in `test_async_read_object_stream.py` to verify that
finalization status and server-authoritative checksums are correctly
extracted or skipped for unfinalized objects.
1 parent d01a4ba commit 72c7a27
3 files changed
Lines changed: 85 additions & 1 deletion
File tree
- packages/google-cloud-storage
- google/cloud/storage/asyncio
- tests/unit
- asyncio
Lines changed: 15 additions & 0 deletions
Added lines 82-84 and 138-149 (new-file numbering).
Lines changed: 38 additions & 1 deletion
Line 41 modified; added lines 44-45, 135-136, and 388-420 (new-file numbering).
Added lines 1-32.