BigTable: On read_row(), provide default to retrieve only most recent cell values #4468

The current signature on the Bigtable `Table` [`read_row()` method][2] has a default of `filter_=None`:

```python
def read_row(self, row_key, filter_=None):
    ...
```

When a cell value has been updated multiple times, this default returns the full time series, with a timestamp for each value, which can slow down reads in a non-obvious way.
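Until the default changes, a caller can request only the newest cell per column by passing a filter explicitly. The following is a minimal sketch, assuming `table` is an existing `Table` instance; `read_latest_row` is a hypothetical helper name, not part of the library.

```python
def read_latest_row(table, row_key, num_versions=1):
    """Read one row, keeping at most ``num_versions`` cells per column.

    ``read_latest_row`` is a hypothetical helper, not a library function.
    """
    # Imported here so the sketch loads even where the client library
    # is not installed.
    from google.cloud.bigtable.row_filters import CellsColumnLimitFilter

    # Keep only the newest ``num_versions`` cells in each column, so the
    # row does not carry the full time series.
    latest_only = CellsColumnLimitFilter(num_versions)
    return table.read_row(row_key, filter_=latest_only)
```

Callers that do need the history can still call `table.read_row(row_key)` directly, or pass a larger `num_versions`.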

In the current Python API, the [`cells` property][1] on `row_data` (`PartialRowData`) makes a deep copy of the cells on every access, which compounds the performance issue.

```python
@property
def cells(self):
    """Property returning all the cells accumulated on this partial row.

    :rtype: dict
    :returns: Dictionary of the :class:`Cell` objects accumulated. This
              dictionary has two-levels of keys (first for column families
              and second for column names/qualifiers within a family). For
              a given column, a list of :class:`Cell` objects is stored.
    """
    return copy.deepcopy(self._cells)
```

Consider:

1. Applying a default filter in `read_row()` that retrieves only the most recent value of each cell, unless the full or a partial time series is explicitly requested.
1. Allowing a `ColumnFamily` to implicitly or explicitly limit cells to a single value (no time series).
1. Adding a `cell_value(column_family_id, column, index=0)` method to `row_data` (`PartialRowData`) to allow more efficient retrieval of a single cell value.
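To illustrate the third suggestion, here is a rough sketch of how `cell_value()` could avoid the deep copy. `Cell` and `RowDataSketch` are simplified stand-ins written for this example, not the library classes, and the sketch assumes the cells for a column are stored newest first.

```python
import copy


class Cell:
    """Stand-in for the library's Cell: a value plus its timestamp."""

    def __init__(self, value, timestamp):
        self.value = value
        self.timestamp = timestamp


class RowDataSketch:
    """Hypothetical stand-in for PartialRowData, not the library class."""

    def __init__(self, cells):
        # Layout: {column_family_id: {column: [Cell, ...]}}
        self._cells = cells

    @property
    def cells(self):
        # Mirrors the existing property: deep-copies every cell on each access.
        return copy.deepcopy(self._cells)

    def cell_value(self, column_family_id, column, index=0):
        # Proposed accessor: index straight into the stored cells, no copy.
        return self._cells[column_family_id][column][index].value


row = RowDataSketch({"cf1": {b"greeting": [Cell(b"new", 2), Cell(b"old", 1)]}})
latest = row.cell_value("cf1", b"greeting")             # newest value
previous = row.cell_value("cf1", b"greeting", index=1)  # one version back
```

Because the method returns only the stored value, the cost of a lookup no longer depends on how many cells the row holds.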

[1]: https://github.com/GoogleCloudPlatform/google-cloud-python/blob/bigtable-0.28.1/bigtable/google/cloud/bigtable/row_data.py#L153-L163
[2]: https://github.com/GoogleCloudPlatform/google-cloud-python/blob/bigtable-0.28.1/bigtable/google/cloud/bigtable/table.py#L246
