
Timestamp loss of precision in insert_rows #392

@mhfrantz

Description

Certain values of datetime lose a single microsecond on their way into BigQuery. This appears to happen because of the JSON rendering of timestamps as a floating point number of seconds.
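For context on why a float number of seconds is marginal here, a quick sketch (the arithmetic below is standard IEEE double behavior, not library code):

import datetime

ts = datetime.datetime(2020, 11, 17, 1, 6, 52, 353795, tzinfo=datetime.timezone.utc)

# Exact integer microseconds since the epoch, with no floats involved.
epoch = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)
print((ts - epoch) // datetime.timedelta(microseconds=1))  # 1605575212353795

# Adjacent doubles near 1.6e9 seconds are 2**(30 - 52) seconds apart,
# roughly a quarter of a microsecond, so microsecond precision sits right
# at the edge of what a float number of seconds can carry through rounding.
print(2.0 ** (30 - 52))  # 2.384185791015625e-07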

Environment details

  • OS type and version: Ubuntu Bionic
  • Python version: 3.6.9
  • pip version: 9.0.1
  • google-cloud-bigquery version: 1.28.0

Steps to reproduce

Here is an example of a datetime value that loses a microsecond:

datetime.datetime(2020, 11, 17, 1, 6, 52, 353795, tzinfo=datetime.timezone(datetime.timedelta(0), 'UTC'))

As a full integration repro, you could try something like this:

  1. Create a BigQuery table with a TIMESTAMP column.
  2. Use insert_rows to insert the value above (a sketch follows this list).
  3. Inspect the results in the Console, where it will show as 2020-11-17 01:06:52.353794 UTC instead of 2020-11-17 01:06:52.353795 UTC.
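
For reference, here is a minimal sketch of that integration repro; the project, dataset, and table names are placeholders, and it assumes default application credentials:

import datetime
from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical fully qualified table name; substitute your own project and dataset.
table_id = "my-project.my_dataset.timestamp_repro"
table = client.create_table(
    bigquery.Table(table_id, schema=[bigquery.SchemaField("ts", "TIMESTAMP")])
)

value = datetime.datetime(2020, 11, 17, 1, 6, 52, 353795, tzinfo=datetime.timezone.utc)

# Streaming insert of a single row; an empty error list means success.
errors = client.insert_rows(table, [{"ts": value}])
assert not errors, errors

# Inspecting the table afterwards (e.g. in the Console) shows
# 2020-11-17 01:06:52.353794 UTC, one microsecond short of the inserted value.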

Code example

A simpler repro hints at the root cause. Looking at the output of _timestamp_to_json_row, the problematic values land just below the intended microsecond, so they get rounded down when reduced to microsecond precision.

import datetime
from google.cloud import bigquery as bq

# Serialize the value from above the same way insert_rows does.
ts = datetime.datetime(2020, 11, 17, 1, 6, 52, 353795, tzinfo=datetime.timezone(datetime.timedelta(0), 'UTC'))
print(bq._helpers._timestamp_to_json_row(ts))

This will print:

1605575212.3537948

Perhaps the solution is to add half a microsecond (0.5e-6) so that truncation at microsecond precision ends up rounding to the nearest microsecond.
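
A rough sketch of that idea using the value from this issue (the 0.5e-6 bias would presumably be applied inside _timestamp_to_json_row; the prints below just compare the serialized floats):

micros = 1605575212353795  # 2020-11-17 01:06:52.353795 UTC as integer microseconds

# What gets emitted today: the float lands just below the intended value,
# so truncating at microsecond precision reads back as ...52.353794.
print(repr(micros * 1e-6))  # 1605575212.3537948

# With the proposed half-microsecond bias the float lands just above the
# intended value, so truncation stays at ...52.353795.
print(repr(micros * 1e-6 + 0.5e-6))  # roughly 1605575212.3537953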

Labels

  • api: bigquery (Issues related to the googleapis/python-bigquery API.)
  • priority: p1 (Important issue which blocks shipping the next release. Will be fixed prior to next release.)
  • type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)
