Conversation

@tswast
Contributor

@tswast tswast commented Jan 29, 2019

In response to internal feedback (Bug 123578325), add a sample (which also acts as a system test) that shows how to populate the total_rows value.

@tswast tswast requested a review from shollyman January 29, 2019 19:15
@tswast tswast requested a review from crwilcox as a code owner January 29, 2019 19:15
@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Jan 29, 2019
results = query_job.result() # Waits for query to complete.
next(iter(results)) # Fetch the first page of results, which contains total_rows.
print("Got {} rows.".format(results.total_rows))
# [START bigquery_query_total_rows]
Contributor

END?



def test_client_query_total_rows(client, capsys):
"""Run a query an just check for how many rows."""
Contributor

s/an/and/

@tseaver tseaver added api: bigquery Issues related to the BigQuery API. type: docs Improvement to the documentation for an API. labels Jan 29, 2019
@tswast tswast merged commit 51c97cd into googleapis:master Jan 30, 2019
@tswast tswast deleted the b123578325-total-rows branch January 30, 2019 00:42
@yan-hic

yan-hic commented Jan 30, 2019

Wouldn't fetching the first page of result() make an API call? If so, why not read the value as described in #6117? The query job's _query_results does get updated by result().

@tswast
Contributor Author

tswast commented Jan 30, 2019

@yiga2 Calling result() makes an API request, but not to tabledata.list, where we get the total_rows from. As you point out in #6117, the request to wait for the job to complete actually contains a total_rows value as well, but we don't pass that through to the RowIterator.

I agree that the approach shown in this sample is a bit awkward. I was thinking about fetching the first page automatically in #4152, but I changed my mind about prefetching, since it could mean an unnecessary extra API request if someone only cares that a query completes and not about the actual result rows.

Perhaps we could find a way to pass the total_rows through to the RowIterator after it is constructed in result() to avoid the extra API request.
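
To make the behavior discussed here concrete, below is a minimal, self-contained sketch. It does not use the real google-cloud-bigquery classes: `FakeBackend` and `LazyRowIterator` are hypothetical stand-ins, and the request labels are only illustrative names. Under those assumptions, it shows total_rows staying unset until the first page is fetched, and the extra request that the fetch costs.

```python
class FakeBackend:
    """Hypothetical stand-in for the API; records each request made."""

    def __init__(self, rows):
        self.rows = rows
        self.requests = []

    def wait_for_job(self):
        # Stand-in for the request that waits for the query to complete.
        # Its response also carries a row count, which is not passed along.
        self.requests.append("job-wait")
        return {"totalRows": len(self.rows)}

    def list_rows(self):
        # Stand-in for the tabledata.list request that returns a page of rows.
        self.requests.append("tabledata.list")
        return {"totalRows": len(self.rows), "rows": list(self.rows)}


class LazyRowIterator:
    """Hypothetical iterator: total_rows is unknown until a page is fetched."""

    def __init__(self, backend):
        self._backend = backend
        self.total_rows = None

    def __iter__(self):
        page = self._backend.list_rows()
        self.total_rows = page["totalRows"]
        return iter(page["rows"])


backend = FakeBackend(rows=["a", "b", "c"])
backend.wait_for_job()           # analogous to waiting in query_job.result()
results = LazyRowIterator(backend)
before = results.total_rows      # no page fetched yet, so the count is unset
next(iter(results))              # fetching the first page populates total_rows
after = results.total_rows
print(before, after, backend.requests)
```

In this sketch the count only becomes available after the second request, which is the cost the prefetching idea would impose on callers who never read any rows.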

@yan-hic

yan-hic commented Mar 15, 2019

@tswast any further consideration on getting total_rows without an additional API call?

@tswast
Contributor Author

tswast commented Mar 15, 2019

@yiga2 I still think it's a good idea. I just haven't gotten around to it. We're open to PRs. The change would likely be to the result method of QueryJob in job.py.

@yan-hic

yan-hic commented Mar 18, 2019

@tswast Feel free to bundle this with other enhancements, as the PR would be very light otherwise.

Suggested code change: after

schema = self._query_results.schema

add

total_rows = self._query_results.total_rows
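
A rough sketch of how the suggestion above might look, using hypothetical stand-in classes rather than the actual code in job.py. The `total_rows` keyword argument on the iterator is an assumption made for illustration; the real RowIterator constructor may not accept it.

```python
class SketchQueryResults:
    """Hypothetical stand-in for the cached response of the job-wait request."""

    def __init__(self, schema, total_rows):
        self.schema = schema
        self.total_rows = total_rows


class SketchRowIterator:
    """Hypothetical stand-in iterator that can be seeded with total_rows."""

    def __init__(self, schema, total_rows=None):
        self.schema = schema
        self.total_rows = total_rows


class SketchQueryJob:
    """Hypothetical stand-in; only the part of result() under discussion."""

    def __init__(self, query_results):
        self._query_results = query_results

    def result(self):
        schema = self._query_results.schema
        # Suggested addition: reuse the count already returned while waiting.
        total_rows = self._query_results.total_rows
        return SketchRowIterator(schema, total_rows=total_rows)


job = SketchQueryJob(SketchQueryResults(schema=["name"], total_rows=42))
rows = job.result()
print(rows.total_rows)  # set at construction, before any page is fetched
```

Seeding the iterator at construction time means the count is readable immediately, with no page fetch required.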

@yan-hic

yan-hic commented Apr 3, 2019

@tswast any feedback on this?

@tswast
Contributor Author

tswast commented Apr 3, 2019

@yiga2 I've prepared #7622 which addresses this issue, but in a slightly more complicated way than we propose here because I wanted to also handle more cases where Client.list_rows is called directly. My PR has had one pass at review, but I'm waiting on a follow-up now that I've addressed the requested changes.

@yan-hic

yan-hic commented Apr 3, 2019

Cool, thanks Tim!
