Skip to content

Conversation

@olavloite
Copy link

As discussed in PR #5412 we should move away from custom retry loops and use a generic retry helper. This PR changes the retry loop of read/write transactions from a custom loop to using the Gax RetryHelper.

@olavloite olavloite requested a review from a team as a code owner June 6, 2019 12:03
@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Jun 6, 2019
@olavloite olavloite added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Jun 6, 2019
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Jun 6, 2019
throw e;
}
}
private <T> T runInternal(final TransactionCallable<T> txCallable) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind just briefly explaining what this change is? Hard to see from the diff. Is it mainly the return statement at the bottom?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The web diff view is indeed not very helpful in this case...

In general: Instead of using a while(true) {...} loop for retrying the transaction, the transaction is placed in a Callable that can be passed to the generic retry helper. The retry helper will handle the retries if the transaction aborts.

More specifically, the change is as follows:

  1. Everything inside the while(true) loop is moved into the call() method of the Callable retryCallable.
  2. The session.newTransaction() call is moved from the constructor and the backoff(...) method to inside the call() method to give each attempt a fresh transaction.
  3. The continue statement on line 354 is replaced with a throw statement in order to leave the method exceptionally.
  4. The Callable is passed in to the new method SpannerRetryHelper.runTxWithRetriesOnAborted(Callable) that handles the retries if the transaction is aborted.

while (true) {
checkState(
isValid, "TransactionRunner has been invalidated by a new operation on the session");
SpannerImpl.checkContext(context);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if the context is cancelled now? Are the same error messages thrown for deadline exceeded and cancelled?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the initial version of this PR, the context was ignored. I've added a check for context cancellation to the generic retry helper, and it will now throw the same exceptions as previous retry loops.

@yoshi-automation yoshi-automation added the 🚨 This issue needs some love. label Jun 13, 2019
@codecov
Copy link

codecov bot commented Jun 17, 2019

Codecov Report

Merging #5433 into master will increase coverage by 0.58%.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #5433      +/-   ##
============================================
+ Coverage      50.5%   51.09%   +0.58%     
- Complexity    23903    24475     +572     
============================================
  Files          2259     2271      +12     
  Lines        227870   232042    +4172     
  Branches      24994    25064      +70     
============================================
+ Hits         115088   118552    +3464     
- Misses       104163   104838     +675     
- Partials       8619     8652      +33
Impacted Files Coverage Δ Complexity Δ
...atransfer/v1/stub/GrpcDataTransferServiceStub.java 94.17% <0%> (-1.71%) 28% <0%> (+8%)
.../com/google/cloud/pubsub/v1/MessageDispatcher.java 85.86% <0%> (-0.63%) 23% <0%> (+1%)
...ogle/cloud/dialogflow/v2beta1/stub/AgentsStub.java 7.14% <0%> (-0.55%) 2% <0%> (+1%)
...om/google/cloud/dialogflow/v2/stub/AgentsStub.java 7.14% <0%> (-0.55%) 2% <0%> (+1%)
...loud/kms/v1/stub/GrpcKeyManagementServiceStub.java 97.1% <0%> (-0.47%) 60% <0%> (+31%)
.../datatransfer/v1/stub/DataTransferServiceStub.java 5.12% <0%> (-0.43%) 2% <0%> (+1%)
...om/google/cloud/logging/spi/v2/GrpcLoggingRpc.java 42.72% <0%> (-0.4%) 1% <0%> (ø)
...gle/cloud/kms/v1/KeyManagementServiceSettings.java 9.48% <0%> (-0.24%) 3% <0%> (+1%)
...in/java/com/google/cloud/datastore/BaseEntity.java 91.51% <0%> (ø) 28% <0%> (ø) ⬇️
.../cloud/automl/v1beta1/PredictionServiceClient.java 44.64% <0%> (ø) 11% <0%> (ø) ⬇️
... and 44 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bf84f6f...79803d8. Read the comment docs.

@kolea2 kolea2 added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Jun 18, 2019
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Jun 18, 2019
class SpannerRetryHelper {
private static final RetrySettings txRetrySettings =
RetrySettings.newBuilder()
.setInitialRetryDelay(Duration.ofMillis(1000L))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor - I would take these numbers and make them constants.

@kolea2 kolea2 merged commit d305770 into googleapis:master Jun 20, 2019
olavloite added a commit to olavloite/google-cloud-java that referenced this pull request Sep 15, 2019
A Spanner transaction runner that was given a session with a prepared
read/write transaction would not use that transaction, but instead
always start a new one, causing an extra unnecessary round-trip to the
server. This regression was introduced by commit
d305770 (PR googleapis#5433).
skuruppu pushed a commit that referenced this pull request Sep 17, 2019
A Spanner transaction runner that was given a session with a prepared
read/write transaction would not use that transaction, but instead
always start a new one, causing an extra unnecessary round-trip to the
server. This regression was introduced by commit
d305770 (PR #5433).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla: yes This human has signed the Contributor License Agreement. 🚨 This issue needs some love.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants