Skip to content

Issue 614 spring webclient tracing#831

Merged
tylerbenson merged 16 commits intoDataDog:masterfrom
johanvandeweerd:issue-614-spring-webclient-tracing
Jul 23, 2019
Merged

Issue 614 spring webclient tracing#831
tylerbenson merged 16 commits intoDataDog:masterfrom
johanvandeweerd:issue-614-spring-webclient-tracing

Conversation

@ghost
Copy link

@ghost ghost commented May 14, 2019

Enabled tracing for Spring WebClient.

Heavily inspired on Opentracing Spring Web.

TODO:
(-) Rework check to only apply the advice once in DefaultWebClientAdvice.java
(-) Licensing

@ghost ghost mentioned this pull request May 14, 2019
Copy link
Contributor

@tylerbenson tylerbenson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome. Thanks for taking this on. Please see feedback below.
Also, could you add some tests? You can use HttpClientTest as a base (for example), but feel free to add additional tests that might be unique to webflux.

Also, what was your concern about licensing?
Good work!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is not really needed... we added it to okhttp client because we use that internally to report traces to the agent.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check for the header is needed because otherwise the advice over and over again and results in a stackoverflow.
It's not very clean so any suggestions for a more subtle solution are more than welcome.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps you could explain why it's called over and over again? (Maybe in a comment would be good if we can't figure out a better alternative.) I'm afraid I don't really understand the issue.

Copy link

@johanvandeweerd johanvandeweerd Jun 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This advice returns a TracingClientResponseMono which is a decorator around the Mono that is returned from the original ExchangeFunction.exchange().
The TracingClientResponseMono class itself also calls ExchangeFunction.exchange() which triggers this advice again.
If the advice is applied the first time, the x-datadog-trace-id header is added to the client request.
If the advice is applied a second time (checked by looking at the x-datadog-trace-id header on the ClientRequest), it won't decorate the result again otherwise this would lead to a StackOverflowError.

Hope this clarifies the issue. If you have a better solution, please share.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just out of curiosity, under what scenarios would this be called more than once?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

onNext had to be implemented because it's part of the interface of CoreSubscriber.
Could this be called more than once in case of an TEXT_EVENT_STREAM?

@johanvandeweerd
Copy link

@tylerbenson I've added SpringWebfluxHttpClientTest based on ApacheHttpClientTest but the test is failing locally. The first test (basic #method request) is expecting 2 traces but got 3 traces and the span the webclient created doesn't seem to have the same trace id. When I used the webclient instrumentation on one of my projects, it did work.
Any suggestions?

About the licensing: I've copied some code from Opentracing Spring Web but I don't know if all conditions are met (License and copyright notice, State changes) and I'm not that familiar with licensing.

Copy link
Contributor

@tylerbenson tylerbenson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor requests, and some suggestions on how to work with the tests.

Nice work. I think we're almost done here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is no longer used.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: rewStatusCode

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps you could explain why it's called over and over again? (Maybe in a comment would be good if we can't figure out a better alternative.) I'm afraid I don't really understand the issue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, this might be a bit messy, but try adding a setupSpec that disables the netty instrumentation via system property (and maybe retransforms the relevant classes?) and an cleanupSpec that does the opposite (so as to not impact the other tests).

Another idea is to possibly override the clientSpan method, but I'm not positive that would work since it's not abstract.

Other ideas?

Last resort, I guess you could just copy the whole test and I can clean it up later.

@tylerbenson tylerbenson added this to the 0.28.0 milestone May 16, 2019
@labbati labbati modified the milestones: 0.28.0, 0.29.0 May 17, 2019
@tylerbenson
Copy link
Contributor

Do you think you'll be able to make these last few changes, or should I try implementing them?

@johanvandeweerd
Copy link

I've been rather busy the past week. Will pick this up tomorrow.

@tylerbenson
Copy link
Contributor

tylerbenson commented May 25, 2019

@johanvandeweerd @labbati I decided to try my suggestion (about overriding the clientSpan method) on a different test with a similar problem: OkHttp. I made progress and was able to demonstrate my ideas for how to solve it, but was not able to finish. You can see my work here: a2537b5

@tylerbenson tylerbenson removed this from the 0.29.0 milestone May 29, 2019
@tylerbenson
Copy link
Contributor

That work has been merged into master (#860). You can follow what I did for the OkHttpClientTest.
Thanks for working on this!

@ghost ghost self-requested a review as a code owner June 7, 2019 10:51
@johanvandeweerd johanvandeweerd force-pushed the issue-614-spring-webclient-tracing branch from 3275ba7 to 27a0749 Compare June 7, 2019 11:05
@tylerbenson
Copy link
Contributor

I looked into this a bit and there seems to be an issue with the netty client span not being linked to the parent span correctly. I'm not exactly sure where it's breaking, but my guess is in the async reactor call.

I don't know if it will help, but here's an example of how I instrumented rxjava in hystrix: https://github.com/DataDog/dd-trace-java/blob/master/dd-java-agent/instrumentation/hystrix-1.4/src/main/java/datadog/trace/instrumentation/hystrix/HystrixInstrumentation.java#L99

Let us know if there's anything else we can help with.

@tylerbenson
Copy link
Contributor

@johanvandeweerd I think I have a fix for this, but it will depend on #917 and #886. Can you give me push access to your repo and I can finish updating this?

@johanvandeweerd johanvandeweerd force-pushed the issue-614-spring-webclient-tracing branch from 27a0749 to b348f51 Compare July 16, 2019 08:01
@johanvandeweerd
Copy link

@johanvandeweerd I think I have a fix for this, but it will depend on #917 and #886. Can you give me push access to your repo and I can finish updating this?

Done.

And sorry for the lack of effort lately.

@tylerbenson tylerbenson force-pushed the issue-614-spring-webclient-tracing branch from b348f51 to aaba3fc Compare July 22, 2019 18:20
@tylerbenson tylerbenson added this to the 0.31.0 milestone Jul 22, 2019
Copy link
Contributor

@tylerbenson tylerbenson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johanvandeweerd want to take a look at my changes to make sure I didn't miss anything?

I'd also like a second review from @labbati @mar-kolya or @randomanderson, but looks good to me now.

Copy link
Contributor

@randomanderson randomanderson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. The only slightly weird thing is that the advice needs to prevent an infinite loop and I see that was already discussed.

I wasn't sure stylistically about saving Tracer to a field vs using GlobalTracer.get() everywhere but I checked and it's not uncommon.

@tylerbenson tylerbenson merged commit 0a79235 into DataDog:master Jul 23, 2019
@tylerbenson tylerbenson deleted the issue-614-spring-webclient-tracing branch July 23, 2019 14:50
@johanvandeweerd
Copy link

@tylerbenson I've locally build the dd-java-agent and tested in a small demo application. Everything looks good!
Any idea when this will be GA?
Tnx for the effort and work you put into this.

@tylerbenson
Copy link
Contributor

@johanvandeweerd Thanks to you too for getting it started. You did most of the work.

It will be included in the next release, likely this week or next.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants