Profiler: Do not record zero duration kernel events #41540

mwootton · 2020-07-16T17:04:27Z

Changes in the ROCm runtime have improved hipEventRecord. Events no longer take ~4 usec to execute on the GPU stream; instead they appear instantaneous. If you record two events with no other activity in between, they will have the same timestamp and the elapsed duration will be 0.
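
To make the timing change concrete, here is a minimal toy model of a stream clock. It is not the HIP or CUDA API: `SimulatedStream`, `record_cost_us`, and `elapsed_us` are hypothetical names invented for this illustration, and the 4 usec figure is only the approximate value quoted above.

```python
from dataclasses import dataclass


@dataclass
class SimulatedStream:
    """Toy model of a GPU stream clock (hypothetical, not a real runtime API)."""

    record_cost_us: float  # time one event-record marker occupies on the stream
    now_us: float = 0.0

    def record_event(self) -> float:
        """Return the event timestamp, then advance the clock by the marker cost."""
        timestamp = self.now_us
        self.now_us += self.record_cost_us
        return timestamp

    def run_kernel(self, duration_us: float) -> None:
        """Advance the clock by the time a kernel spends on the stream."""
        self.now_us += duration_us


def elapsed_us(stream: SimulatedStream, kernel_us: float = 0.0) -> float:
    """Elapsed time between a start/stop event pair wrapped around optional work."""
    start = stream.record_event()
    stream.run_kernel(kernel_us)
    stop = stream.record_event()
    return stop - start


# Assumed costs: ~4 usec per marker before the runtime change, 0 after it.
legacy_idle = elapsed_us(SimulatedStream(record_cost_us=4.0))
current_idle = elapsed_us(SimulatedStream(record_cost_us=0.0))
current_busy = elapsed_us(SimulatedStream(record_cost_us=0.0), kernel_us=25.0)
```

In this model an empty event pair measures 4.0 usec with the older marker cost and exactly 0.0 with the newer one, while a pair that wraps real work still measures the work itself.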

The profiler uses HIP/CUDA event pairs to infer GPU execution times. It wraps every function, whether or not it sends work to the GPU. Functions that send no GPU work show up with zero duration, and they also appear to run at the same time as neighboring functions. On a trace, all those functions combine into a 'call stack' that can be tens of functions tall, when in fact they should appear sequentially.
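
The stacking effect can be sketched with plain data. This is an illustrative model only: the `(name, start_us, stop_us)` tuple layout, the function name `max_stack_height`, and the entry names are assumptions made for the example, not the profiler's actual data structures.

```python
from collections import Counter


def max_stack_height(intervals):
    """Largest number of zero-width GPU entries that share a single timestamp.

    `intervals` is a list of (name, start_us, stop_us) tuples, a hypothetical
    stand-in for the GPU intervals inferred from event pairs.
    """
    zero_width_starts = Counter(
        start for _name, start, stop in intervals if stop == start
    )
    return max(zero_width_starts.values(), default=0)


# Three wrapped functions that sent no GPU work, followed by one that did.
trace = [
    ("cpu_only_fn_1", 100.0, 100.0),
    ("cpu_only_fn_2", 100.0, 100.0),
    ("cpu_only_fn_3", 100.0, 100.0),
    ("gpu_fn", 100.0, 150.0),
]

stack_height = max_stack_height(trace)
```

Here the three zero-width entries all land on timestamp 100.0, so a trace viewer would draw them on top of each other rather than one after another.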

This patch suppresses recording of the zero-duration 'kernel' events, leaving only the CPU execution part. Functions that do not use the GPU therefore get no entry for GPU time, which seems reasonable. This fixes the 'stacking' on traces. It also improves the signal-to-noise ratio of the GPU trace beyond what was available previously.
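
The idea of the filter can be sketched as follows. This is not the code in the patch: `ProfiledCall`, its fields, and `trace_entries` are hypothetical names used only to show the behavior described above, where the CPU entry is always kept and the 'kernel' entry is dropped when its duration is zero.

```python
from typing import List, NamedTuple, Tuple


class ProfiledCall(NamedTuple):
    """Hypothetical record for one wrapped function call."""

    name: str
    cpu_us: float        # CPU-side execution time
    gpu_start_us: float  # timestamp of the start event
    gpu_stop_us: float   # timestamp of the stop event


def trace_entries(call: ProfiledCall) -> List[Tuple[str, str, float]]:
    """Build (kind, name, duration_us) entries, skipping zero-duration kernels."""
    entries = [("cpu", call.name, call.cpu_us)]
    gpu_us = call.gpu_stop_us - call.gpu_start_us
    if gpu_us != 0.0:  # assumption: only exactly-zero durations are suppressed
        entries.append(("kernel", call.name, gpu_us))
    return entries


cpu_only = trace_entries(ProfiledCall("cpu_only_fn", 3.0, 100.0, 100.0))
with_gpu = trace_entries(ProfiledCall("gpu_fn", 5.0, 100.0, 150.0))
```

With this rule a call that sends no GPU work yields a single CPU entry, and a call that does send work yields both a CPU entry and a kernel entry.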

This patch will not affect CUDA or legacy ROCm, as those are not able to 'execute' eventRecord markers instantaneously.

mwootton · 2020-07-16T22:17:18Z

@sunway513
@jeffdaily

jeffdaily · 2020-07-16T22:46:22Z

@apaszke, @albanD, as owners of the file we're touching, we'd appreciate a review. The change is quite small.

albanD

Looks good. Thanks for the PR!

facebook-github-bot

@albanD has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot · 2020-07-17T19:38:32Z

@albanD merged this pull request in 7eb71b4.

Profiler: Do not record zero duration kernel events

3577be6

mwootton requested review from albanD and apaszke as code owners July 16, 2020 17:04

pytorchbot added the open source label Jul 16, 2020

smessmer added the oncall: profiler and triaged labels Jul 16, 2020

jeffdaily added the module: rocm label Jul 16, 2020

albanD approved these changes Jul 17, 2020

facebook-github-bot reviewed Jul 17, 2020

facebook-github-bot closed this in 7eb71b4 Jul 17, 2020

facebook-github-bot added the merged label Jul 17, 2020

mruberry added the Merged label Oct 28, 2020

7 participants