Skip to content

Conversation

@durumu
Copy link
Contributor

@durumu durumu commented Jul 6, 2020

Roughly a 22x speedup over the code this replaces when tested on ResNet18 on a devvm using CPU only, using default parameters for HistogramObserver (i.e. 2048 bins). The script I ran to test this is here.

Roughly a 14x speedup when tested using the benchmark from #42138 (also CPU only).

Stack from ghstack:

Differential Revision: D22400755

durumu added a commit that referenced this pull request Jul 6, 2020
@dr-ci
Copy link

dr-ci bot commented Jul 6, 2020

💊 CI failures summary and remediations

As of commit 11ce38b (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions on the GitHub issue tracker or post in the (internal) Dr. CI Users group.

See how this bot performed.

This comment has been revised 17 times.

@vkuzo
Copy link
Contributor

vkuzo commented Jul 6, 2020

nice! In general looks good, can we just add to the test plan:

  • how did we measure performance
  • how did we verify numerical correctness

durumu added a commit that referenced this pull request Jul 7, 2020
norm = norm + _get_norm(delta_begin, delta_end, density, norm_type)
return norm

src_bin = torch.arange(self.bins).numpy()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyTorch doesn't have a NumPy dependency for its functionality (although we do for some tests), and we shouldn't use NumPy functionality in lieu of our own. Uses of NumPy should be restricted to testing and NumPy interop.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback -- I changed my code to get rid of the numpy dependency.

delta_end = src_bin_end - dst_bin_of_end_center
norm = norm + _get_norm(delta_begin, delta_end, density, norm_type)
return norm

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be optimized further by the following approximation:
Quantization error = (StepSize^2/12)Q + sum(P[i](BinCenter[i]-next_start_bin)^2) + sum(Pi]*(BinCenter[i] - end_start_bin)^2).
Q = sum(hist[next_start_bin:next_end_bin])

where the first sum is over the bins less than the start_bin and the second sum is over bins greater than the end bin. In this approximation, we only need to compute two indices: Where do the next_start_bin and next_end_bin map to in terms of the original histogram indices

durumu added a commit that referenced this pull request Jul 24, 2020
durumu added a commit that referenced this pull request Jul 27, 2020
@facebook-github-bot
Copy link
Contributor

@durumu merged this pull request in 7332c21.

@facebook-github-bot facebook-github-bot deleted the gh/durumu/9/head branch August 11, 2020 14:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants