Binomial with vectorized total count #148
Conversation
fritzo left a comment:
Math looks good. Could you update the docstring?
torch/distributions/binomial.py (outdated):

    def sample(self, sample_shape=torch.Size()):
        shape = self._extended_shape(sample_shape) + (self.total_count,)
        total_count = self._get_homogeneous_count()
Gosh, I guess since we're already summing Bernoullis we could draw inhomogeneous samples via:

    def sample(self, sample_shape=torch.Size()):
        with torch.no_grad():
            max_count = self.total_count.max().item()
            shape = self._extended_shape(sample_shape) + (max_count,)
            # Draw max_count Bernoulli trials for every batch element.
            bernoullis = torch.bernoulli(self.probs.unsqueeze(-1).expand(shape))
            if self.total_count.min() != max_count:
                # Zero out trials beyond each element's own total_count.
                arange = torch.arange(max_count, out=self.total_count.new_empty(max_count))
                bernoullis *= (arange < self.total_count.unsqueeze(-1)).type_as(bernoullis)
            return bernoullis.sum(dim=-1)

WDYT?
I was thinking about it. :) It may lead to us creating some large intermediate tensors, but it should work fine! Let me make that change and then we can get @apaszke's opinion.
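For a sense of the intermediate cost, here is a minimal sketch with made-up sizes (not from the PR); the shapes follow the suggestion above:

    import torch

    # Hypothetical sizes: a batch of 10,000 Binomials whose largest
    # total_count is 1,000. The approach above materializes one
    # Bernoulli draw per (batch element, trial) before summing.
    probs = torch.full((10000,), 0.5)
    max_count = 1000
    bernoullis = torch.bernoulli(probs.unsqueeze(-1).expand(10000, max_count))
    print(bernoullis.numel())  # 10,000,000 intermediate draws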
    bernoullis = torch.bernoulli(self.probs.unsqueeze(-1).expand(shape))
    if self.total_count.min() != max_count:
        arange = torch.arange(max_count, out=self.total_count.new_empty(max_count))
        bernoullis *= (arange < self.total_count.unsqueeze(-1)).type_as(bernoullis)
Super neat trick that you came up with, @fritzo. :)
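To see the masking trick in isolation, a standalone sketch (values made up for illustration):

    import torch

    total_count = torch.tensor([2., 4.])       # inhomogeneous counts
    max_count = int(total_count.max().item())  # 4
    arange = torch.arange(max_count)           # tensor([0, 1, 2, 3])
    # Broadcasting (2, 1) against (4,) keeps only each row's own trials.
    mask = arange < total_count.unsqueeze(-1)
    print(mask)
    # tensor([[ True,  True, False, False],
    #         [ True,  True,  True,  True]])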
@fritzo - could you take another look? I added a few tests, and used your suggestion for vectorized sampling over N.
fritzo left a comment:
Looks good. We could use a couple more tests as commented below.
torch/distributions/binomial.py (outdated):

    def sample(self, sample_shape=torch.Size()):
        shape = self._extended_shape(sample_shape) + (self.total_count,)
        max_count = int(self.total_count.max().item())
I'd move this into the no_grad() context just to be safe.
        self.assertEqual(dist.log_prob(self.tensor_sample_1).size(), torch.Size((3, 2)))
        self.assertRaises(ValueError, dist.log_prob, self.tensor_sample_2)

    def test_binomial_shape_vectorized_n(self):
It would also be nice to draw a ton of samples and assert that (sample <= total_count).all(), just to make sure we got the masking correct. Maybe also numerically test the mean of Binomial(total_count=torch.tensor([0,1,2,5,10]), probs=0.5)?
Good idea. We can test it at the boundary itself, i.e. with p=1; that should be sufficient to validate the masking. I did that locally, but I should add it as a test. I'll add the other test for the mean too.
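A minimal sketch of the two suggested checks (sample counts and tolerances are made up, not from the PR):

    import torch
    from torch.distributions import Binomial

    total_count = torch.tensor([0., 1., 2., 5., 10.])

    # Boundary check: with p = 1 every trial succeeds, so each sample
    # must equal its own total_count exactly; any masking bug shows up here.
    bin_p1 = Binomial(total_count, torch.tensor(1.))
    assert (bin_p1.sample() == total_count).all()

    # Mean check: with p = 0.5 the empirical mean over many samples
    # should approach total_count / 2.
    bin_half = Binomial(total_count, torch.tensor(0.5))
    samples = bin_half.sample((100000,))
    assert torch.allclose(samples.mean(dim=0), total_count * 0.5, atol=0.05)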
Looks great! This should also serve as a template for how to support inhomogeneous Multinomial.
    def test_binomial_vectorized_count(self):
        set_rng_seed(0)
        total_count = torch.tensor([[4., 7.], [3., 8.]])
        bin0 = Binomial(total_count, torch.tensor(1.))
With a probs this high, all you need is a single sample 😉
Yup, this is doing an exact match! I am drawing way more samples for bin1 with p=0.5 to make sure that the invariance holds.
cc @1Reinier

Sent upstream to pytorch#6720.
Based on a suggestion from @fritzo prompted by pyro-ppl/pyro#675, this attempts to restrict the requirement of a homogeneous total_count to only the sample and enumerate_support methods of the Binomial distribution, so that we can score samples with a vectorized total_count. Happy to hear suggestions if there is a non-hacky way to implement sampling for a vectorized total_count (numpy has this!).
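A short usage sketch of what this enables (values are illustrative, assuming the API as proposed in this PR):

    import torch
    from torch.distributions import Binomial

    # One distribution object with a different number of trials per element.
    d = Binomial(torch.tensor([10., 20., 30.]), probs=torch.tensor(0.5))

    # log_prob scores each value against its own total_count...
    print(d.log_prob(torch.tensor([5., 10., 15.])))

    # ...and sample() draws per-element counts bounded by each total_count.
    print(d.sample())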