Conversation

@Baranowski
Contributor

Fixes #40131
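
For context, the failure class that #40131 describes can be triggered by pooling inputs that contain NaN or -inf. A minimal repro sketch (shapes and values are illustrative, not taken from the issue):

import torch
import torch.nn.functional as F

# Illustrative input of the kind #40131 reports: a CPU tensor containing NaN and -inf.
x = torch.randn(1, 16, 32, 32)
x[0, 0, 0, 0] = float('nan')
x[0, 0, 1, 1] = float('-inf')

# Before this fix, the CPU max-pool variants could crash or misbehave on such inputs;
# afterwards the calls are expected to succeed, with NaN propagating to the affected outputs.
print(F.max_pool2d(x, kernel_size=3, stride=2)[0, 0, 0, 0])
print(F.adaptive_max_pool2d(x, output_size=(16, 16))[0, 0, 0, 0])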

@dr-ci

dr-ci bot commented Jun 27, 2020

💊 CI failures summary and remediations

As of commit ffd89d8 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚

@Baranowski force-pushed the wbaranowski-max_pool_nan-40131 branch 2 times, most recently from 1d51752 to 53dfb79 on July 2, 2020 at 07:41
@Baranowski marked this pull request as ready for review on July 4, 2020 at 16:45
@Baranowski requested a review from xwang233 on July 4, 2020 at 16:46
@Baranowski
Contributor Author

@xwang233, I didn't know who would be appropriate, so I have marked you as a reviewer.

I will create the XLA issue after the first review.

Collaborator

@xwang233 left a comment

Thanks for the fix! Overall it looks good. Only the default max index needs to be changed.

@xwang233
Collaborator

xwang233 commented Jul 4, 2020

I'm not familiar with the XLA test. If the XLA failure is real, we can temporarily disable the test on XLA and leave a TODO there. It seems like @onlyOnCPUAndCUDA does the job.
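
For reference, a minimal sketch of such gating (the test class, test name, and values are hypothetical; the decorator is assumed to live in torch.testing._internal.common_device_type, as it did around this time):

import torch
from torch.testing._internal.common_utils import TestCase, run_tests
from torch.testing._internal.common_device_type import (
    instantiate_device_type_tests, onlyOnCPUAndCUDA)

class TestPoolingNaN(TestCase):
    # TODO: re-enable on XLA once NaN handling is fixed there
    @onlyOnCPUAndCUDA
    def test_max_pool_nan(self, device):
        x = torch.full((1, 1, 4, 4), float('nan'), device=device)
        out = torch.nn.functional.max_pool2d(x, kernel_size=2)
        self.assertTrue(torch.isnan(out).all())

instantiate_device_type_tests(TestPoolingNaN, globals())

if __name__ == '__main__':
    run_tests()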

@Baranowski
Contributor Author

Good call, @xwang233. Fixed.

@Baranowski
Contributor Author

Thanks for the quick and patient reviews, @xwang233. I won't have time to work on this after Sunday next week (July 19th), so it would be brilliant to get this PR merged before then.

@xwang233
Collaborator

xwang233 commented Jul 9, 2020

@Baranowski Thanks. I don't work at FB, so I can't merge this for you.

@xwang233 requested a review from ezyang on July 9, 2020 at 18:15
@ezyang
Contributor

ezyang commented Jul 9, 2020

It would be good to get a performance comparison before and after here.

Contributor

@facebook-github-bot left a comment

@ezyang is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@Baranowski
Contributor Author

I have added benchmarks in the most recent commit. I don't see significant differences.

Master:

(pytorch-cuda-dev) wbaranowski@qgpu2:~/git/Quansight/vanilla/benchmarks/operator_benchmark$ python -m pt.pool_test --tag all
# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,1]_stride[2,1]_N1_C16_H32_W32_cpu
# Input: kernel: [3, 1], stride: [2, 1], N: 1, C: 16, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 43.277

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,2]_stride[2,2]_N8_C32_H32_W32_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 8, C: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 74.007

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,2]_stride[2,2]_N8_C32_H32_W64_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 8, C: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 103.433

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,2]_stride[2,2]_N8_C32_H64_W32_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 8, C: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 101.649

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,2]_stride[2,2]_N8_C32_H64_W64_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 8, C: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 158.317

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,2]_stride[2,2]_N16_C32_H32_W32_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 16, C: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 77.478

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,2]_stride[2,2]_N16_C32_H32_W64_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 16, C: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 116.714

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,2]_stride[2,2]_N16_C32_H64_W32_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 16, C: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 116.661

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,2]_stride[2,2]_N16_C32_H64_W64_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 16, C: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 158.889

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,3]_stride[2,2]_N8_C32_H32_W32_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 8, C: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 83.278

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,3]_stride[2,2]_N8_C32_H32_W64_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 8, C: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 126.931

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,3]_stride[2,2]_N8_C32_H64_W32_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 8, C: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 113.920

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,3]_stride[2,2]_N8_C32_H64_W64_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 8, C: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 182.647

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,3]_stride[2,2]_N16_C32_H32_W32_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 16, C: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 98.402

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,3]_stride[2,2]_N16_C32_H32_W64_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 16, C: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 111.891

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,3]_stride[2,2]_N16_C32_H64_W32_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 16, C: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 117.598

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,3]_stride[2,2]_N16_C32_H64_W64_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 16, C: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 241.957

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,1]_stride[2,1]_N1_C16_H32_W32_cpu
# Input: kernel: [3, 1], stride: [2, 1], N: 1, C: 16, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 39.386

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,2]_stride[2,2]_N8_C32_H32_W32_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 8, C: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 54.870

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,2]_stride[2,2]_N8_C32_H32_W64_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 8, C: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 54.467

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,2]_stride[2,2]_N8_C32_H64_W32_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 8, C: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 54.567

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,2]_stride[2,2]_N8_C32_H64_W64_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 8, C: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 54.727

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,2]_stride[2,2]_N16_C32_H32_W32_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 16, C: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 64.878

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,2]_stride[2,2]_N16_C32_H32_W64_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 16, C: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 63.164

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,2]_stride[2,2]_N16_C32_H64_W32_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 16, C: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 63.417

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,2]_stride[2,2]_N16_C32_H64_W64_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 16, C: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 62.819

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,3]_stride[2,2]_N8_C32_H32_W32_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 8, C: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 55.215

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,3]_stride[2,2]_N8_C32_H32_W64_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 8, C: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 54.947

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,3]_stride[2,2]_N8_C32_H64_W32_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 8, C: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 55.018

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,3]_stride[2,2]_N8_C32_H64_W64_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 8, C: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 58.489

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,3]_stride[2,2]_N16_C32_H32_W32_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 16, C: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 67.708

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,3]_stride[2,2]_N16_C32_H32_W64_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 16, C: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 64.613

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,3]_stride[2,2]_N16_C32_H64_W32_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 16, C: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 64.880

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,3]_stride[2,2]_N16_C32_H64_W64_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 16, C: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 65.123

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,1,3]_stride[2,1,2]_N1_C16_D16_H32_W32_cpu
# Input: kernel: [3, 1, 3], stride: [2, 1, 2], N: 1, C: 16, D: 16, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 324.005

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N8_C32_D32_H32_W32_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 2255.940

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N8_C32_D32_H32_W64_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 4465.090

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N8_C32_D32_H64_W32_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 4151.426

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N8_C32_D32_H64_W64_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 10130.852

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N16_C32_D32_H32_W32_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 2403.098

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N16_C32_D32_H32_W64_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 7265.930

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N16_C32_D32_H64_W32_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 6345.975

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N16_C32_D32_H64_W64_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 13842.293

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N8_C32_D32_H32_W32_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 1804.096

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N8_C32_D32_H32_W64_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 4145.099

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N8_C32_D32_H64_W32_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 3979.806

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N8_C32_D32_H64_W64_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 10293.466

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N16_C32_D32_H32_W32_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 2501.046

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N16_C32_D32_H32_W64_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 7369.280

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N16_C32_D32_H64_W32_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 6874.013

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N16_C32_D32_H64_W64_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 14725.793

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,1,3]_stride[2,1,2]_N1_C16_D16_H32_W32_cpu
# Input: kernel: [3, 1, 3], stride: [2, 1, 2], N: 1, C: 16, D: 16, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 39.616

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N8_C32_D32_H32_W32_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 78.150

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N8_C32_D32_H32_W64_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 74.670

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N8_C32_D32_H64_W32_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 71.217

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N8_C32_D32_H64_W64_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 77.309

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N16_C32_D32_H32_W32_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 88.279

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N16_C32_D32_H32_W64_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 85.710

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N16_C32_D32_H64_W32_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 87.579

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N16_C32_D32_H64_W64_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 139.081

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N8_C32_D32_H32_W32_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 74.413

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N8_C32_D32_H32_W64_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 83.784

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N8_C32_D32_H64_W32_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 77.414

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N8_C32_D32_H64_W64_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 100.437

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N16_C32_D32_H32_W32_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 90.312

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N16_C32_D32_H32_W64_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 94.083

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N16_C32_D32_H64_W32_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 93.086

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N16_C32_D32_H64_W64_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 145.713

(pytorch-cuda-dev) wbaranowski@qgpu2:~/git/Quansight/vanilla/benchmarks/operator_benchmark$

Modified version:

(pytorch-cuda-dev) wbaranowski@qgpu2:~/git/Quansight/pytorch/benchmarks/operator_benchmark$ python -m pt.pool_test --tag all
# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,1]_stride[2,1]_N1_C16_H32_W32_cpu
# Input: kernel: [3, 1], stride: [2, 1], N: 1, C: 16, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 42.339

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,2]_stride[2,2]_N8_C32_H32_W32_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 8, C: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 74.987

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,2]_stride[2,2]_N8_C32_H32_W64_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 8, C: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 103.705

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,2]_stride[2,2]_N8_C32_H64_W32_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 8, C: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 102.937

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,2]_stride[2,2]_N8_C32_H64_W64_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 8, C: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 178.586

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,2]_stride[2,2]_N16_C32_H32_W32_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 16, C: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 94.048

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,2]_stride[2,2]_N16_C32_H32_W64_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 16, C: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 106.972

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,2]_stride[2,2]_N16_C32_H64_W32_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 16, C: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 112.960

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,2]_stride[2,2]_N16_C32_H64_W64_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 16, C: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 159.002

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,3]_stride[2,2]_N8_C32_H32_W32_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 8, C: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 83.364

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,3]_stride[2,2]_N8_C32_H32_W64_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 8, C: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 110.988

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,3]_stride[2,2]_N8_C32_H64_W32_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 8, C: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 115.071

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,3]_stride[2,2]_N8_C32_H64_W64_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 8, C: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 169.171

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,3]_stride[2,2]_N16_C32_H32_W32_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 16, C: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 83.448

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,3]_stride[2,2]_N16_C32_H32_W64_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 16, C: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 112.358

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,3]_stride[2,2]_N16_C32_H64_W32_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 16, C: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 114.631

# Benchmarking PyTorch: AdaptiveMaxPool2d
# Mode: Eager
# Name: AdaptiveMaxPool2d_kernel[3,3]_stride[2,2]_N16_C32_H64_W64_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 16, C: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 168.563

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,1]_stride[2,1]_N1_C16_H32_W32_cpu
# Input: kernel: [3, 1], stride: [2, 1], N: 1, C: 16, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 38.105

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,2]_stride[2,2]_N8_C32_H32_W32_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 8, C: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 53.678

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,2]_stride[2,2]_N8_C32_H32_W64_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 8, C: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 58.790

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,2]_stride[2,2]_N8_C32_H64_W32_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 8, C: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 52.641

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,2]_stride[2,2]_N8_C32_H64_W64_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 8, C: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 54.202

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,2]_stride[2,2]_N16_C32_H32_W32_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 16, C: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 64.891

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,2]_stride[2,2]_N16_C32_H32_W64_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 16, C: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 61.271

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,2]_stride[2,2]_N16_C32_H64_W32_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 16, C: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 62.074

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,2]_stride[2,2]_N16_C32_H64_W64_cpu
# Input: kernel: [3, 2], stride: [2, 2], N: 16, C: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 62.022

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,3]_stride[2,2]_N8_C32_H32_W32_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 8, C: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 55.512

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,3]_stride[2,2]_N8_C32_H32_W64_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 8, C: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 56.149

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,3]_stride[2,2]_N8_C32_H64_W32_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 8, C: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 59.689

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,3]_stride[2,2]_N8_C32_H64_W64_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 8, C: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 59.210

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,3]_stride[2,2]_N16_C32_H32_W32_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 16, C: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 69.061

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,3]_stride[2,2]_N16_C32_H32_W64_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 16, C: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 67.673

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,3]_stride[2,2]_N16_C32_H64_W32_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 16, C: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 67.671

# Benchmarking PyTorch: FractionalMaxPool2d
# Mode: Eager
# Name: FractionalMaxPool2d_kernel[3,3]_stride[2,2]_N16_C32_H64_W64_cpu
# Input: kernel: [3, 3], stride: [2, 2], N: 16, C: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 67.627

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,1,3]_stride[2,1,2]_N1_C16_D16_H32_W32_cpu
# Input: kernel: [3, 1, 3], stride: [2, 1, 2], N: 1, C: 16, D: 16, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 338.750

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N8_C32_D32_H32_W32_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 1580.897

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N8_C32_D32_H32_W64_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 4679.794

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N8_C32_D32_H64_W32_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 3772.246

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N8_C32_D32_H64_W64_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 9960.207

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N16_C32_D32_H32_W32_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 2765.616

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N16_C32_D32_H32_W64_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 6874.791

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N16_C32_D32_H64_W32_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 6816.528

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N16_C32_D32_H64_W64_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 14006.464

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N8_C32_D32_H32_W32_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 1693.150

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N8_C32_D32_H32_W64_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 4250.506

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N8_C32_D32_H64_W32_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 4019.097

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N8_C32_D32_H64_W64_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 9947.090

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N16_C32_D32_H32_W32_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 2356.209

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N16_C32_D32_H32_W64_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 7095.732

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N16_C32_D32_H64_W32_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 6791.053

# Benchmarking PyTorch: AdaptiveMaxPool3d
# Mode: Eager
# Name: AdaptiveMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N16_C32_D32_H64_W64_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 14499.389

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,1,3]_stride[2,1,2]_N1_C16_D16_H32_W32_cpu
# Input: kernel: [3, 1, 3], stride: [2, 1, 2], N: 1, C: 16, D: 16, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 40.421

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N8_C32_D32_H32_W32_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 73.324

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N8_C32_D32_H32_W64_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 76.495

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N8_C32_D32_H64_W32_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 75.037

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N8_C32_D32_H64_W64_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 76.399

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N16_C32_D32_H32_W32_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 104.319

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N16_C32_D32_H32_W64_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 98.106

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N16_C32_D32_H64_W32_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 96.371

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,2,3]_stride[2,2,2]_N16_C32_D32_H64_W64_cpu
# Input: kernel: [3, 2, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 96.063

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N8_C32_D32_H32_W32_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 80.156

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N8_C32_D32_H32_W64_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 81.476

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N8_C32_D32_H64_W32_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 81.561

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N8_C32_D32_H64_W64_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 8, C: 32, D: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 81.320

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N16_C32_D32_H32_W32_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 32, W: 32, device: cpu
Forward Execution Time (us) : 100.050

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N16_C32_D32_H32_W64_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 32, W: 64, device: cpu
Forward Execution Time (us) : 96.316

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N16_C32_D32_H64_W32_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 64, W: 32, device: cpu
Forward Execution Time (us) : 93.940

# Benchmarking PyTorch: FractionalMaxPool3d
# Mode: Eager
# Name: FractionalMaxPool3d_kernel[3,3,3]_stride[2,2,2]_N16_C32_D32_H64_W64_cpu
# Input: kernel: [3, 3, 3], stride: [2, 2, 2], N: 16, C: 32, D: 32, H: 64, W: 64, device: cpu
Forward Execution Time (us) : 94.695

(pytorch-cuda-dev) wbaranowski@qgpu2:~/git/Quansight/pytorch/benchmarks/operator_benchmark$
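
If a side-by-side comparison is wanted, a small helper like the one below can pair each '# Name:' line with its 'Forward Execution Time' in the two logs and print the relative change (the file names master.log and branch.log are assumptions; save each run's output first):

import re

def parse(path):
    # Map benchmark name -> forward execution time (us) from an operator_benchmark log.
    times, name = {}, None
    with open(path) as f:
        for line in f:
            if line.startswith('# Name:'):
                name = line.split(':', 1)[1].strip()
            m = re.search(r'Forward Execution Time \(us\) : ([\d.]+)', line)
            if m and name:
                times[name] = float(m.group(1))
    return times

before, after = parse('master.log'), parse('branch.log')
for name in sorted(before.keys() & after.keys()):
    delta = (after[name] - before[name]) / before[name] * 100
    print(f'{name}: {before[name]:.1f} -> {after[name]:.1f} us ({delta:+.1f}%)')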

@Baranowski force-pushed the wbaranowski-max_pool_nan-40131 branch from fe71925 to ffd89d8 on July 11, 2020 at 08:21
Contributor

@facebook-github-bot left a comment

@ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@ezyang merged this pull request in 20f3051.

Successfully merging this pull request may close these issues:

CPU {adaptive,fractional, } max_pool {1,2,3}d will crash with nan and -inf values (#40131)