Optimize LeakyReLU and PReLU 'forward' functions on the CPU #9206
Conversation
apaszke
left a comment
Wow, that's nice. It looks like the vectorization pass can't deal with the original code but has no issue with the later version: https://godbolt.org/g/j5XJr3 (the result looks similar across many different compiler versions). It might be due to the lack of -ffast-math: one path doesn't use multiplication, the other one does, so the compiler has to be careful.
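For illustration, here is a minimal sketch of the two loop shapes being discussed (hypothetical code, not the actual PR diff; the function names are made up). In the branchy form the positive path skips the multiply, so the vectorizer must prove the transformation is safe; in the rewritten form both paths go through a multiply, which auto-vectorizes readily even without -ffast-math:

```cpp
#include <cstddef>

// Branchy formulation: the positive path returns the input unchanged, so
// only one path multiplies and the compiler is conservative about vectorizing.
void leaky_relu_branchy(const float* in, float* out, std::size_t n, float slope) {
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = (in[i] > 0.f) ? in[i] : in[i] * slope;
  }
}

// Rewritten formulation: both paths go through a multiply, so the loop body
// is a straight-line select that the auto-vectorizer handles easily.
void leaky_relu_select(const float* in, float* out, std::size_t n, float slope) {
  for (std::size_t i = 0; i < n; ++i) {
    const float s = (in[i] > 0.f) ? 1.f : slope;
    out[i] = in[i] * s;
  }
}
```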
facebook-github-bot
left a comment
@ssnl has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary:
This looks like a totally cosmetic change, but for some reason it reduces the runtime by ~50% running in a single CPU thread.
```
import os
os.environ['OMP_NUM_THREADS'] = '1'  # Use one CPU thread
import torch, torch.nn as nn, time

def test_net(net, offset):
    net.eval()
    total = 0
    with torch.no_grad():
        for _ in range(100):
            x = torch.randn(100, 100, 100) + offset
            start_time = time.time()
            y = net(x)
            total += time.time() - start_time
    print(net, total * 10, 'ms')

for offset in [-1, 0, +1]:
    test_net(nn.LeakyReLU(), offset)
    test_net(nn.PReLU(), offset)
```
Closes pytorch/pytorch#9206
Reviewed By: yf225
Differential Revision: D8749491
Pulled By: btgraham
fbshipit-source-id: 3db8049dd151c0ba9ae1dd5c05bcc58bcab97e9a