@apaszke
Contributor

apaszke commented Sep 19, 2017

A rebase + minor fixes on top of #2137.

cc @ruotianluo

@albanD
Collaborator

albanD commented Sep 20, 2017

I have a rather general question:
What is the call chain that ends up calling these THVector_(*) functions?

@fmassa
Member

fmassa commented Sep 20, 2017

I'm also curious: did you benchmark those loop unrollings to see whether they bring noticeable speed benefits?

@apaszke
Contributor Author

apaszke commented Sep 20, 2017

The original PR contains some benchmarks, and I double-checked sigmoid to ensure they still hold. Sigmoid on a 10000 x 10000 tensor takes 700 ms before this PR and 40 ms after.

@apaszke
Contributor Author

apaszke commented Sep 20, 2017

The speedup doesn't necessarily come from the loop unrolling (which certainly doesn't hurt). It might just be the fact that we now have more than a single thread iterating over the tensor.
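
For readers who want to see the two effects side by side, here is a minimal sketch of a pointwise kernel over a contiguous buffer that combines an OpenMP `parallel for` with a 4-way manually unrolled loop body. It is not the code from this PR: the function name `cube_contiguous`, the unroll factor of 4, and the use of cubing as a stand-in pointwise operation are all assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical helper (not part of TH): writes in[i]^3 to out[i] for n
 * contiguous floats. The main loop is unrolled 4 ways and, when the
 * compiler has OpenMP enabled, split across threads. */
static void cube_contiguous(float *out, const float *in, ptrdiff_t n)
{
    ptrdiff_t i;
    ptrdiff_t n4 = n - (n % 4); /* largest multiple of 4 not exceeding n */

#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (i = 0; i < n4; i += 4) {
        /* Four independent element updates per iteration. */
        out[i]     = in[i]     * in[i]     * in[i];
        out[i + 1] = in[i + 1] * in[i + 1] * in[i + 1];
        out[i + 2] = in[i + 2] * in[i + 2] * in[i + 2];
        out[i + 3] = in[i + 3] * in[i + 3] * in[i + 3];
    }

    /* Serial tail for the remaining n % 4 elements. */
    for (i = n4; i < n; i++) {
        out[i] = in[i] * in[i] * in[i];
    }
}
```

With GCC or Clang the pragma only takes effect when compiling with `-fopenmp`; without that flag the same source builds as a single-threaded unrolled loop, which gives one way to time the two contributions separately.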

@apaszke apaszke merged commit 16a3de0 into master Sep 20, 2017
@apaszke apaszke deleted the omp_pointwise branch September 20, 2017 15:23
}
else if(value == 3){
TH_TENSOR_APPLY2(real, t, real, r_, *r__data = *t_data * *t_data * *t_data;);
TH_TENSOR_APPLY2(real, r_, real, t, *r__data = *t_data * *t_data * *t_data;);

This comment was marked as off-topic.


6 participants