
Conversation

@colesbury
Member

This PR contains a few files taken from #8919. They're unchanged from the latest version of that PR.

This is part of https://github.com/pytorch/pytorch/pull/8919. It's
separated to make it easier to merge the PR in pieces.

There are a few major changes to DispatchStub:

 - The environment variable ATEN_CPU_CAPABILITY overrides the CPU
   capability detection code (previously ATEN_DISABLE_AVX/AVX2); see the
   sketch below.

 - DispatchStub is defined in the generic native code instead of the
   CPU_CAPABILITY_DEFAULT kernel.

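As a rough illustration of the first change, here is a minimal self-contained sketch of how an override like ATEN_CPU_CAPABILITY can take precedence over hardware detection. The names (CPUCapability, detect_cpu_capability) and the accepted values are assumptions for illustration, not the actual ATen source:

```
// Sketch of an env-var capability override; names and accepted values are
// illustrative assumptions, not the ATen implementation.
#include <cstdio>
#include <cstdlib>
#include <cstring>

enum class CPUCapability { DEFAULT, AVX, AVX2 };

// Real hardware detection (cpuid, etc.) is stubbed out for this sketch.
static CPUCapability detect_cpu_capability() {
  return CPUCapability::AVX2;
}

static CPUCapability compute_cpu_capability() {
  // Consult the override before detection, so a CI script can force a
  // specific kernel path, e.g. ATEN_CPU_CAPABILITY=default ./run_tests.
  if (const char* env = std::getenv("ATEN_CPU_CAPABILITY")) {
    if (std::strcmp(env, "default") == 0) return CPUCapability::DEFAULT;
    if (std::strcmp(env, "avx") == 0) return CPUCapability::AVX;
    if (std::strcmp(env, "avx2") == 0) return CPUCapability::AVX2;
  }
  return detect_cpu_capability();
}

CPUCapability get_cpu_capability() {
  // Computed once and cached; the capability cannot change mid-process.
  static CPUCapability capability = compute_cpu_capability();
  return capability;
}

int main() {
  std::printf("capability = %d\n", static_cast<int>(get_cpu_capability()));
}
```

With this shape, a CI script can exercise every kernel path on a single machine simply by exporting the variable before each test run, which is why the test scripts need updating when the variable name changes.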
@facebook-github-bot
Contributor

@colesbury has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@cpuhrsch
Contributor

If you're changing the capability variable name, please also edit ".jenkins/pytorch/test.sh" accordingly.

zdevito pushed a commit to zdevito/ATen that referenced this pull request Jul 19, 2018
Summary:
This is a few files taken from pytorch/pytorch#8919, unchanged from the latest version of that PR.

```
This is part of pytorch/pytorch#8919. It's
separated to make it easier to merge the PR in pieces.

There are a few major changes to DispatchStub:

 - The environment variable ATEN_CPU_CAPABILITY overrides the CPU
   capability detection code (previously ATEN_DISABLE_AVX/AVX2)

 - DispatchStub is defined in the generic native code instead of the
   CPU_CAPABILITY_DEFAULT kernel.
```
Pull Request resolved: pytorch/pytorch#9579

Differential Revision: D8909000

Pulled By: colesbury

fbshipit-source-id: fdeb606270b06acdab3c01dba97ec9d81584ecc0
colesbury added a commit to colesbury/pytorch that referenced this pull request Jul 20, 2018
facebook-github-bot pushed a commit that referenced this pull request Jul 20, 2018
Summary:
This reverts commit bcf0bf4.

The commit was causing issues for some internal FB projects.
Pull Request resolved: #9614

Reviewed By: Yangqing

Differential Revision: D8929552

Pulled By: colesbury

fbshipit-source-id: ae9026ad8762a4c5de401273694b4c878fc241a6
colesbury added a commit to colesbury/pytorch that referenced this pull request Jul 21, 2018
This is a modification of the strategy from
pytorch#8919 and
pytorch#9579.

Previously, the CPU architecture-specific kernels self-registered with
the DispatchStub. When the kernels are linked as part of a static library,
this requires the --whole-archive flag to be passed to the linker to
ensure that their object files are included. Caffe2 and TensorFlow use
that strategy.

We ran into some issues with --whole-archive blowing up the binary size
of some downstream projects at Facebook. This PR avoids --whole-archive
for CPU kernels. The downside is that the generic code needs to be aware
of whether kernels are compiled with AVX and with AVX2 (via
HAVE_AVX_CPU_DEFINITION and HAVE_AVX2_CPU_DEFINITION).

The CUDA kernels still self-register with DispatchStub because the CPU
library is not aware of whether the CUDA library will be available at
runtime.

There are a few major changes to DispatchStub:

 - The environment variable ATEN_CPU_CAPABILITY overrides the CPU
   capability detection code (previously ATEN_DISABLE_AVX/AVX2).

 - DispatchStub is defined in the generic native code instead of the
   CPU_CAPABILITY_DEFAULT kernel (see the sketch below).
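To make the linking trade-off concrete, here is a minimal self-contained sketch of a stub whose generic code names the CPU kernels directly under HAVE_AVX_CPU_DEFINITION / HAVE_AVX2_CPU_DEFINITION guards, so no --whole-archive is needed, while leaving a runtime hook that a CUDA library could fill in by self-registration. All identifiers (AddStub, add_default, RegisterCudaAdd) are assumptions for illustration, not the actual DispatchStub code:

```
// Illustrative sketch, not the actual ATen DispatchStub.
#include <cstdio>

// In a real build these macros come from the build system, depending on
// which kernel variants were compiled; defined here so the sketch is
// self-contained.
#define HAVE_AVX_CPU_DEFINITION
#define HAVE_AVX2_CPU_DEFINITION

using fn_type = void (*)(int);

// CPU kernels. In ATen each variant lives in its own translation unit
// compiled with -mavx / -mavx2; because the generic code below refers to
// them by name, the linker keeps their object files without --whole-archive.
void add_default(int n) { std::printf("default kernel, n=%d\n", n); }
#ifdef HAVE_AVX_CPU_DEFINITION
void add_avx(int n) { std::printf("avx kernel, n=%d\n", n); }
#endif
#ifdef HAVE_AVX2_CPU_DEFINITION
void add_avx2(int n) { std::printf("avx2 kernel, n=%d\n", n); }
#endif

struct AddStub {
  // Filled in at load time by the CUDA library, if one is linked in; the
  // CPU library never needs to know at link time whether CUDA exists.
  fn_type cuda_impl = nullptr;

  fn_type choose_cpu_impl() const {
    // Runtime capability (detection plus the ATEN_CPU_CAPABILITY override)
    // is sketched earlier; hard-coded flags stand in for it here.
    const bool has_avx2 = true, has_avx = true;
#ifdef HAVE_AVX2_CPU_DEFINITION
    if (has_avx2) return add_avx2;
#endif
#ifdef HAVE_AVX_CPU_DEFINITION
    if (has_avx) return add_avx;
#endif
    return add_default;
  }
};

AddStub add_stub;

// What self-registration looks like (the style the CUDA kernels keep):
// a static initializer in the kernel's object file stores its function
// pointer into the stub when the library is loaded.
struct RegisterCudaAdd {
  RegisterCudaAdd() { /* add_stub.cuda_impl = add_cuda; */ }
} register_cuda_add;

int main() {
  add_stub.choose_cpu_impl()(8);  // prints which CPU kernel was selected
}
```

The design choice shows in the two halves: the CPU side trades a little preprocessor awareness in the generic code for reliable static linking, while the CUDA side keeps self-registration because the CPU library cannot know at link time whether the CUDA library will be available at runtime.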
facebook-github-bot pushed a commit that referenced this pull request Jul 23, 2018
Summary:
This is a modification of the strategy from #8919 and #9579.

Pull Request resolved: #9664

Differential Revision: D8943350

Pulled By: colesbury

fbshipit-source-id: 329229b0ee9ff94fc001b960287814bd734096ef
zdevito pushed a commit to zdevito/ATen that referenced this pull request Jul 23, 2018
jramseyer pushed a commit to jramseyer/pytorch that referenced this pull request Jul 30, 2018
jramseyer pushed a commit to jramseyer/pytorch that referenced this pull request Jul 30, 2018
jramseyer pushed a commit to jramseyer/pytorch that referenced this pull request Jul 30, 2018
goodlux pushed a commit to goodlux/pytorch that referenced this pull request Aug 15, 2018
goodlux pushed a commit to goodlux/pytorch that referenced this pull request Aug 15, 2018
goodlux pushed a commit to goodlux/pytorch that referenced this pull request Aug 15, 2018
