Commit c8dc24d

Author: Taylor Robie
Update on "[Profiler] Memory profiler part 5: Data flow graph"

The semantic meaning of a Tensor is tightly coupled to its lineage. The data flow graph allows us to identify temporary Tensors, masks, inputs, activations, and more. However, one important nuance is that Tensors must be versioned: operations that mutate their inputs can also change the semantic meaning of those inputs.

It is challenging to assemble a complete picture of the data flow in a PyTorch model because ops can, and often do, recursively call into other ops. For the purpose of memory profiling this is an implementation detail, so instead we traverse the op tree to identify top-level ops and allocations and then coalesce their children, folding inputs and outputs into the top-level Node.

Differential Revision: [D40220391](https://our.internmc.facebook.com/intern/diff/D40220391/)

[ghstack-poisoned]
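
The versioning and coalescing described in the commit message can be illustrated with a small, self-contained sketch. Everything here is hypothetical: `Op`, `Node`, `TensorKey`, and `build_graph` are names invented for illustration and are not taken from the profiler's source. The sketch assumes each op records which tensor ids it reads, mutates in place, and freshly produces.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Set, Tuple

# Hypothetical types for illustration only; not the profiler's real classes.
TensorKey = Tuple[int, int]  # (tensor id, version)


@dataclass
class Op:
    name: str
    inputs: List[int] = field(default_factory=list)    # tensor ids read
    mutated: List[int] = field(default_factory=list)   # tensor ids written in place
    outputs: List[int] = field(default_factory=list)   # tensor ids freshly produced
    children: List["Op"] = field(default_factory=list)  # ops called by this op


@dataclass
class Node:
    name: str
    inputs: Set[TensorKey]
    outputs: Set[TensorKey]


def flatten(op: Op) -> Iterator[Op]:
    """Yield an op followed by all of its descendants, depth first."""
    yield op
    for child in op.children:
        yield from flatten(child)


def build_graph(top_level_ops: List[Op]) -> List[Node]:
    """Fold every op tree into one Node per top-level op."""
    versions: Dict[int, int] = {}  # current version of each tensor id
    nodes: List[Node] = []
    for top in top_level_ops:
        node_inputs: Set[TensorKey] = set()
        node_outputs: Set[TensorKey] = set()
        for op in flatten(top):
            for t in op.inputs + op.mutated:
                key = (t, versions.setdefault(t, 0))
                # Values produced earlier inside this node are internal.
                if key not in node_outputs:
                    node_inputs.add(key)
            for t in op.mutated:
                # An in-place write creates a new version of the tensor.
                versions[t] += 1
                node_outputs.add((t, versions[t]))
            for t in op.outputs:
                node_outputs.add((t, versions.setdefault(t, 0)))
        nodes.append(Node(top.name, node_inputs, node_outputs))
    return nodes
```

In this sketch, an in-place child op nested under a wrapper shows up only as an input at the old version and an output at the new version of the wrapper's Node, which is the "implementation detail" folding the message describes.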
2 parents 970fd46 + b18d09d commit c8dc24d

272 files changed (+16790, -7433 lines)

.circleci/docker/build.sh

Lines changed: 3 additions & 5 deletions

@@ -75,13 +75,12 @@ elif [[ "$image" == *rocm* ]]; then
   DOCKERFILE="${OS}-rocm/Dockerfile"
 fi
 
-if [[ "$image" == *bionic* ]]; then
-  CMAKE_VERSION=3.13.5
-fi
+# CMake 3.18 is needed to support CUDA17 language variant
+CMAKE_VERSION=3.18.5
 
 TRAVIS_DL_URL_PREFIX="https://s3.amazonaws.com/travis-python-archives/binaries/ubuntu/14.04/x86_64"
 _UCX_COMMIT=31e74cac7bee0ef66bef2af72e7d86d9c282e5ab
-_UCC_COMMIT=12944da33f911daf505d9bbc51411233d0ed85e1
+_UCC_COMMIT=1c7a7127186e7836f73aafbd7697bbc274a77eee
 
 # It's annoying to rename jobs every time you want to rewrite a
 # configuration, so we hardcode everything here rather than do it
@@ -209,7 +208,6 @@ case "$image" in
     ;;
   pytorch-linux-focal-py3.7-gcc7)
     ANACONDA_PYTHON_VERSION=3.7
-    CMAKE_VERSION=3.16.9 # Required for precompiled header support
     GCC_VERSION=7
     PROTOBUF=yes
     DB=yes

.circleci/scripts/build_android_gradle.sh

Lines changed: 1 addition & 1 deletion

@@ -24,7 +24,7 @@ export GRADLE_LOCAL_PROPERTIES=~/workspace/android/local.properties
 rm -f $GRADLE_LOCAL_PROPERTIES
 echo "sdk.dir=/opt/android/sdk" >> $GRADLE_LOCAL_PROPERTIES
 echo "ndk.dir=/opt/ndk" >> $GRADLE_LOCAL_PROPERTIES
-echo "cmake.dir=/usr" >> $GRADLE_LOCAL_PROPERTIES
+echo "cmake.dir=/usr/local" >> $GRADLE_LOCAL_PROPERTIES
 
 retry () {
   $* || (sleep 1 && $*) || (sleep 2 && $*) || (sleep 4 && $*) || (sleep 8 && $*)
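
The `retry` helper visible in the context lines above runs a command up to five times, sleeping 1, 2, 4, and 8 seconds between attempts. A rough Python equivalent of that pattern is sketched below. The function name, the `delays` and `sleep` parameters, and the choice to retry on any exception (rather than on a non-zero exit status, as the shell version does) are assumptions made for illustration.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def retry(
    fn: Callable[[], T],
    delays: Sequence[float] = (1, 2, 4, 8),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying after each delay; re-raise if the last attempt fails."""
    attempts = len(delays) + 1  # one initial try plus one retry per delay
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the final error
            sleep(delays[attempt])
    raise AssertionError("unreachable")  # loop always returns or raises
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting.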

.github/actions/filter-test-configs/action.yml

Lines changed: 2 additions & 1 deletion

@@ -53,7 +53,8 @@ runs:
           --test-matrix "${{ inputs.test-matrix }}" \
           --pr-number "${{ github.event.pull_request.number }}" \
           --tag "${{ steps.parse-ref.outputs.tag }}" \
-          --event-name "${{ github.event_name }}"
+          --event-name "${{ github.event_name }}" \
+          --schedule "${{ github.event.schedule }}"
 
     - name: Print the filtered test matrix
       shell: bash

.github/ci_commit_pins/vision.txt

Lines changed: 1 addition & 1 deletion

@@ -1 +1 @@
-5b4f79d9ba8cbeeb8d6f0fbba3ba5757b718888b
+bfb474b9d3ffffec5c3a040c16bc77006f35a94e

.github/ci_commit_pins/xla.txt

Lines changed: 1 addition & 1 deletion

@@ -1 +1 @@
-dd9b67ff0d6ba4da6a46ca1b22e35c98dbed0d77
+216d221f4d75ddfe9d0bd3ff2e8b92b39c67d381

.github/requirements/README.md

Lines changed: 2 additions & 0 deletions

@@ -17,6 +17,8 @@ The list of support files are as follows:
     test jobs to setup the conda environment
   * conda-env-macOS-X64. This is use by MacOS (x86-64) build and test
     jobs to setup the conda environment
+  * conda-env-Linux-X64. This is used by Linux buck build and test jobs
+    to setup the conda environment
 * Pip:
   * pip-requirements-macOS.txt. This is used by MacOS build and test jobs to
     setup the pip environment

Lines changed: 10 additions & 0 deletions

@@ -0,0 +1,10 @@
+cffi=1.15.1
+cmake=3.22.1
+mkl=2022.1.0
+mkl-include=2022.1.0
+ninja=1.10.2
+numpy=1.23.3
+pyyaml=6.0
+requests=2.28.1
+setuptools=65.5.0
+typing_extensions=4.3.0

.github/scripts/filter_test_configs.py

Lines changed: 4 additions & 1 deletion

@@ -50,6 +50,7 @@ def parse_args() -> Any:
     parser.add_argument("--pr-number", type=str, help="the pull request number")
     parser.add_argument("--tag", type=str, help="the associated tag if it exists")
     parser.add_argument("--event-name", type=str, help="name of the event that triggered the job (pull, schedule, etc)")
+    parser.add_argument("--schedule", type=str, help="cron schedule that triggered the job")
     return parser.parse_args()
 
 
@@ -188,7 +189,9 @@ def main() -> None:
         # No PR number, no tag, we can just return the test matrix as it is
         filtered_test_matrix = test_matrix
 
-    if args.event_name == "schedule":
+    if args.event_name == "schedule" and args.schedule == '29 8 * * *':
+        # we don't want to run the mem leack check or disabled tests on normal
+        # periodically scheduled jobs, only the ones at this time
         filtered_test_matrix = set_periodic_modes(filtered_test_matrix)
 
     # Set the filtered test matrix as the output
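
The effect of the new `--schedule` argument can be seen in isolation with a minimal sketch. The cron string and the two argument names come from the diff above; `build_parser`, `should_set_periodic_modes`, and `PERIODIC_MODES_CRON` are hypothetical names introduced here, and the real script's other arguments are omitted.

```python
import argparse

# Cron expression copied from the diff; the constant name is hypothetical.
PERIODIC_MODES_CRON = "29 8 * * *"


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with only the two arguments relevant to the gate."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--event-name", type=str,
                        help="name of the event that triggered the job")
    parser.add_argument("--schedule", type=str,
                        help="cron schedule that triggered the job")
    return parser


def should_set_periodic_modes(args: argparse.Namespace) -> bool:
    """True only for scheduled runs fired by the one designated cron entry."""
    return args.event_name == "schedule" and args.schedule == PERIODIC_MODES_CRON


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["--event-name", "schedule", "--schedule", "29 8 * * *"]
    )
    print(should_set_periodic_modes(args))
```

One consequence of matching on the literal cron string is that the workflow's `cron:` entry and this comparison need to be kept in sync; a scheduled run with any other cron expression falls through to the unmodified test matrix.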

.github/scripts/generate_binary_build_matrix.py

Lines changed: 1 addition & 1 deletion

@@ -16,7 +16,7 @@
 CUDA_ARCHES = ["11.6", "11.7"]
 
 
-ROCM_ARCHES = ["5.1.1", "5.2"]
+ROCM_ARCHES = ["5.2", "5.3"]
 
 
 def arch_type(arch_version: str) -> str:

.github/workflows/_buck-build-test.yml

Lines changed: 2 additions & 21 deletions

@@ -21,29 +21,10 @@ jobs:
           distribution: 'temurin'
 
       - name: Setup miniconda
-        uses: conda-incubator/setup-miniconda@v2
+        uses: pytorch/test-infra/.github/actions/setup-miniconda@main
         with:
-          auto-update-conda: true
           python-version: 3.8
-          activate-environment: build
-
-      - name: Install dependencies
-        uses: nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482
-        with:
-          timeout_minutes: 10
-          max_attempts: 5
-          command: |
-            conda install -y \
-              cffi=1.15.1 \
-              cmake=3.22.1 \
-              mkl=2022.1.0 \
-              mkl-include=2022.1.0 \
-              ninja=1.10.2 \
-              numpy=1.23.3 \
-              pyyaml=6.0 \
-              requests=2.28.1 \
-              setuptools=65.5.0 \
-              typing_extensions=4.3.0
+          environment-file: .github/requirements/conda-env-${{ runner.os }}-${{ runner.arch }}
 
       - name: Install Buck
         uses: nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482
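
This hunk replaces an inline `conda install` list with an environment file whose name is derived from `runner.os` and `runner.arch`. The sketch below shows how such a path resolves and how a `name=version` pin file of the kind added in this commit could be read. Both helper names are hypothetical, and the parsing assumes one pin per line with no comments or channel prefixes.

```python
from typing import Dict


def conda_env_file(runner_os: str, runner_arch: str) -> str:
    """Mirror the workflow's path template for a given runner OS and arch."""
    return f".github/requirements/conda-env-{runner_os}-{runner_arch}"


def parse_pins(text: str) -> Dict[str, str]:
    """Parse 'name=version' lines into a dict, skipping blank lines."""
    pins: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, _, version = line.partition("=")
        pins[name] = version
    return pins


if __name__ == "__main__":
    # On a Linux x86-64 runner the template resolves to the file
    # described in the README hunk of this commit.
    print(conda_env_file("Linux", "X64"))
    print(parse_pins("cffi=1.15.1\ncmake=3.22.1\n"))
```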
