
Commit 2f34442

update perf numbers
1 parent 4dd9ecf commit 2f34442

File tree

1 file changed (+18 −18 lines)


PyTorch/SpeechSynthesis/Tacotron2/README.md

Lines changed: 18 additions & 18 deletions
@@ -72,7 +72,7 @@ and encapsulates some dependencies. Aside from these dependencies, ensure you
 have the following components:
 
 * [NVIDIA Docker](https://github.com/NVIDIA/nvidia-docker)
-* [PyTorch 18.12.1-py3 NGC container](https://ngc.nvidia.com/registry/nvidia-pytorch) or newer
+* [PyTorch 19.05-py3 NGC container](https://ngc.nvidia.com/registry/nvidia-pytorch) or newer
 * [NVIDIA Volta based GPU](https://www.nvidia.com/en-us/data-center/volta-gpu-architecture/)
 
 
@@ -379,7 +379,7 @@ and accuracy in training and inference.
 
 ## Training accuracy results
 Our results were obtained by running the `./platform/train_{tacotron2,waveglow}_{FP16,FP32}_DGX1_16GB_8GPU.sh`
-training script in the PyTorch-18.12.1-py3 NGC container on NVIDIA DGX-1 with 8x V100 16G GPUs.
+training script in the PyTorch-19.05-py3 NGC container on NVIDIA DGX-1 with 8x V100 16G GPUs.
 
 All of the results were produced using the `train.py` as described in the
 [Training process](#training-process) section of this document.
@@ -402,7 +402,7 @@ WaveGlow FP32 loss - batch size 4 (mean and std over 16 runs)
 
 ## Training performance results
 Our results were obtained by running the `./platform/train_{tacotron2,waveglow}_{FP16,FP32}_DGX1_16GB_8GPU.sh`
-training script in the PyTorch-18.12.1-py3 NGC container on NVIDIA DGX-1 with
+training script in the PyTorch-19.05-py3 NGC container on NVIDIA DGX-1 with
 8x V100 16G GPUs. Performance numbers (in input tokens per second for
 Tacotron 2 and output samples per second for WaveGlow) were averaged over
 an entire training epoch.
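
Throughput figures of this kind are typically obtained by summing the tokens (or samples) processed in each step over an epoch and dividing by the elapsed wall-clock time. A minimal sketch, assuming batches that carry per-utterance input lengths (the `batches` iterable and its fields are illustrative, not the repository's actual measurement code):

```python
import time

def epoch_throughput(batches):
    """Average input tokens/sec over one epoch (illustrative sketch only)."""
    total_tokens = 0
    start = time.perf_counter()
    for text_padded, input_lengths in batches:
        # Each batch contributes its number of real (non-padding) input tokens.
        total_tokens += int(input_lengths.sum())
        # ... forward/backward/optimizer step would run here ...
    return total_tokens / (time.perf_counter() - start)
```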
@@ -412,17 +412,17 @@ for mixed precision and FP32 training, respectively.
 
 |Number of GPUs|Mixed precision tokens/sec|FP32 tokens/sec|Speed-up with mixed precision|Multi-gpu weak scaling with mixed precision|Multi-gpu weak scaling with FP32|
 |---:|---:|---:|---:|---:|---:|
-|**1**|2,424|1,826|1.33|1.00|1.00|
-|**4**|7,280|5,944|1.22|3.00|3.26|
-|**8**|12,742|10,843|1.18|5.26|5.94|
+|**1**|2,554|1,740|1.47|1.00|1.00|
+|**4**|7,768|5,683|1.37|3.04|3.27|
+|**8**|12,524|10,484|1.19|4.90|6.03|
 
 The following table shows the results for WaveGlow, with batch size equal 4 and 8 for mixed precision and FP32 training, respectively.
 
 |Number of GPUs|Mixed precision samples/sec|FP32 samples/sec|Speed-up with mixed precision|Multi-gpu weak scaling with mixed precision|Multi-gpu weak scaling with FP32|
 |---:|---:|---:|---:|---:|---:|
-|**1**| 70,362 | 35,180 | 2.00 | 1.00 | 1.00 |
-|**4**| 215,380 | 118,961 | 1.81 | 3.06 | 3.38 |
-|**8**| 500,375 | 257,687 | 1.94 | 7.11 | 7.32 |
+|**1**| 76,686 | 36,602 | 2.10 | 1.00 | 1.00 |
+|**4**| 260,826 | 124,514 | 2.09 | 3.40 | 3.40 |
+|**8**| 566,471 | 264,138 | 2.14 | 7.39 | 7.22 |
 
 To achieve these same results, follow the [Quick Start Guide](#quick-start-guide) outlined above.
 
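
The derived columns in both tables follow directly from the raw throughputs: speed-up is mixed-precision throughput divided by FP32 throughput, and weak scaling is N-GPU throughput divided by single-GPU throughput at the same precision. A quick check against the updated Tacotron 2 numbers (variable names are illustrative):

```python
# Updated Tacotron 2 training throughput (tokens/sec) from the table above.
mixed = {1: 2554, 4: 7768, 8: 12524}
fp32 = {1: 1740, 4: 5683, 8: 10484}

for n in (1, 4, 8):
    speedup = mixed[n] / fp32[n]         # e.g. 2554 / 1740 ≈ 1.47
    scaling_mixed = mixed[n] / mixed[1]  # e.g. 12524 / 2554 ≈ 4.90
    scaling_fp32 = fp32[n] / fp32[1]     # e.g. 10484 / 1740 ≈ 6.03
    print(f"{n} GPUs: speed-up {speedup:.2f}, weak scaling "
          f"{scaling_mixed:.2f} (mixed) / {scaling_fp32:.2f} (FP32)")
```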
@@ -432,21 +432,21 @@ This table shows the expected training time for convergence for Tacotron 2 (1500
 
 |Number of GPUs|Expected training time with mixed precision|Expected training time with FP32|Speed-up with mixed precision|
 |---:|---:|---:|---:|
-|**1**| 208.00 | 288.03 | 1.38 |
-|**4**| 67.53 | 84.20 | 1.25 |
-|**8**| 33.14 | 44.00 | 1.33 |
+|**1**| 197.39 | 302.32 | 1.53 |
+|**4**| 63.29 | 88.07 | 1.39 |
+|**8**| 33.72 | 45.51 | 1.35 |
 
 This table shows the expected training time for convergence for WaveGlow (1000 epochs).
 
 |Number of GPUs|Expected training time with mixed precision|Expected training time with FP32|Speed-up with mixed precision|
 |---:|---:|---:|---:|
-|**1**| 437.03 | 814.30 | 1.86 |
-|**4**| 108.26 | 223.04 | 2.06 |
-|**8**| 54.83 | 109.96 | 2.01 |
+|**1**| 400.99 | 782.67 | 1.95 |
+|**4**| 89.40 | 213.09 | 2.38 |
+|**8**| 48.43 | 107.27 | 2.21 |
 
 ## Inference performance results
 Our results were obtained by running the `./inference.py` inference script in the
-PyTorch-18.12.1-py3 NGC container on NVIDIA DGX-1 with 8x V100 16G GPUs.
+PyTorch-19.05-py3 NGC container on NVIDIA DGX-1 with 8x V100 16G GPUs.
 Performance numbers (in input tokens per second for Tacotron 2 and output
 samples per second for WaveGlow) were averaged over 16 runs.
 
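
For the training-time tables the ratio is inverted: speed-up is the FP32 training time divided by the mixed-precision training time. A quick consistency check against the updated values (numbers copied from the tables above):

```python
# Expected training time (mixed precision, FP32) per the updated tables above.
tacotron2 = {1: (197.39, 302.32), 4: (63.29, 88.07), 8: (33.72, 45.51)}
waveglow = {1: (400.99, 782.67), 4: (89.40, 213.09), 8: (48.43, 107.27)}

for name, table in (("Tacotron 2", tacotron2), ("WaveGlow", waveglow)):
    for n, (mixed_t, fp32_t) in table.items():
        # Speed-up from training time is time_fp32 / time_mixed.
        print(f"{name}, {n} GPUs: {fp32_t / mixed_t:.2f}x")
# Prints 1.53/1.39/1.35 for Tacotron 2 and 1.95/2.38/2.21 for WaveGlow.
```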
@@ -456,15 +456,15 @@ Results are measured in the number of input tokens per second.
 
 |Number of GPUs|Mixed precision tokens/sec|FP32 tokens/sec|Speed-up with mixed precision|
 |---:|---:|---:|---:|
-|**1**|170|178|0.96|
+|**1**|130|150|0.87|
 
 
 This table shows the inference performance results for WaveGlow.
 Results are measured in the number of output audio samples per second.<sup>1</sup>
 
 |Number of GPUs|Mixed precision samples/sec|FP32 samples/sec|Speed-up with mixed precision|
 |---:|---:|---:|---:|
-|**1**|537525|404206|1.33|
+|**1**|435110|400097|1.09|
 
 <sup>1</sup>With sampling rate equal to 22050, one second of audio is generated from 22050 samples.
 
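
Since one second of audio corresponds to 22,050 samples at this sampling rate, the WaveGlow samples/sec figures convert directly into a real-time factor. A small worked example using the updated numbers (the helper function is illustrative, not part of the repository):

```python
SAMPLE_RATE = 22050  # Hz, as stated in the footnote above

def real_time_factor(samples_per_sec: float, sample_rate: int = SAMPLE_RATE) -> float:
    """Seconds of audio generated per second of wall-clock time."""
    return samples_per_sec / sample_rate

print(real_time_factor(435110))  # ≈ 19.7x faster than real time (mixed precision)
print(real_time_factor(400097))  # ≈ 18.1x faster than real time (FP32)
```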
