This document summarizes key aspects of deep neural networks, including:
1. The number of parameters in a neural network is determined by the number and size of its layers: a fully connected layer with n inputs and m outputs contributes n × m weights plus m biases (see the first sketch after this list).
2. Rectified linear units (ReLU), defined as f(x) = max(0, x), are used instead of sigmoid to mitigate vanishing gradients: ReLU passes positive inputs through unchanged and sets negative inputs to 0, so its gradient does not saturate for positive activations the way sigmoid's does at the tails (see the second sketch below).
3. Weight initialization schemes such as Xavier (Glorot) and He initialization help training by drawing initial weights from a distribution whose variance is scaled to the layer's fan-in (and, for Xavier, fan-out), keeping activation magnitudes roughly stable across layers (see the third sketch below).
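A minimal sketch of point 1, counting parameters for a stack of fully connected layers; the layer widths here are illustrative, not from the source:

```python
# Each fully connected layer with n_in inputs and n_out outputs
# contributes n_in * n_out weights plus n_out biases.
layer_sizes = [784, 256, 128, 10]  # hypothetical layer widths

total = sum(n_in * n_out + n_out
            for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))
print(total)  # 784*256 + 256 + 256*128 + 128 + 128*10 + 10 = 235146
```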
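A sketch of point 2 using NumPy: ReLU and its gradient, contrasted with sigmoid, whose gradient peaks at 0.25 and decays toward 0 for large |x| (the sample inputs are just illustrative):

```python
import numpy as np

def relu(x):
    # Keeps positive inputs unchanged, zeroes out the rest.
    return np.maximum(0.0, x)

def relu_grad(x):
    # Gradient is 1 for positive inputs, 0 otherwise,
    # so it does not vanish for active units.
    return (x > 0).astype(x.dtype)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # shrinks toward 0 as |x| grows

x = np.array([-5.0, -1.0, 0.5, 5.0])
print(relu(x), relu_grad(x))  # [0. 0. 0.5 5.] [0. 0. 1. 1.]
print(sigmoid_grad(x))        # near-zero at both tails
```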
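A sketch of point 3, again with NumPy: Xavier (Glorot) initialization scales weight variance by both fan-in and fan-out, while He initialization scales by fan-in alone, which suits ReLU layers; the layer shape is an assumption for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def xavier_init(fan_in, fan_out):
    # Glorot/Xavier: variance = 2 / (fan_in + fan_out).
    std = np.sqrt(2.0 / (fan_in + fan_out))
    return rng.normal(0.0, std, size=(fan_in, fan_out))

def he_init(fan_in, fan_out):
    # He: variance = 2 / fan_in, tuned for ReLU activations.
    std = np.sqrt(2.0 / fan_in)
    return rng.normal(0.0, std, size=(fan_in, fan_out))

W = he_init(256, 128)  # hypothetical 256-to-128 layer
print(W.std())         # close to sqrt(2/256) ≈ 0.088
```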