Deep Neural Network
RTSS JUN YOUNG PARK
Reference
◦ Machine Learning with R – Brett Lantz
◦ Course textbook for 'Modern Society and Big Data' (Spring 2017 semester)
◦ Referenced for the data preprocessing / sample analysis procedures
Number of Parameters
From the last presentation …
How many parameters are in this linear model?

[Figure: a 1024×768 test image x is fed through the linear model WX + b and softmax S(Y), producing the one-hot output [0, 1, 0, 0, 0] over 5 classes: "Dog!"]

$$\mathrm{Size}(W) + \mathrm{Size}(b) = \mathrm{Image\_size} \times \mathrm{Classes} + \mathrm{Classes} = 1024 \times 768 \times 5 + 5 = 3{,}932{,}165$$
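As a quick check, here is a short Python sketch of the same count, using the image size and class count from the slide:

```python
# Parameter count for a single-layer linear classifier: y = softmax(Wx + b).
image_size = 1024 * 768   # input dimension after flattening the image
classes = 5               # output dimension

size_w = image_size * classes   # one weight per (input, class) pair
size_b = classes                # one bias per class

print(size_w + size_b)          # 3932165
```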
Go Deep & Wide!

[Figure: X → W1 [784, 256] → W2 [256, 256] → W3 [256, 10] → Y; two 256-unit hidden layers sit between input and output.]

Hidden Layer
◦ Invisible from the input/output. (See the sketch below.)
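A minimal NumPy sketch of this deep-and-wide stack, assuming MNIST-style 784-dimensional inputs and 10 classes as the shapes above suggest (the weight values here are just random placeholders):

```python
import numpy as np

relu = lambda x: np.maximum(x, 0)

# Layer shapes from the slide: [784, 256], [256, 256], [256, 10].
W1 = np.random.randn(784, 256); b1 = np.zeros(256)
W2 = np.random.randn(256, 256); b2 = np.zeros(256)
W3 = np.random.randn(256, 10);  b3 = np.zeros(10)

x = np.random.randn(1, 784)        # one flattened input image
h1 = relu(x @ W1 + b1)             # first hidden layer,  shape (1, 256)
h2 = relu(h1 @ W2 + b2)            # second hidden layer, shape (1, 256)
y = h2 @ W3 + b3                   # output logits,       shape (1, 10)
print(y.shape)                     # (1, 10)
```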
Rectified Linear Units
◦ Why not sigmoid?
◦ The sigmoid's gradient gets very close to 0 during back propagation, so the signal dies out in deep networks (the vanishing gradient problem).

$$R(x) = \begin{cases} x, & x \ge 0 \\ 0, & x < 0 \end{cases} \qquad \frac{\partial}{\partial x}\,R(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases}$$
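A short NumPy version of ReLU and its derivative, mirroring the definition above:

```python
import numpy as np

def relu(x):
    # R(x) = x for x >= 0, else 0
    return np.maximum(x, 0)

def relu_grad(x):
    # dR/dx = 1 for x >= 0, else 0
    return (x >= 0).astype(x.dtype)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))       # [0.  0.  0.  0.5 2. ]
print(relu_grad(x))  # [0. 0. 1. 1. 1.]
```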
Also …
Weight Initialization
◦ DBN (Deep Belief Networks)
◦ Train an RBM between every two adjacent layers.
◦ After this initialization, we only need fine-tuning (ordinary training).
◦ Gaussian random numbers
◦ Xavier (2010)
◦ Divide Gaussian random numbers by the square root of the number of inputs.
◦ He (2015)
◦ Same as Xavier, but with the number of inputs divided by 2 inside the square root. (See the sketch below.)
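A minimal NumPy sketch of the three random initializers above, following the common simplified formulas (the RBM-based DBN procedure is omitted here):

```python
import numpy as np

fan_in, fan_out = 256, 256

# Plain Gaussian initialization.
w_gauss = np.random.randn(fan_in, fan_out)

# Xavier (2010): scale by the square root of the fan-in.
w_xavier = np.random.randn(fan_in, fan_out) / np.sqrt(fan_in)

# He (2015): same idea, with the fan-in divided by 2 inside the sqrt.
w_he = np.random.randn(fan_in, fan_out) / np.sqrt(fan_in / 2)
```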
L2 Regularization
◦ Large weights can bend the model sharply around the training data (overfitting).
◦ To avoid large weights, we add the term below to the loss:

$$\mathcal{L} = \frac{1}{N} \sum_i D\big(S(W x_i + b),\, L_i\big) + \lambda \sum W^2$$

$0 \le \lambda \le 1$ : regularization strength
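In TensorFlow 1.x style (matching the session-based code later in the deck), the penalty is one extra term on the loss. This is a sketch; the feature/class sizes and `l2_lambda` value are assumptions:

```python
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 30])   # 30 input features (assumed)
y = tf.placeholder(tf.float32, [None, 2])    # one-hot labels

W = tf.Variable(tf.random_normal([30, 2]))
b = tf.Variable(tf.zeros([2]))
logits = tf.matmul(x, W) + b

cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits))

l2_lambda = 0.001  # regularization strength lambda (hypothetical value)
# tf.nn.l2_loss(W) computes sum(W**2) / 2
loss = cross_entropy + l2_lambda * tf.nn.l2_loss(W)
```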
Dropout
◦ Forces the network to learn a redundant representation.
◦ While training: apply dropout. While testing: no dropout. (See the sketch below.)
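In TF 1.x style this is usually handled with a `keep_prob` placeholder that is fed differently at train and test time (a sketch; the layer sizes and 0.7 keep rate are assumptions):

```python
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 30])
keep_prob = tf.placeholder(tf.float32)        # probability of keeping a unit

W = tf.Variable(tf.random_normal([30, 16]))
b = tf.Variable(tf.zeros([16]))
h = tf.nn.relu(tf.matmul(x, W) + b)
h_drop = tf.nn.dropout(h, keep_prob)          # randomly zeroes units

# While training: feed keep_prob=0.7 (apply dropout).
# While testing:  feed keep_prob=1.0 (no dropout).
```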
Chain Rule
$$y = g(f(x)) \;\Rightarrow\; y' = g'(f(x)) \cdot f'(x)$$

[Figure: forward pass x → F → G → y; the local derivatives F′(x) and G′(f(x)) are multiplied to produce y′.]

◦ To make back propagation easier, we use an operation graph like the one on the next slide.
Back Propagation
◦ Get derivatives using back propagation through the operation graph. For the loss signal $L$, each gate receives the upstream gradient $\frac{\partial L}{\partial z}$ and passes it on using its local derivatives:

Add gate: $z = x + y$, so $\frac{\partial z}{\partial x} = \frac{\partial z}{\partial y} = 1$; the upstream gradient $\frac{\partial L}{\partial z}$ is passed unchanged to both $x$ and $y$.

Multiply gate: $z = xy$, so $\frac{\partial z}{\partial x} = y$ and $\frac{\partial z}{\partial y} = x$; the gradient sent to $x$ is $y \cdot \frac{\partial L}{\partial z}$ and the gradient sent to $y$ is $x \cdot \frac{\partial L}{\partial z}$.
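A tiny plain-Python sketch of those two gate rules, with a worked example:

```python
def add_forward(x, y):
    return x + y

def add_backward(dL_dz):
    # dz/dx = dz/dy = 1: pass the upstream gradient through unchanged.
    return dL_dz, dL_dz

def mul_forward(x, y):
    return x * y

def mul_backward(dL_dz, x, y):
    # dz/dx = y, dz/dy = x: scale the upstream gradient by the other input.
    return y * dL_dz, x * dL_dz

# Example: L = (a + b) * c, evaluated at a=1, b=2, c=4.
a, b, c = 1.0, 2.0, 4.0
s = add_forward(a, b)                    # s = 3
L = mul_forward(s, c)                    # L = 12
dL_ds, dL_dc = mul_backward(1.0, s, c)   # 4.0, 3.0
dL_da, dL_db = add_backward(dL_ds)       # 4.0, 4.0
print(dL_da, dL_db, dL_dc)               # 4.0 4.0 3.0
```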
Ensemble Learning
◦ Train several independent models and combine their predictions (e.g., by voting or averaging) to reduce the error of any single model.
Practical Use
◦ Breast cancer diagnosis using a deep neural network
◦ The example is from the book 'Machine Learning with R'.
◦ Uses the dataset from the University of Wisconsin.
◦ The dataset includes 32 attributes:
◦ Diagnosis, radius, perimeter, area, and so on.
Import/Define Methods
◦ Import packages for NumPy and TensorFlow.
◦ Define a method for min-max normalization:

$$z_n = \frac{x_n - \min(\boldsymbol{x})}{\max(\boldsymbol{x}) - \min(\boldsymbol{x})}$$
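A sketch of those imports and the normalization helper (the function name is my own; the slide's actual code is only shown as a screenshot):

```python
import numpy as np
import tensorflow as tf

def min_max_normalize(x):
    # z_n = (x_n - min(x)) / (max(x) - min(x)), applied per feature column
    return (x - x.min(axis=0)) / (x.max(axis=0) - x.min(axis=0))
```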
Import Dataset
◦ The dataset comes from the University of Wisconsin.
◦ Exclude the unused feature (ID).
◦ Split the dataset into x (features) and y (labels). (See the sketch below.)
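A sketch of the loading step. The filename is hypothetical; in the book's copy of the Wisconsin data, the ID is in column 0 and the diagnosis ('M'/'B') in column 1:

```python
import numpy as np

# Hypothetical filename for the Wisconsin breast cancer CSV.
raw = np.genfromtxt('wisc_bc_data.csv', delimiter=',',
                    skip_header=1, dtype=str)

raw = raw[:, 1:]                          # drop the unused ID column
y_raw = raw[:, 0]                         # diagnosis labels: 'M' or 'B'
x_data = raw[:, 1:].astype(np.float32)    # 30 numeric features
```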
One-Hot Encoding
'M' (Malignant) → [1, 0]
'B' (Benign) → [0, 1]
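Continuing the sketch above, one way to map the labels to those one-hot rows:

```python
import numpy as np

y_raw = np.array(['M', 'B', 'B', 'M'])   # example labels
# 'M' (Malignant) -> [1, 0], 'B' (Benign) -> [0, 1]
y_data = np.array([[1, 0] if d == 'M' else [0, 1] for d in y_raw],
                  dtype=np.float32)
print(y_data)
```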
Divide Dataset
◦ No cheating! Keep a held-out test set and never evaluate on the training data. (See the sketch below.)
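A small splitting helper as a sketch; the 70/30 ratio is an assumption, not the slide's:

```python
import numpy as np

def train_test_split(x, y, train_ratio=0.7, seed=0):
    # Shuffle indices once, then cut at the train/test boundary.
    rng = np.random.RandomState(seed)
    idx = rng.permutation(len(x))
    cut = int(len(x) * train_ratio)
    tr, te = idx[:cut], idx[cut:]
    return x[tr], y[tr], x[te], y[te]
```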
Design Neural Network
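The slide shows only a code screenshot, so here is a minimal TF 1.x-style reconstruction under stated assumptions: 30 input features, two ReLU hidden layers with dropout via `keep_prob`, and 2 output classes. The layer widths and learning rate are my guesses, not the slide's:

```python
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 30], name='x')
y = tf.placeholder(tf.float32, [None, 2], name='y')
keep_prob = tf.placeholder(tf.float32, name='keep_prob')

W1 = tf.Variable(tf.random_normal([30, 32])); b1 = tf.Variable(tf.zeros([32]))
h1 = tf.nn.dropout(tf.nn.relu(tf.matmul(x, W1) + b1), keep_prob)

W2 = tf.Variable(tf.random_normal([32, 32])); b2 = tf.Variable(tf.zeros([32]))
h2 = tf.nn.dropout(tf.nn.relu(tf.matmul(h1, W2) + b2), keep_prob)

W3 = tf.Variable(tf.random_normal([32, 2])); b3 = tf.Variable(tf.zeros([2]))
logits = tf.matmul(h2, W3) + b3

loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits))
train_op = tf.train.AdamOptimizer(0.001).minimize(loss)
```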
Build Session
◦ Can be forced to start fresh, or left to resume from a checkpoint (forced/unforced).
◦ Restore previously trained weights.
◦ Write a log for TensorBoard. (See the sketch below.)
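A session-setup sketch continuing the graph above; the paths and the `force_restart` flag are hypothetical:

```python
import tensorflow as tf

# Assumes the network graph above has already been built.
force_restart = False
saver = tf.train.Saver()

sess = tf.Session()
sess.run(tf.global_variables_initializer())

ckpt = tf.train.latest_checkpoint('./checkpoints')
if ckpt and not force_restart:
    saver.restore(sess, ckpt)        # resume from previously trained weights

writer = tf.summary.FileWriter('./logs', sess.graph)  # TensorBoard log
```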
Training Neurons
◦ 10,001 steps per run.
◦ Add summaries for TensorBoard. (See the sketch below.)
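A training-loop sketch continuing the earlier pieces (`x`, `y`, `keep_prob`, `loss`, `train_op`, `sess`, `writer`); `x_train`/`y_train` come from the split above, and the keep rate and logging interval are assumptions:

```python
import tensorflow as tf

tf.summary.scalar('loss', loss)      # scalar summary for TensorBoard
merged = tf.summary.merge_all()

for step in range(10001):            # 10,001 steps per run
    _, summary = sess.run([train_op, merged],
                          feed_dict={x: x_train, y: y_train, keep_prob: 0.7})
    if step % 1000 == 0:
        writer.add_summary(summary, step)
```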
Save Results and Get Accuracy
◦ Save the trained variables so the current weights and biases are kept for the next run.
◦ Each run trains for 10,001 steps. (See the sketch below.)
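Continuing the same sketch: save a checkpoint, then evaluate on the held-out test set with dropout disabled (`keep_prob=1.0`, per the dropout slide). The checkpoint path is hypothetical:

```python
import tensorflow as tf

saver.save(sess, './checkpoints/model.ckpt')   # keep current weights/biases

correct = tf.equal(tf.argmax(logits, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
print(sess.run(accuracy,
               feed_dict={x: x_test, y: y_test, keep_prob: 1.0}))
```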
Result #1
[Screenshots: accuracy results of the 1st and 2nd attempts.]
Attempt more …
◦ Switch the weights to the Xavier initializer. (See the sketch below.)
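In TF 1.x this was typically done with `tf.get_variable` plus the contrib initializer (a sketch; the variable name and shape are assumptions):

```python
import tensorflow as tf

# Replaces tf.Variable(tf.random_normal([30, 32])) from the earlier sketch.
W1 = tf.get_variable('W1', shape=[30, 32],
                     initializer=tf.contrib.layers.xavier_initializer())
```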
Result #2
Accuracy improved: 1st attempt 96.27% → 97.01%; 2nd attempt 97.01% → 97.76%.
Self Test
◦ Explain how the number of parameters in a model is determined.
◦ Describe the shape of the ReLU function and its derivative, comparing them with the sigmoid function.
◦ Explain the purpose of weight initialization and the methods for it.
◦ Explain the purpose and principle of L2 regularization.
◦ Why is dropout needed? How should it be configured during training and testing?
◦ Why is back propagation advantageous for neural networks?
◦ Explain ensemble learning.