UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES INTRODUCTION TO DEEP LEARNING - 1
Lecture 1: Introduction to Deep Learning
Efstratios Gavves
o Machine Learning 1
o Calculus, Linear Algebra
◦ Derivatives, integrals
◦ Matrix operations
◦ Computing lower bounds, limits
o Probability Theory, Statistics
o Advanced programming
o Time, patience & drive
Prerequisites
o Design and Program Deep Neural Networks
o Advanced Optimizations (SGD, Nesterov’s Momentum, RMSprop, Adam) and
Regularizations
o Convolutional and Recurrent Neural Networks (feature invariance and equivariance)
o Unsupervised Learning and Autoencoders
o Generative models (RBMs, Variational Autoencoders, Generative Adversarial Networks)
o Bayesian Neural Networks and their Applications
o Advanced Temporal Modelling, Credit Assignment, Neural Network Dynamics
o Biologically-inspired Neural Networks
o Deep Reinforcement Learning
Learning Goals
o 3 individual practicals (PyTorch)
◦ Practical 1: Convnets and Optimizations
◦ Practical 2: Recurrent Networks
◦ Practical 3: Generative Models
o 1 group presentation of an existing paper (1 group=3 persons)
◦ We’ll provide a list of papers, or you can choose another paper (your own?)
◦ Form your team by next Monday: we will prepare a Google Spreadsheet
Practicals
Grading
Total Grade            100%
  Final Exam            50%
  Total practicals      50%
    Practical 1         15%
    Practical 2         15%
    Practical 3         15%
    Poster               5%
Piazza Grade           +0.5 Bonus
o Course: Theory (4 hours per week) + Labs (4 hours per week)
◦ All material on http://uvadlc.github.io
◦ Book: Deep Learning by I. Goodfellow, Y. Bengio, A. Courville (available online)
o Live interactions via Piazza. Please, subscribe today!
◦ Link: https://piazza.com/university_of_amsterdam/fall2018/uvadlc/home
o Practicals are individual!
◦ More than encouraged to cooperate, but not to copy
◦ Plagiarism checks on reports and code → Do not cheat!
o The top 3 Piazza contributors get +0.5 grade
Overview
o Efstratios Gavves
◦ Assistant Professor, QUVA Deep Vision Lab (C3.229)
◦ Temporal Models, Spatiotemporal Deep Learning, Video Analysis
o Teaching Assistants
◦ Kirill Gavrilyuk, Berkay Kicanaoglu, Tom Runia, Jorn Peters, Maurice Weiler
Who we are and how to reach us
[Photos: Me :P, Kirill, Berkay, Jorn, Tom, Maurice]
Contact: @egavves (Efstratios Gavves)
o Applications of Deep Learning in Vision, Robotics, Game AI, NLP
o A brief history of Neural Networks and Deep Learning
o Neural Networks as modular functions
Lecture Overview
Applications of Deep Learning
Deep Learning in practice
[Links: four YouTube videos and one website]
o Vision is ultra challenging!
◦ For 256x256 resolution → $2^{524,288}$ possible images (compare: $10^{24}$ stars in the universe)
◦ Large visual object variations (viewpoints, scales, deformations, occlusions)
◦ Large semantic object variations
o Robotics is typically considered in controlled environments
o Game AI involves an extreme number of possible game states ($10^{10^{48}}$ possible Go games)
o NLP is extremely high dimensional and vague (just for English: 150K words)
Why should we be impressed?
[Figure: inter-class variation, intra-class overlap]
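The image count above can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes 8-bit grayscale pixels (an assumption, since the slide does not state the bit depth), which is what gives 524,288 bits for a 256x256 image.

```python
import math

# Assumption: 8-bit grayscale pixels; the slide does not state the bit depth.
WIDTH, HEIGHT, BITS_PER_PIXEL = 256, 256, 8

# Number of bits needed to describe one image.
total_bits = WIDTH * HEIGHT * BITS_PER_PIXEL

# Number of decimal digits of 2**total_bits, obtained through logarithms
# so that the huge integer never has to be printed in full.
decimal_digits = math.floor(total_bits * math.log10(2)) + 1

print(f"bits per image: {total_bits}")
print(f"2^{total_bits} has {decimal_digits} decimal digits; 10^24 has 25")
```

The number of possible images has far more digits than the star count, which is the point of the comparison.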
Deep Learning even for the arts
A brief history of Neural Networks & Deep Learning
[Photo: Frank Rosenblatt, Charles W. Wightman]
First appearance (roughly)
o Rosenblatt proposed perceptrons for binary classification
◦ One weight 𝑤𝑖 per input 𝑥𝑖
◦ Multiply the weights with their respective inputs and add a bias (the weight of a constant input 𝑥0 = +1)
◦ If the result is larger than a threshold return 1, otherwise 0
Perceptrons
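The rule above can be sketched in a few lines. The function name `perceptron` and the weights `w_and` are illustrative choices, not taken from the lecture; the weights are hand-picked so that the unit computes a logical AND.

```python
def perceptron(x, w, threshold=0.0):
    """Return 1 if the weighted sum of the inputs exceeds the threshold, else 0.

    x[0] is the constant bias input +1 and w[0] is the bias weight.
    """
    activation = sum(w_i * x_i for w_i, x_i in zip(w, x))
    return 1 if activation > threshold else 0


# Hypothetical, hand-picked weights: [bias weight, w1, w2] for a logical AND.
w_and = [-1.5, 1.0, 1.0]

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", perceptron([1, x1, x2], w_and))
```

Folding the bias into the weight vector through the constant input keeps the decision rule a single dot product followed by a comparison.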
o Rosenblatt’s innovation was mainly the learning algorithm for perceptrons
o Learning algorithm
◦ Initialize weights randomly
◦ Take one sample 𝑥𝑖 and predict ŷ𝑖
◦ For erroneous predictions update weights
◦ If prediction ŷ𝑖 = 0 and ground truth 𝑦𝑖 = 1, increase weights
◦ If prediction ŷ𝑖 = 1 and ground truth 𝑦𝑖 = 0, decrease weights
◦ Repeat until no errors are made
Training a perceptron
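The learning algorithm above can be sketched as follows. The learning rate `lr`, the epoch cap `max_epochs`, the seed and the OR dataset are assumptions made for illustration. "Increase" and "decrease" are implemented as adding or subtracting the input vector, which is the usual reading of the rule.

```python
import random


def predict(w, x):
    """Perceptron output for input x (x[0] is the constant bias input +1)."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0


def train_perceptron(samples, lr=1.0, max_epochs=100, seed=0):
    """Train until an epoch has no errors; returns (weights, epochs used or None)."""
    rng = random.Random(seed)
    w = [rng.uniform(-0.5, 0.5) for _ in samples[0][0]]  # random initialisation
    for epoch in range(1, max_epochs + 1):
        errors = 0
        for x, y in samples:
            y_hat = predict(w, x)
            if y_hat == y:
                continue
            errors += 1
            # Predicted 0 but truth is 1: increase weights.
            # Predicted 1 but truth is 0: decrease weights.
            direction = 1 if y == 1 else -1
            w = [wi + direction * lr * xi for wi, xi in zip(w, x)]
        if errors == 0:
            return w, epoch
    return w, None  # did not converge within max_epochs


# Logical OR, with the bias input x0 = +1 prepended to every sample.
or_samples = [([1, 0, 0], 0), ([1, 0, 1], 1), ([1, 1, 0], 1), ([1, 1, 1], 1)]
weights, epochs = train_perceptron(or_samples)
print("weights:", weights, "epochs:", epochs)
```

Because OR is linearly separable, the loop stops after a finite number of updates; on a problem that is not linearly separable it would run until `max_epochs` and return `None` for the epoch count.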
o 1 perceptron == 1 decision
o What about multiple decisions?
◦ E.g. digit classification
o Stack as many outputs as the
possible outcomes into a layer
◦ Neural network
o Use one layer as input to the next layer
◦ Add nonlinearities between layers
◦ Multi-layer perceptron (MLP)
From a single layer to multiple layers
[Figures: 1-layer neural network; multi-layer perceptron]
What could be a problem with perceptrons?
A. They can only return one output, so only work for binary problems
B. They are linear machines, so can only solve linear problems
C. They can only work for vector inputs
D. They are too complex to train, so they can work with big computers only
o However, the exclusive or (XOR) cannot be solved by perceptrons
◦ [Minsky and Papert, “Perceptrons”, 1969]
◦ 0 𝑤1 + 0𝑤2 < 𝜃 → 0 < 𝜃
◦ 0 𝑤1 + 1𝑤2 > 𝜃 → 𝑤2 > 𝜃
◦ 1 𝑤1 + 0𝑤2 > 𝜃 → 𝑤1 > 𝜃
◦ 1 𝑤1 + 1𝑤2 < 𝜃 → 𝑤1 + 𝑤2 < 𝜃
XOR & Single-layer Perceptrons
Input 1 Input 2 Output
1 1 0
1 0 1
0 1 1
0 0 0
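[Figure: perceptron with inputs Input 1 and Input 2, weights 𝑤1 and 𝑤2, and a single Output]
Inconsistent!! Adding the second and third inequalities gives 𝑤1 + 𝑤2 > 2𝜃, and 2𝜃 > 𝜃 because 𝜃 > 0, which contradicts 𝑤1 + 𝑤2 < 𝜃.
Graphically: the classification boundary to solve XOR is not a line!!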
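The inconsistency of the four inequalities can also be checked by brute force. This sketch searches a coarse grid of candidate values for 𝑤1, 𝑤2 and 𝜃 (the grid itself is an arbitrary choice) and counts how many settings reproduce a truth table. It treats "not larger than the threshold" as output 0, which is slightly more permissive than the strict inequalities on the slide.

```python
from itertools import product

XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
OR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1}


def solves(truth_table, w1, w2, theta):
    """True if a single threshold unit reproduces every row of the truth table."""
    return all(
        (x1 * w1 + x2 * w2 > theta) == bool(y)
        for (x1, x2), y in truth_table.items()
    )


def count_solutions(truth_table, grid):
    """Number of (w1, w2, theta) triples on the grid that solve the table."""
    return sum(
        solves(truth_table, w1, w2, theta)
        for w1, w2, theta in product(grid, repeat=3)
    )


grid = [i / 4 for i in range(-8, 9)]  # candidate values -2.0, -1.75, ..., 2.0
print("OR solutions :", count_solutions(OR, grid))
print("XOR solutions:", count_solutions(XOR, grid))
```

A grid search is only an illustration; the algebraic contradiction is what shows that no real-valued weights work.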
o Interestingly, Minsky never said XOR cannot be
solved by neural networks
◦ Only that XOR cannot be solved with 1 layer perceptrons
o Multi-layer perceptrons can solve XOR
◦ 9 years earlier Minsky built such a multi-layer perceptron
o However, how to train a multi-layer perceptron?
o Rosenblatt’s algorithm not applicable
◦ It expects to know the desired target
Minsky & Multi-layer perceptrons
[Figure: network diagram with output target 𝑦𝑖 ∈ {0, 1}]
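That a multi-layer perceptron can represent XOR is easy to verify with hand-set weights. The sketch below is one illustrative construction (two hidden threshold units computing OR and AND), not the network Minsky built, and the weights are chosen by hand rather than learned. How to learn them is exactly the open question raised above.

```python
def step(z):
    """Threshold non-linearity: 1 if z is positive, else 0."""
    return 1 if z > 0 else 0


def xor_mlp(x1, x2):
    """Two-layer perceptron with hand-picked (hypothetical) weights."""
    h_or = step(x1 + x2 - 0.5)    # hidden unit 1: at least one input is 1
    h_and = step(x1 + x2 - 1.5)   # hidden unit 2: both inputs are 1
    return step(h_or - h_and - 0.5)  # output: OR but not AND


for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", xor_mlp(x1, x2))
```

The hidden units re-express the inputs so that the output unit faces a linearly separable problem.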
o Minsky never said XOR is unsolvable by multi-layer perceptrons
o Multi-layer perceptrons can solve XOR
o Problem: how to train a multi-layer perceptron?
◦ Rosenblatt’s algorithm not applicable
◦ It expects to know the ground truth $a_i^*$ for a variable $a_i$
◦ For the output layers we have the ground truth labels
◦ For intermediate hidden layers we don’t
Minsky & Multi-layer perceptrons
[Figure: hidden-layer targets $a_i^* = ???$, output targets 𝑦𝑖 ∈ {0, 1}]
The “AI winter” despite notable successes
o What everybody thought: “If a perceptron cannot even solve XOR, why bother?”
o Results not as promised (too much hype!) → no further funding → AI Winter
o Still, significant discoveries were made in this period
◦ Backpropagation → Learning algorithm for MLPs (Lecture 2)
◦ Recurrent networks → Neural Networks for infinite sequences (Lecture 5)
The first “AI winter”
o Concurrently with Backprop and Recurrent Nets, new and promising Machine
Learning models were proposed
o Kernel Machines & Graphical Models
◦ Similar accuracies with better math and proofs and fewer heuristics
◦ Neural networks could not improve beyond a few layers
The second “AI winter”
o We have invited the PyTorch developers to give a tutorial on how to use
PyTorch
o 3 slots
◦ Tuesday (today), 11-13, Turingzaal
◦ Tuesday (today), 15-17, C0.110
◦ Wednesday (tomorrow), 15-17, C0.110
o Next Friday at the practical, 11-12, presentation by SURFSara
o If you are not an MSc student and you want to follow the course and get
updates, send me an email and I will subscribe you
Interim Announcements
In this edition we will try for a more interactive course. Would you like to try
this out?
A. Yes, why not?
B. Nope!
C. Yes, under conditions.
Results (92 votes): A. 79.3%, B. 0.0%, C. 20.7%
The thaw of the “AI winter”
o Lack of processing power
o Lack of data
o Overfitting
o Vanishing gradients
o Experimentally, training multi-layer perceptrons was not that useful
◦ Accuracy didn’t improve with more layers
◦ Are 1-2 hidden layers the best neural networks can do?
Neural Network problems a decade ago
o Layer-by-layer training
◦ The training of each layer individually is an easier undertaking
o Training multi-layered neural networks became easier
o Per-layer trained parameters initialize further training using contrastive divergence
Deep Learning arrives
[Figure sequence: training layer 1, training layer 2, training layer 3]
Deep Learning Renaissance
AlexNet architecture
o In 2009 the ImageNet dataset was published [Deng et al., 2009]
◦ Collected images for each of the 100K terms in WordNet (16M images in total)
◦ Terms organized hierarchically: “Vehicle” → “Ambulance”
o ImageNet Large Scale Visual Recognition Challenge (ILSVRC)
◦ 1 million images
◦ 1,000 classes
◦ Top-5 and top-1 error measured
Deep Learning is Big Data Hungry!
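The top-1 and top-5 errors mentioned above count a prediction as correct when the true class is among the k highest-scoring classes. A minimal sketch, with made-up scores over 4 classes and k = 1 and 2 instead of the challenge's 1,000 classes and k = 1 and 5:

```python
def top_k_error(scores, labels, k):
    """Fraction of samples whose true label is not among the k top-scoring classes."""
    misses = 0
    for sample_scores, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(
            range(len(sample_scores)),
            key=lambda c: sample_scores[c],
            reverse=True,
        )
        if label not in ranked[:k]:
            misses += 1
    return misses / len(labels)


# Made-up class scores for 3 samples over 4 classes (illustration only).
scores = [
    [0.1, 0.6, 0.2, 0.1],  # true class 1: ranked first
    [0.5, 0.1, 0.3, 0.1],  # true class 2: ranked second
    [0.4, 0.3, 0.2, 0.1],  # true class 3: ranked last
]
labels = [1, 2, 3]

print("top-1 error:", top_k_error(scores, labels, k=1))
print("top-2 error:", top_k_error(scores, labels, k=2))
```

Top-k error with k > 1 is more forgiving when several labels are plausible for the same image.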
Why now?
[Timeline figure. Milestones: Perceptron, Backpropagation, OCR with CNN, Object recognition with CNN, ???. Annotations: Mark I Perceptron (potentiometers implement perceptron weights); parity, negation problems; bank cheques; ImageNet: 1,000 classes from real images, 1,000,000 images; datasets of everything (captions, question-answering, …), reinforcement learning, ???]
1. Better hardware
2. Bigger data
Deep Learning Golden Era
Deep Learning: The What and Why
o A family of parametric, non-linear and hierarchical representation learning
functions, which are massively optimized with stochastic gradient descent
to encode domain knowledge, i.e. domain invariances, stationarity.
o $a_L(x; \theta_{1,\dots,L}) = h_L(h_{L-1}(\dots h_1(x, \theta_1) \dots, \theta_{L-1}), \theta_L)$
◦ $x$: input, $\theta_l$: parameters for layer $l$, $a_l = h_l(x, \theta_l)$: (non-)linear function
o Given training corpus $\{X, Y\}$ find optimal parameters
$\theta^* \leftarrow \arg\min_\theta \sum_{(x,y) \subseteq (X,Y)} \ell(y, a_L(x; \theta_{1,\dots,L}))$
Long story short
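The two formulas above, a network as a composition of per-layer functions and training as an arg min of the summed loss, can be sketched with scalar layers. Everything here is illustrative: the layer functions, the squared-error loss and the tiny candidate set are assumptions, and the brute-force `min` over candidates only stands in for stochastic gradient descent.

```python
import math


def linear(x, theta):
    """h(x, theta) = w * x + b for a scalar input."""
    w, b = theta
    return w * x + b


def tanh_layer(x, theta):
    """A non-linear layer: tanh applied to a linear map."""
    return math.tanh(linear(x, theta))


def forward(x, layers, thetas):
    """a_L(x; theta_1..L): feed the output of each layer into the next one."""
    a = x
    for h, theta in zip(layers, thetas):
        a = h(a, theta)
    return a


def total_loss(data, layers, thetas):
    """Sum of squared-error losses over the training corpus."""
    return sum((y - forward(x, layers, thetas)) ** 2 for x, y in data)


layers = [tanh_layer, linear]

# Hypothetical "true" parameters used only to generate a toy training corpus.
thetas_true = [(1.0, 0.0), (2.0, 0.5)]
data = [(x, forward(x, layers, thetas_true)) for x in (-1.0, 0.0, 1.0)]

# Brute-force arg min over a tiny candidate set (a stand-in for SGD).
candidates = [
    [(0.5, 0.0), (2.0, 0.5)],
    thetas_true,
    [(1.0, 0.0), (1.0, 0.0)],
]
theta_star = min(candidates, key=lambda thetas: total_loss(data, layers, thetas))
print("theta*:", theta_star, "loss:", total_loss(data, layers, theta_star))
```

Keeping every layer behind the same `h(x, theta)` signature is what makes the network modular: layers can be swapped or stacked without touching `forward`.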
o Traditional pattern recognition: Hand-crafted Feature Extractor → Separate Trainable Classifier → “Lemur”
o End-to-end learning → Features are also learned from data: Trainable Feature Extractor → Trainable Classifier → “Lemur”
Learning Representations & Features
o $X = \{x_1, x_2, \dots, x_n\}$, $x_i \in \mathcal{R}^d$
o Given the $n$ points there are in total $2^n$ dichotomies
o Only on the order of $n^d$ are linearly separable
o Once $n$ grows beyond $d$, the probability that $X$ is linearly separable converges to 0 very fast
o The chance that a dichotomy is linearly separable is very small
Non-separability of linear machines
[Figure: probability of linear separability vs. #samples, annotated “P=N”]
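The shape of the separability curve can be reproduced with Cover's function-counting theorem (Cover, 1965), which the slide does not name and which is brought in here as an assumption about what the plot shows. For $n$ points in general position in $d$ dimensions, the number of dichotomies that a hyperplane with a bias term can realise is $2 \sum_{k=0}^{d} \binom{n-1}{k}$.

```python
from math import comb


def separable_dichotomies(n, d):
    """Cover's count of linearly separable dichotomies of n points in
    general position in R^d, for a hyperplane with a bias term."""
    return 2 * sum(comb(n - 1, k) for k in range(d + 1))


def separability_probability(n, d):
    """Fraction of all 2^n dichotomies that are linearly separable."""
    return separable_dichotomies(n, d) / 2 ** n


d = 10
for n in (5, 11, 22, 44, 88):
    print(f"n={n:3d}  P(separable)={separability_probability(n, d):.3g}")

# Four points in the plane: 14 of the 16 dichotomies are separable;
# for the XOR corner points the two missing ones are XOR and its complement.
print(separable_dichotomies(4, 2))
```

Under this count the probability stays at 1 up to $n = d + 1$, passes through 1/2 at $n = 2(d + 1)$ and then falls quickly towards 0.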
How can we solve the non-separability of linear machines?
A. Apply SVM
B. Use non-linear features
C. Use non-linear kernels
D. Use advanced optimizers, like Adam or Nesterov's Momentum
Results (82 votes): A. 6.1%, B. 24.4%, C. 69.5%, D. 0.0%
o Most data distributions and tasks are non-linear
o A linear assumption is often convenient, but not necessarily truthful
o Problem: How to get non-linear machines without too much effort?
o Solution: Make features non-linear
o What is a good non-linear feature?
◦ Non-linear kernels, e.g., polynomial, RBF, etc.
◦ Explicit design of features (SIFT, HOG)?
Non-linearizing linear machines
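Making the features non-linear while keeping the machine linear can be illustrated on XOR. The product feature and the weights below are hand-picked assumptions for this sketch, not something prescribed in the lecture.

```python
def features(x1, x2):
    """Hypothetical hand-designed non-linear feature map: append the product x1 * x2."""
    return (x1, x2, x1 * x2)


def linear_classifier(phi, w=(1.0, 1.0, -2.0), theta=0.5):
    """A plain linear threshold unit applied to the feature vector phi."""
    return 1 if sum(wi * pi for wi, pi in zip(w, phi)) > theta else 0


for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", linear_classifier(features(x1, x2)))
```

In the original two inputs XOR is not linearly separable, but in the three-dimensional feature space a single hyperplane separates it. A linear machine on learned features exploits the same idea.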
o Invariant … but not too invariant
o Repeatable … but not bursty
o Discriminative … but not too class-specific
o Robust … but sensitive enough
Good features
o Raw data live in very high-dimensional spaces
o But they effectively lie on lower-dimensional manifolds
o Can we discover this manifold to embed our data on?
Manifolds
[Figure: data manifold plotted against Dimension 1 and Dimension 2]
o Goal: discover these lower dimensional manifolds
◦ These manifolds are most probably highly non-linear
o First hypothesis: Semantically similar things lie closer together than
semantically dissimilar things
o Second hypothesis: A face (or any other image) is a point on the manifold
→ Compute the coordinates of this point and use them as a feature
→ Face features will be separable
How to get good features?
o There are good features (manifolds) and bad features
o 28 pixels x 28 pixels = 784 dimensions
The digits manifolds
[Figures: PCA manifold (two eigenvectors); t-SNE manifold]
o A pipeline of successive, differentiable modules
◦ Each module’s output is the input for the next module
o Each subsequent module produces higher-abstraction features
o Preferably, input as raw as possible
End-to-end learning of feature hierarchies
[Figure: Initial modules → Middle modules → Last modules → “Lemur”]
Why learn the features and not just design them?
A. Designing features manually is too time consuming and requires expert knowledge
B. Learned features give us a better understanding of the data
C. Learned features are more compact and specific for the task at hand
D. Learned features are easy to adapt
E. Features can be learnt in a plug-n-play fashion, easy for the layman
Results (81 votes): A. 48.1%, B. 13.6%, C. 28.4%, D. 8.6%, E. 1.2%
o Manually designed features
◦ Expensive to research & validate
o Learned features
◦ If there is enough data, easy to learn, compact and specific
o Time once spent on designing features is now spent on designing architectures
Why learn the features?
o Supervised learning, e.g. Convolutional Networks
Types of learning
Convolutional networks
[Figure: network answering “Is this a dog or a cat?”, with input layer, hidden layers and output layers]
o Supervised learning, e.g. Convolutional Networks
o Unsupervised learning, e.g. Autoencoders
Types of learning
Autoencoders
[Figure: encoding, decoding]
o Supervised learning, e.g. Convolutional Networks
o Unsupervised learning, e.g. Autoencoders
o Self-supervised learning
o A mix of supervised and unsupervised learning
o Reinforcement learning
◦ An agent performs actions in an environment and gets rewards
Types of learning
Philosophy of the course
o We only have 2 months = 14 lectures
o Lots of material to cover
o Hence, no time to lose
◦ Basic neural networks, learning PyTorch, learning to program on a server, advanced
optimization techniques, convolutional neural networks, recurrent neural networks,
generative models
o This course is hard
◦ But it is optional
◦ From previous student evaluations, it has been very useful for everyone
The bad news :(
o We are here to help
◦ Last year we got a great evaluation score, so people like it and learn from it
o We have agreed with SURF SARA to give you access to the Dutch
Supercomputer Cartesius with a bunch of (very) expensive GPUs
o You’ll get to know some of the hottest stuff in AI today
o You’ll get to present your own work to an interesting/ed crowd
The good news :)
o You’ll get to know some of the hottest stuff in AI today
◦ in academia (NIPS, CVPR) & in industry
The good news :)
o At the end of the course we might offer a few MSc Thesis Projects in
collaboration with Qualcomm/QUVA Lab
◦ Students will become interns in the QUVA lab and get paid during the thesis
o Requirements
◦ Work hard enough and be motivated
◦ Have top performance in the class
◦ Be interested in working with us
o Come and find me later
The even better news :)
o We encourage you to help each other, actively participate, give feedback
◦ 3 students with highest participation in Q&A in Piazza get +0.5 grade
◦ Your grade depends on what you do, not what others do
◦ You have plenty of chances to collaborate for your poster and paper presentation
o However, we do not tolerate blind copying
◦ Not from each other
◦ Not from the internet
◦ We use Turnitin for plagiarism detection
Code of conduct
Summary
o A brief history of Deep Learning
o Why is Deep Learning happening now?
o What types of Deep Learning exist?
Reading material
o http://www.deeplearningbook.org/
o Chapter 1: Introduction, p.1-28
Also, enroll in Deep Vision Seminars
Next lecture
o Neural networks as layers and modules
o Build your own modules
o Backprop
o Stochastic Gradient Descent

lecture1-introllllllllllllllllllllllllll

  • 1.
    UVA DEEP LEARNINGCOURSE – EFSTRATIOS GAVVES INTRODUCTION TO DEEP LEARNING - 1 Lecture 1: Introduction to Deep Learning Efstratios Gavves
  • 2.
    UVA DEEP LEARNINGCOURSE – EFSTRATIOS GAVVES DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 2 UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES INTRODUCTION TO DEEP LEARNING - 2 o Machine Learning 1 o Calculus, Linear Algebra ◦ Derivatives, integrals ◦ Matrix operations ◦ Computing lower bounds, limits o Probability Theory, Statistics o Advanced programming o Time, patience & drive Prerequisites
  • 3.
    UVA DEEP LEARNINGCOURSE – EFSTRATIOS GAVVES DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 3 UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES INTRODUCTION TO DEEP LEARNING - 3 o Design and Program Deep Neural Networks o Advanced Optimizations (SGD, Nestorov’s Momentum, RMSprop, Adam) and Regularizations o Convolutional and Recurrent Neural Networks (feature invariance and equivariance) o Unsupervised Learning and Autoencoders o Generative models (RBMs, Variational Autoencoders, Generative Adversarial Networks) o Bayesian Neural Networks and their Applications o Advanced Temporal Modelling, Credit Assignment, Neural Network Dynamics o Biologically-inspired Neural Networks o Deep Reinforcement Learning Learning Goals
  • 4.
    UVA DEEP LEARNINGCOURSE – EFSTRATIOS GAVVES DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 4 UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES INTRODUCTION TO DEEP LEARNING - 4 o 3 individual practicals (PyTorch) ◦ Practical 1: Convnets and Optimizations ◦ Practical 2: Recurrent Networks ◦ Practical 3: Generative Models o 1 group presentation of an existing paper (1 group=3 persons) ◦ We’ll provide a list of papers or choose another paper (your own?) ◦ By next Monday make your team: we will prepare a Google Spreadsheet Practicals
  • 5.
    UVA DEEP LEARNINGCOURSE – EFSTRATIOS GAVVES DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 5 UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES INTRODUCTION TO DEEP LEARNING - 5 Grading Total Grade 100% Final Exam 50% Total practicals 50% Practical 1 15% Practical 2 15% Practical 3 15% Poster 5% +0.5 Bonus Piazza Grade
  • 6.
    UVA DEEP LEARNINGCOURSE – EFSTRATIOS GAVVES DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 6 UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES INTRODUCTION TO DEEP LEARNING - 6 o Course: Theory (4 hours per week) + Labs (4 hours per week) ◦ All material on http://uvadlc.github.io ◦ Book: Deep Learning by I. Goodfellow, Y. Bengio, A. Courville (available online) o Live interactions via Piazza. Please, subscribe today! ◦ Link: https://piazza.com/university_of_amsterdam/fall2018/uvadlc/home o Practicals are individual! ◦ More than encouraged to cooperate but not copy The top 3 Piazza contributors get +0.5 grade ◦ Plagiarism checks on reports and code  Do not cheat! Overview
  • 7.
    UVA DEEP LEARNINGCOURSE – EFSTRATIOS GAVVES DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 7 UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES INTRODUCTION TO DEEP LEARNING - 7 o Efstratios Gavves ◦ Assistant Professor, QUVA Deep Vision Lab (C3.229) ◦ Temporal Models, Spatiotemporal Deep Learning, Video Analysis o Teaching Assistants ◦ Kirill Gavrilyuk, Berkay Kicanaoglu, Tom Runia, Jorn Peters, Maurice Weiler Who we are and how to reach us Me :P Kirill Berkay Jorn Tom Maurice @egavves Efstratios Gavves
  • 8.
    Lecture Overview
    o Applications of Deep Learning in Vision, Robotics, Game AI, NLP
    o A brief history of Neural Networks and Deep Learning
    o Neural Networks as modular functions
  • 9.
    Applications of Deep Learning
  • 10.
    Deep Learning in practice (the slide links to four YouTube videos and one website)
  • 11.
    Why should we be impressed?
    o Vision is ultra challenging!
      ◦ For 256x256 resolution → $2^{524,288}$ possible images (compare: $10^{24}$ stars in the universe)
      ◦ Large visual object variations (viewpoints, scales, deformations, occlusions)
      ◦ Large semantic object variations
    o Robotics is typically considered in controlled environments
    o Game AI involves an extreme number of possible game states ($10^{10^{48}}$ possible Go games)
    o NLP is extremely high dimensional and vague (just for English: 150K words)
    (Figure panels: Inter-class variation, Intra-class overlap)
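The exponent in the image count can be checked with simple arithmetic. The sketch below assumes 8-bit grayscale pixels (an assumption, the slide does not state the pixel depth), so 256 × 256 pixels × 8 bits gives 524,288 bits and hence 2^524,288 distinct images. The variable names are illustrative.

```python
import math

# Assumption (not stated on the slide): each pixel is an 8-bit grayscale value.
height, width, bits_per_pixel = 256, 256, 8

# Every bit pattern is a distinct image, so the count is 2**total_bits.
total_bits = height * width * bits_per_pixel

# Number of decimal digits of 2**total_bits, without building the huge integer.
decimal_digits = math.floor(total_bits * math.log10(2)) + 1

# The slide's star count, 10**24, written out in decimal digits for comparison.
star_digits = len(str(10 ** 24))

print(total_bits, decimal_digits, star_digits)
```

The image count has well over a hundred thousand decimal digits, whereas the star count has only 25, which is the gap the slide is pointing at.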
  • 12.
    Deep Learning even for the arts
  • 13.
    A brief history of Neural Networks & Deep Learning (pictured: Frank Rosenblatt, Charles W. Wightman)
  • 14.
    First appearance (roughly)
  • 15.
    Perceptrons
    o Rosenblatt proposed Perceptrons for binary classification
      ◦ One weight $w_i$ per input $x_i$
      ◦ Multiply weights with respective inputs and add a bias term via the constant input $x_0 = +1$
      ◦ If the result is larger than a threshold return 1, otherwise 0
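The decision rule above can be sketched in a few lines. This is a minimal illustration, not Rosenblatt's original formulation: the function name, the hand-picked AND weights, and the default threshold of 0 are assumptions made for the example.

```python
def perceptron_predict(weights, inputs, threshold=0.0):
    """Return 1 if the weighted sum exceeds the threshold, otherwise 0.

    weights[0] is the bias weight, paired with the constant input x0 = +1.
    """
    activation = weights[0] * 1.0 + sum(w * x for w, x in zip(weights[1:], inputs))
    return 1 if activation > threshold else 0


# Hypothetical hand-picked weights that implement a logical AND of two inputs.
and_weights = [-1.5, 1.0, 1.0]
and_outputs = [perceptron_predict(and_weights, x) for x in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(and_outputs)
```

Folding the bias into the weight vector through $x_0 = +1$ lets the threshold stay fixed while the learned bias weight shifts the decision boundary.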
  • 16.
    Training a perceptron
    o Rosenblatt’s innovation was mainly the learning algorithm for perceptrons
    o Learning algorithm
      ◦ Initialize weights randomly
      ◦ Take one sample $x_i$ and predict $\hat{y}_i$
      ◦ For erroneous predictions update weights
        If prediction $\hat{y}_i = 0$ and ground truth $y_i = 1$, increase weights
        If prediction $\hat{y}_i = 1$ and ground truth $y_i = 0$, decrease weights
      ◦ Repeat until no errors are made
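The learning algorithm above can be sketched as follows. The learning rate, the epoch cap, the random seed, and the choice of the OR function as training data are all assumptions for illustration; the slide only specifies the increase/decrease rule and the stopping condition.

```python
import random


def predict(weights, inputs):
    # weights[0] multiplies the constant bias input x0 = +1; the threshold is 0.
    activation = weights[0] + sum(w * x for w, x in zip(weights[1:], inputs))
    return 1 if activation > 0 else 0


def train_perceptron(samples, learning_rate=0.1, max_epochs=1000, seed=0):
    rng = random.Random(seed)
    n_inputs = len(samples[0][0])
    # Initialize weights randomly (bias weight included).
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
    for _ in range(max_epochs):
        errors = 0
        for inputs, target in samples:
            # error = +1: predicted 0, truth 1 -> increase weights
            # error = -1: predicted 1, truth 0 -> decrease weights
            error = target - predict(weights, inputs)
            if error != 0:
                errors += 1
                weights[0] += learning_rate * error
                for i, x in enumerate(inputs):
                    weights[i + 1] += learning_rate * error * x
        if errors == 0:  # repeat until no errors are made
            break
    return weights


# Logical OR is linearly separable, so the loop is expected to terminate.
or_samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights = train_perceptron(or_samples)
predictions = [predict(weights, x) for x, _ in or_samples]
print(predictions)
```

The epoch cap is a safety net: on data that is not linearly separable the "repeat until no errors" condition is never met, so the loop would otherwise run forever.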
  • 17.
    From a single layer to multiple layers
    o 1 perceptron == 1 decision
    o What about multiple decisions?
      ◦ E.g. digit classification
    o Stack as many outputs as the possible outcomes into a layer
      ◦ Neural network
    o Use one layer as input to the next layer
      ◦ Add nonlinearities between layers
      ◦ Multi-layer perceptron (MLP)
    (Figure panels: 1-layer neural network, Multi-layer perceptron)
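As a rough sketch of the stacking idea, a layer is just several weighted sums computed from the same inputs, and an MLP feeds one layer's outputs into the next with a nonlinearity in between. The weights, the layer sizes, and the choice of ReLU below are arbitrary assumptions for illustration.

```python
def layer(weights, biases, inputs, nonlinearity):
    """One layer: each row of `weights` (plus its bias) produces one output."""
    return [
        nonlinearity(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def relu(z):
    return max(0.0, z)


def identity(z):
    return z


x = [1.0, 2.0]

# Hidden layer: 2 inputs -> 3 outputs, followed by a ReLU nonlinearity.
hidden = layer([[1.0, -1.0], [0.5, 0.5], [-1.0, 2.0]], [0.0, -0.5, 0.0], x, relu)

# Output layer: 3 hidden units -> 2 outputs (one per possible outcome).
outputs = layer([[1.0, 0.0, 1.0], [0.0, 1.0, -1.0]], [0.0, 0.0], hidden, identity)

print(hidden, outputs)
```

Without the nonlinearity between the two calls, the composition would collapse into a single linear map, which is why the slide insists on adding nonlinearities between layers.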
  • 18.
    What could be a problem with perceptrons?
    A. They can only return one output, so only work for binary problems
    B. They are linear machines, so can only solve linear problems
    C. They can only work for vector inputs
    D. They are too complex to train, so they can work with big computers only
  • 20.
    XOR & Single-layer Perceptrons
    o However, the exclusive or (XOR) cannot be solved by perceptrons
      ◦ [Minsky and Papert, “Perceptrons”, 1969]
      ◦ $0 \cdot w_1 + 0 \cdot w_2 < \theta \Rightarrow 0 < \theta$
      ◦ $0 \cdot w_1 + 1 \cdot w_2 > \theta \Rightarrow w_2 > \theta$
      ◦ $1 \cdot w_1 + 0 \cdot w_2 > \theta \Rightarrow w_1 > \theta$
      ◦ $1 \cdot w_1 + 1 \cdot w_2 < \theta \Rightarrow w_1 + w_2 < \theta$
    o Inconsistent!! With $\theta > 0$, the second and third conditions give $w_1 + w_2 > 2\theta > \theta$, which contradicts the fourth
    o Graphically: the classification boundary to solve XOR is not a line!!

    Input 1   Input 2   Output
       1         1        0
       1         0        1
       0         1        1
       0         0        0
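The inconsistency can also be checked numerically. The sketch below searches a coarse grid of weights and thresholds for a single perceptron that reproduces a given truth table; the grid range and step are arbitrary assumptions, and an empty search result is only an illustration of the algebraic argument, not a proof.

```python
from itertools import product


def perceptron(w1, w2, theta, x1, x2):
    return 1 if w1 * x1 + w2 * x2 > theta else 0


def count_solutions(truth_table, grid):
    """Count (w1, w2, theta) triples on the grid that reproduce the truth table."""
    count = 0
    for w1, w2, theta in product(grid, repeat=3):
        if all(perceptron(w1, w2, theta, x1, x2) == y
               for (x1, x2), y in truth_table.items()):
            count += 1
    return count


grid = [i / 2 for i in range(-4, 5)]  # -2.0, -1.5, ..., 2.0

xor_table = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
or_table = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1}

xor_solutions = count_solutions(xor_table, grid)
or_solutions = count_solutions(or_table, grid)
print(xor_solutions, or_solutions)
```

The same search finds settings for OR (for example $w_1 = w_2 = 1$, $\theta = 0.5$) but none for XOR, consistent with the four inequalities having no common solution.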
  • 21.
    Minsky & Multi-layer perceptrons
    o Interestingly, Minsky never said XOR cannot be solved by neural networks
      ◦ Only that XOR cannot be solved with 1-layer perceptrons
    o Multi-layer perceptrons can solve XOR
      ◦ 9 years earlier Minsky built such a multi-layer perceptron
    o However, how to train a multi-layer perceptron?
    o Rosenblatt’s algorithm is not applicable
      ◦ It expects to know the desired target $y_i \in \{0, 1\}$
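To make the claim that multi-layer perceptrons can solve XOR concrete, here is a sketch of a two-layer network of threshold units with hand-set weights. The construction (one hidden unit computing OR, one computing AND, and an output unit firing for "OR but not AND") is a standard textbook one chosen for illustration; it is not claimed to be the network Minsky built, and the weights are set by hand rather than learned.

```python
def step(z):
    return 1 if z > 0 else 0


def xor_mlp(x1, x2):
    # Hidden layer: two threshold units sharing the same inputs.
    h_or = step(x1 + x2 - 0.5)    # fires if at least one input is 1
    h_and = step(x1 + x2 - 1.5)   # fires only if both inputs are 1
    # Output unit: fires when OR is on and AND is off.
    return step(h_or - h_and - 0.5)


xor_outputs = [xor_mlp(x1, x2) for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(xor_outputs)
```

Note that nothing here tells us what the hidden units should output during training: the targets for `h_or` and `h_and` were chosen by design, which is exactly the information Rosenblatt's rule would need and does not have.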
  • 22.
    Minsky & Multi-layer perceptrons
    o Minsky never said XOR is unsolvable by multi-layer perceptrons
    o Multi-layer perceptrons can solve XOR
    o Problem: how to train a multi-layer perceptron?
      ◦ Rosenblatt’s algorithm is not applicable
      ◦ It expects to know the ground truth $a_i^*$ for a variable $a_i$
      ◦ For the output layers we have the ground truth labels, $y_i \in \{0, 1\}$
      ◦ For intermediate hidden layers we don’t: $a_i^* = \,???$
  • 23.
    The “AI winter” despite notable successes
  • 24.
    The first “AI winter”
    o What everybody thought: “If a perceptron cannot even solve XOR, why bother?”
    o Results not as promised (too much hype!) → no further funding → AI Winter
    o Still, significant discoveries were made in this period
      ◦ Backpropagation → Learning algorithm for MLPs (Lecture 2)
      ◦ Recurrent networks → Neural Networks for infinite sequences (Lecture 5)
  • 25.
    The second “AI winter”
    o Concurrently with Backprop and Recurrent Nets, new and promising Machine Learning models were proposed
    o Kernel Machines & Graphical Models
      ◦ Similar accuracies with better math and proofs and fewer heuristics
      ◦ Neural networks could not improve beyond a few layers
  • 26.
    Interim Announcements
    o We have invited the PyTorch developers to give a tutorial on how to use PyTorch
    o 3 slots
      ◦ Tuesday (today), 11-13, Turingzaal
      ◦ Tuesday (today), 15-17, C0.110
      ◦ Wednesday, 15-17, C0.110
    o Next Friday at the practical, 11-12, presentation by SURFSara
    o If you are not an MSc student and you want to follow the course and get updates, send me an email and I will subscribe you
  • 28.
    In this edition we will try for a more interactive course. Would you like to try this out? (Votes: 92)
    A. Yes, why not?
    B. Nope!
    C. Yes, under conditions.
  • 29.
    In this edition we will try for a more interactive course. Would you like to try this out? (Closed)
    A. Yes, why not? 79.3%
    B. Nope! 0.0%
    C. Yes, under conditions. 20.7%
  • 30.
    The thaw of the “AI winter”
  • 31.
    Neural Network problems a decade ago
    o Lack of processing power
    o Lack of data
    o Overfitting
    o Vanishing gradients
    o Experimentally, training multi-layer perceptrons was not that useful
      ◦ Accuracy didn’t improve with more layers
      ◦ Are 1-2 hidden layers the best neural networks can do?
  • 32.
    Deep Learning arrives
    o Layer-by-layer training
      ◦ The training of each layer individually is an easier undertaking
    o Training multi-layered neural networks became easier
    o Per-layer trained parameters initialize further training using contrastive divergence
    (Figure: Training layer 1)
  • 33.
    Deep Learning arrives (same bullets as slide 32; figure: Training layer 2)
  • 34.
    Deep Learning arrives (same bullets as slide 32; figure: Training layer 3)
  • 35.
    Deep Learning Renaissance
  • 36.
    AlexNet architecture
  • 37.
    Deep Learning is Big Data Hungry!
    o In 2009 the ImageNet dataset was published [Deng et al., 2009]
      ◦ Collected images for each of the 100K terms in WordNet (16M images in total)
      ◦ Terms organized hierarchically: “Vehicle” → “Ambulance”
    o ImageNet Large Scale Visual Recognition Challenge (ILSVRC)
      ◦ 1 million images
      ◦ 1,000 classes
      ◦ Top-5 and top-1 error measured
  • 38.
    Why now?
    o Timeline figure, milestones in order: Perceptron, Backpropagation, OCR with CNN, Object recognition with CNN, ???
    o Figure annotations: Mark I Perceptron (potentiometers implement perceptron weights); parity, negation problems; bank cheques; ImageNet: 1,000 classes from real images, 1,000,000 images; datasets of everything (captions, question-answering, …), reinforcement learning, ???
    o 1. Better hardware
    o 2. Bigger data
  • 39.
    Deep Learning Golden Era
  • 40.
    Deep Learning: The What and Why
  • 41.
    Long story short
    o A family of parametric, non-linear and hierarchical representation learning functions, which are massively optimized with stochastic gradient descent to encode domain knowledge, i.e. domain invariances, stationarity.
    o $a_L(x; \theta_{1,\dots,L}) = h_L(h_{L-1}(\dots h_1(x, \theta_1) \dots, \theta_{L-1}), \theta_L)$
      ◦ $x$: input, $\theta_l$: parameters for layer $l$, $a_l = h_l(x, \theta_l)$: (non-)linear function
    o Given training corpus $\{X, Y\}$ find optimal parameters
      $\theta^* \leftarrow \arg\min_\theta \sum_{(x,y) \subseteq (X,Y)} \ell(y, a_L(x; \theta_{1,\dots,L}))$
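The nested definition of $a_L$ and the training objective can be sketched directly. Everything concrete below is an assumption for illustration: the scalar ReLU layer, the squared-error loss standing in for $\ell$, and the toy parameters and data. The sketch only evaluates the objective; the minimization itself (stochastic gradient descent) is not shown.

```python
from functools import reduce


def layer(x, theta):
    # Hypothetical scalar layer h_l(x, theta_l) = ReLU(weight * x + bias).
    weight, bias = theta
    return max(0.0, weight * x + bias)


def forward(x, thetas):
    # a_L(x; theta_1..L) = h_L(h_{L-1}(... h_1(x, theta_1) ..., theta_{L-1}), theta_L)
    return reduce(layer, thetas, x)


def total_loss(dataset, thetas):
    # Sum over the training pairs of l(y, a_L(x; theta)), here a squared error.
    return sum((y - forward(x, thetas)) ** 2 for x, y in dataset)


thetas = [(2.0, 0.0), (1.0, -1.0), (3.0, 1.0)]  # theta_1, theta_2, theta_3
dataset = [(1.0, 5.0), (0.0, 1.0)]              # toy (x, y) pairs

output = forward(1.0, thetas)
loss = total_loss(dataset, thetas)
print(output, loss)
```

Using `reduce` mirrors the formula: each layer consumes the previous layer's output together with its own parameters, so adding depth means appending one more entry to `thetas`.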
  • 42.
    Learning Representations & Features
    o Traditional pattern recognition
      ◦ Hand-crafted Feature Extractor → Separate Trainable Classifier → “Lemur”
    o End-to-end learning → Features are also learned from data
      ◦ Trainable Feature Extractor → Trainable Classifier → “Lemur”
  • 43.
    Non-separability of linear machines
    o $X = \{x_1, x_2, \dots, x_n\} \in \mathcal{R}^d$
    o Given the $n$ points there are in total $2^n$ dichotomies
    o Only about $d$ are linearly separable
    o With $n > d$ the probability that $X$ is linearly separable converges to 0 very fast
    o The chances that a dichotomy is linearly separable are very small
    (Figure: probability of linear separability versus number of samples, with P=N marked)
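One way to see how quickly separability vanishes is Cover's function-counting result, which the slide's plot appears to be based on (that attribution is an assumption). For $n$ points in general position in $d$ dimensions and hyperplanes through the origin, the number of linearly separable dichotomies is $C(n, d) = 2 \sum_{k=0}^{d-1} \binom{n-1}{k}$. The sketch below evaluates the resulting probability; the function names are illustrative.

```python
from math import comb


def separable_dichotomies(n, d):
    """Cover's count of linearly separable dichotomies of n points in general
    position in d dimensions (separating hyperplanes through the origin)."""
    return 2 * sum(comb(n - 1, k) for k in range(d))


def separability_probability(n, d):
    """Fraction of all 2**n dichotomies that are linearly separable."""
    return separable_dichotomies(n, d) / 2 ** n


d = 5
for n in (5, 10, 20, 50):
    print(n, separability_probability(n, d))
```

Under these assumptions every dichotomy is separable while $n \le d$, exactly half are at $n = 2d$, and the fraction collapses towards zero soon after. With a bias term the same formula applies with $d + 1$ in place of $d$.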
  • 44.
    How can we solve the non-separability of linear machines? (Votes: 82)
    A. Apply SVM
    B. Use non-linear features
    C. Use non-linear kernels
    D. Use advanced optimizers, like Adam or Nesterov's Momentum
  • 45.
    How can we solve the non-separability of linear machines? (Closed)
    A. Apply SVM: 6.1%
    B. Use non-linear features: 24.4%
    C. Use non-linear kernels: 69.5%
    D. Use advanced optimizers, like Adam or Nesterov's Momentum: 0.0%
  • 46.
    Non-linearizing linear machines
    o Most data distributions and tasks are non-linear
    o A linear assumption is often convenient, but not necessarily truthful
    o Problem: How to get non-linear machines without too much effort?
  • 47.
    Non-linearizing linear machines
    o Most data distributions and tasks are non-linear
    o A linear assumption is often convenient, but not necessarily truthful
    o Problem: How to get non-linear machines without too much effort?
    o Solution: Make features non-linear
    o What is a good non-linear feature?
      ◦ Non-linear kernels, e.g., polynomial, RBF, etc.
      ◦ Explicit design of features (SIFT, HOG)?
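A small sketch of the "make features non-linear" idea: XOR is not linearly separable in the raw inputs, but it becomes separable for a plain linear threshold unit once a product feature $x_1 x_2$ is appended. The feature map and the hand-picked weights are assumptions chosen for illustration.

```python
def nonlinear_features(x1, x2):
    # Hypothetical feature map: keep the raw inputs and append their product.
    return (x1, x2, x1 * x2)


def linear_machine(weights, theta, features):
    # An ordinary linear threshold unit, now applied in feature space.
    return 1 if sum(w * f for w, f in zip(weights, features)) > theta else 0


weights, theta = (1.0, 1.0, -2.0), 0.5  # hand-picked, not learned

xor_outputs = [
    linear_machine(weights, theta, nonlinear_features(x1, x2))
    for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]
]
print(xor_outputs)
```

The classifier itself is unchanged; all of the non-linearity lives in the feature map, which is the same division of labor that kernels and hand-designed features such as SIFT or HOG rely on.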
  • 48.
    Good features
    o Invariant … but not too invariant
    o Repeatable … but not bursty
    o Discriminative … but not too class-specific
    o Robust … but sensitive enough
  • 49.
    Manifolds
    o Raw data live in very high-dimensional spaces
    o But they effectively lie on lower dimensional manifolds
    o Can we discover this manifold to embed our data on?
    (Figure axes: Dimension 1, Dimension 2)
  • 50.
    How to get good features?
    o Goal: discover these lower dimensional manifolds
      ◦ These manifolds are most probably highly non-linear
    o First hypothesis: Semantically similar things lie closer together than semantically dissimilar things
    o Second hypothesis: A face (or any other image) is a point on the manifold
      → Compute the coordinates of this point and use them as a feature
      → Face features will be separable
  • 51.
    The digits manifolds
    o There are good features (manifolds) and bad features
    o 28 pixels x 28 pixels = 784 dimensions
    (Figure panels: PCA manifold (two eigenvectors), t-SNE manifold)
  • 52.
    End-to-end learning of feature hierarchies
    o A pipeline of successive, differentiable modules
      ◦ Each module’s output is the input for the next module
    o Each subsequent module produces higher abstraction features
    o Preferably, input as raw as possible
    (Figure: Initial modules → Middle modules → Last modules → “Lemur”)
  • 53.
    Why learn the features and not just design them? (Votes: 81)
    A. Designing features manually is too time consuming and requires expert knowledge
    B. Learned features give us a better understanding of the data
    C. Learned features are more compact and specific for the task at hand
    D. Learned features are easy to adapt
    E. Features can be learnt in a plug-n-play fashion, ease for the layman
  • 54.
    Why learn the features and not just design them? (Closed)
    A. Designing features manually is too time consuming and requires expert knowledge: 48.1%
    B. Learned features give us a better understanding of the data: 13.6%
    C. Learned features are more compact and specific for the task at hand: 28.4%
    D. Learned features are easy to adapt: 8.6%
    E. Features can be learnt in a plug-n-play fashion, ease for the layman: 1.2%
  • 55.
    Why learn the features?
    o Manually designed features
      ◦ Expensive to research & validate
    o Learned features
      ◦ If there is enough data, they are easy to learn, compact and specific
    o The time once spent on designing features is now spent on designing architectures
  • 56.
    Types of learning
    o Supervised learning, e.g. Convolutional Networks
  • 57.
    Convolutional networks
    o Is this a dog or a cat?
    (Figure: Input layer → Hidden layers → Output layers → Dog or Cat?)
  • 58.
    Types of learning
    o Supervised learning, e.g. Convolutional Networks
    o Unsupervised learning, e.g. Autoencoders
  • 59.
    Autoencoders
    (Figure: Encoding → Decoding)
  • 60.
    Types of learning
    o Supervised learning, e.g. Convolutional Networks
    o Unsupervised learning, e.g. Autoencoders
    o Self-supervised learning
    o A mix of supervised and unsupervised learning
  • 61.
    Types of learning
    o Supervised learning, e.g. Convolutional Networks
    o Unsupervised learning, e.g. Autoencoders
    o Self-supervised learning
    o A mix of supervised and unsupervised learning
    o Reinforcement learning
      ◦ An agent performs actions in an environment and gets rewards
  • 62.
    Philosophy of the course
  • 63.
    The bad news ☹
    o We only have 2 months = 14 lectures
    o Lots of material to cover
    o Hence, no time to lose
      ◦ Basic neural networks, learning PyTorch, learning to program on a server, advanced optimization techniques, convolutional neural networks, recurrent neural networks, generative models
    o This course is hard
      ◦ But it is optional
      ◦ From previous student evaluations, it has been very useful for everyone
  • 64.
    The good news ☺
    o We are here to help
      ◦ Last year we got a great evaluation score, so people like it and learn from it
    o We have agreed with SURF SARA to give you access to the Dutch Supercomputer Cartesius with a bunch of (very) expensive GPUs
    o You’ll get to know some of the hottest stuff in AI today
    o You’ll get to present your own work to an interesting/ed crowd
  • 65.
    The good news ☺
    o You’ll get to know some of the hottest stuff in AI today
      ◦ in academia (NIPS, CVPR)
  • 66.
    The good news ☺
    o You will get to know some of the hottest stuff in AI today
      ◦ in academia & in industry
  • 67.
    The even better news ☺
    o At the end of the course we might offer a few MSc Thesis Projects in collaboration with Qualcomm/QUVA Lab
      ◦ Students will become interns in the QUVA lab and get paid during the thesis
    o Requirements
      ◦ Work hard enough and be motivated
      ◦ Have top performance in the class
      ◦ Be interested in working with us
    o Come and find me later
  • 68.
    Code of conduct
    o We encourage you to help each other, actively participate, give feedback
      ◦ The 3 students with the highest participation in Q&A on Piazza get +0.5 grade
      ◦ Your grade depends on what you do, not what others do
      ◦ You have plenty of chances to collaborate for your poster and paper presentation
    o However, we do not tolerate blind copying
      ◦ Not from each other
      ◦ Not from the internet
      ◦ We use Turnitin for plagiarism detection
  • 69.
    Summary
    o A brief history of Deep Learning
    o Why is Deep Learning happening now?
    o What types of Deep Learning exist?
    Reading material
    o http://www.deeplearningbook.org/
    o Chapter 1: Introduction, p.1-28
    Also, enroll in Deep Vision Seminars
  • 70.
    Next lecture
    o Neural networks as layers and modules
    o Build your own modules
    o Backprop
    o Stochastic Gradient Descent