DATA SCIENCE,
MACHINE LEARNING,
NEURAL NETWORKS
Maxim Orlovsky, PhD, MD
CloudBusinessCity, Mentor (cloudbusinesscity.com)
GRPIIIQ, CEO (qoderoom.com, banqsystems.com)
BICA Labs, Head (bicalabs.org)
INTRODUCTION
#CloudBusinessCity #MSRoadShowDataScience
[Diagram: overlapping fields of Computer Science, Data Science, Machine Learning, Cognitive Science, and Artificial Intelligence]
BIG DATA
Characteristics:
• Volume
• Velocity
• Variety
• Variability
• Veracity
Challenges:
• analysis
• capture
• data curation
• search
• sharing
• storage
• transfer
• visualization
• querying
• updating
• information privacy
DATA MINING
computational process of
discovering patterns in large data
sets involving methods at the
intersection of artificial intelligence,
machine learning, statistics, and
database systems
Pre-Data Science Buzzword :)
DATA SCIENCE
• Part of Computer Science
• Interdisciplinary field
• Data -> Knowledge
• Predictive analytics
CLOUD
COMPUTING
Provides shared computer processing resources
and data to computers and other
devices on demand
Cloud computing reduces the cost of
Data Science research and lowers
the entry threshold for startups
[Diagram: overlapping fields of Computer Science, Data Science, Machine Learning, and Cognitive Science]
MACHINE LEARNING
“GIVES COMPUTERS THE ABILITY TO LEARN WITHOUT
BEING EXPLICITLY PROGRAMMED”
Arthur Samuel, 1959
THE DIFFERENCE BETWEEN
ML AND PROGRAMMING
                   Programming     Machine Learning
Result of program  Deterministic   Non-deterministic
Program            Code            Architecture
Data storage       External        Embedded
Changeability      By human        By machine
MACHINE LEARNING IS MORE
THAN AI
• Clustering
• Regression
• Dimensionality reduction
• Decision trees
• Genetic and evolutionary algorithms
Machine learning is when a computer updates its own algorithm depending on
the data or its results
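As a rough illustration of a program whose parameters are changed by data rather than by a human, here is a minimal Python sketch of regression fitted by gradient descent. The function name `fit_line`, the learning rate, the number of epochs, and the toy data are all assumptions made for this example and are not part of the original talk.

```python
def fit_line(xs, ys, lr=0.01, epochs=5000):
    """Fit y ~ w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # The "program" (w, b) is updated from the data, not edited by hand.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data produced by the rule y = 2x + 1 (assumed for the demo).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```

The learned `w` and `b` should end up close to the slope and intercept that generated the data; nobody typed those values into the code.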
TYPES OF MACHINE LEARNING
• Supervised:
when you know what’s right and wrong (i.e. have labelled training sets)
• Unsupervised:
when you don’t know the right answers/there are no labelled training sets
• Reinforcement:
learning from reward and penalty signals received for actions, without labelled right answers;
similar to how humans learn
K-METHODS
• k-Means Clustering: partition n
observations into k clusters
• k-Nearest Neighbors: assign a class
according to the classes of the k closest labelled points
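Both k-methods can be sketched in a few lines of plain Python. This is a simplified illustration, not a production implementation: the helper names (`dist`, `k_means`, `k_nearest_neighbors`), the toy points, and the choice to start k-means from the first k points instead of random ones are assumptions made for the example.

```python
import math
from collections import Counter


def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def k_means(points, k, iterations=20):
    # Simplification: use the first k points as the initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:  # an empty cluster keeps its old centroid
                centroids[i] = tuple(sum(c) / len(cluster) for c in zip(*cluster))
    return centroids, clusters


def k_nearest_neighbors(train, query, k=3):
    """train is a list of (point, label) pairs; returns the majority label."""
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]


points = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (5.2, 4.9), (0.8, 1.1), (4.8, 5.1)]
centroids, clusters = k_means(points, k=2)

labelled = [((1.0, 1.0), "low"), ((1.2, 0.8), "low"), ((0.8, 1.1), "low"),
            ((5.0, 5.0), "high"), ((5.2, 4.9), "high"), ((4.8, 5.1), "high")]
predicted = k_nearest_neighbors(labelled, (4.5, 4.7), k=3)
print(centroids, predicted)
```

k-means needs no labels (unsupervised), while k-nearest neighbors relies on labelled examples (supervised).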
GENETIC AND EVOLUTIONARY
ALGORITHMS
Classical Algorithm:
• Generates a single point at each iteration. The
sequence of points approaches an optimal
solution.
• Selects the next point in the sequence by a
deterministic computation.
Genetic Algorithm:
• Generates a population of points at each
iteration. The best point in the population
approaches an optimal solution.
• Selects the next population by computation
which uses random number generators.
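The population-based search described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `genetic_maximize`, the population size, the mutation strength, the search range, and the toy fitness function are all hypothetical choices, and real genetic algorithms use many other selection and crossover schemes.

```python
import random


def genetic_maximize(fitness, generations=60, pop_size=30, seed=0):
    rng = random.Random(seed)  # fixed seed only to make the demo repeatable
    # Start from a random population of candidate points.
    population = [rng.uniform(-10.0, 10.0) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: the fitter half of the population becomes the parents.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        # Elitism: the best point found so far survives unchanged.
        children = [parents[0]]
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = (a + b) / 2.0         # crossover: blend two parents
            child += rng.gauss(0.0, 0.3)  # mutation: small random change
            children.append(child)
        population = children
    return max(population, key=fitness)


# Toy problem: the fitness peaks at x = 3.
best = genetic_maximize(lambda x: -(x - 3.0) ** 2)
print(best)
```

Unlike a classical method, each iteration keeps a whole population, and the next population depends on random numbers rather than on a deterministic step.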
[Diagram: overlapping fields of Computer Science, Data Science, Machine Learning, and Cognitive Science]
COGNITIVE
SCIENCE
examines the nature, the tasks, and the
functions of cognition
• language
• perception
• memory
• attention
• reasoning
• emotion
[Diagram: overlapping fields of Computer Science, Data Science, Machine Learning, Cognitive Science, and Artificial Intelligence]
AI: TYPES
• Specialized:
performs only one task or subset of tasks, usually better than humans
(compare to dogs, which smell better than we do)
• Generic (human level and super-human)
MACHINE POWER:
MOORE’S LAW
WHEN WILL GENERIC AI APPEAR?
BRAIN VS AI
Brain
• Massive parallelism:
100 000 000 000 “cores”
• Extreme “bandwidth”:
700 000 000 000 000 connections
between “cores”
• ~10^18 “transistors”
• Asynchronous
• Adaptive hardware: neuroplasticity
• “Analog”, but suitable for differential
and integral computations
Present day computer
• Non-parallel architecture
• Low bandwidth
• ~10^9 transistors
• Synchronous (clock rate)
• Static hardware
• Digital, but linear computations
SPECIALIZED AI
CREATES MORE RISKS
THAN GENERIC
NEURAL NETWORKS
UNDERSTANDING NEURAL
NETWORKS
UNDERSTANDING NEURAL
NETWORKS
Neural network as a graph of gateways
[Diagram: the input passes through a multiply gate (* w) and then an add gate (+ b)]
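A single neuron viewed as a small graph of gates can be sketched in Python as below. The function names `forward` and `backward` and the example numbers are assumptions for illustration; the backward pass shows how each gate hands the gradient back to its inputs, which is the basis of training.

```python
def forward(x, w, b):
    product = x * w       # multiply gate
    output = product + b  # add gate
    return output


def backward(x, w, grad_output=1.0):
    # The add gate passes the gradient unchanged to both of its inputs.
    grad_product = grad_output
    grad_b = grad_output
    # The multiply gate scales the gradient by the value of the other input.
    grad_w = grad_product * x
    grad_x = grad_product * w
    return grad_x, grad_w, grad_b


x, w, b = 2.0, -3.0, 0.5
y = forward(x, w, b)
grad_x, grad_w, grad_b = backward(x, w)
print(y, grad_x, grad_w, grad_b)  # -5.5 -3.0 2.0 1.0
```

A full network chains many such gates, so the same two local rules are applied repeatedly from the output back to every weight.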
UNDERSTANDING NEURAL
NETWORKS: HERE COME TENSORS
WEIGHTS AND BIASES: HOW DOES
THIS WORK
HOW NN CLASSIFIES
AI DISRUPTION 2016: KEY FACTORS
1. Machine Power and Cloud Computing
2. Big Data and its availability
3. Frameworks and ready-to-go cloud APIs
STARTUP TODO
Design a product with USP and then
1. Look for the source of data
2. Find what you can personalize
3. Use cloud computing power
4. Use ready-to-go APIs when available
5. Don’t be afraid of creating and training your own neural nets
6. Always use a proper ready-to-go framework for that purpose
STARTUP TODO
[Diagram: Product, Data, Added value, AI dev/training]
NEURAL NETWORKS OVERVIEW
ARCHITECTURES
• Linear (feedforward) / recurrent
• Non-deep / deep
• Deterministic / probabilistic
• Supervised / unsupervised / reinforcement
APPLICATIONS
• Computer vision
• NLP
• Translation
• Text-to-speech and vice versa
• Generative methods
• Personalization and adaptive methods
• Complex solutions implementing different types of AI to obtain a cohesive
result
GENERATIVE METHODS AND
PRODUCTS
THE NEXT
REMBRANDT
https://www.nextrembrandt.com
“We now had a digital file true to
Rembrandt’s style in content, shapes,
and lighting. But paintings aren’t just 2D
— they have a remarkable three-
dimensionality that comes from
brushstrokes and layers of paint. To
recreate this texture, we had to study
3D scans of Rembrandt’s paintings and
analyze the intricate layers on top of
the canvas.”
CONTACTS
Maxim Orlovsky
About.me profile
(all social networks):
BICA Labs
Scientific enquiries:
Qoderoom
Business enquiries:
CTO PART
MODERN NEURAL NETWORK
ARCHITECTURES AND HOW THEY WORK
NN MECHANICS
CONVOLUTION FILTER
(LAPLACIAN)
HOW CONVOLUTION HAPPENS
INSIDE NEURAL NETWORK
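To make the sliding-window idea concrete, here is a minimal Python sketch that applies a 3x3 Laplacian filter to a small image stored as nested lists. The function name `convolve2d`, the particular Laplacian kernel variant, and the toy images are assumptions for the example. As in most neural network libraries, the kernel is applied without flipping (strictly a cross-correlation); for a symmetric kernel like this one the result is the same.

```python
# One common 3x3 discrete Laplacian kernel (other variants exist).
LAPLACIAN = [
    [0, 1, 0],
    [1, -4, 1],
    [0, 1, 0],
]


def convolve2d(image, kernel):
    """Slide the kernel over the image with no padding and stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Weighted sum of the image patch under the kernel.
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# A flat image has no edges, so the Laplacian response is zero everywhere.
flat = [[7] * 5 for _ in range(5)]
flat_response = convolve2d(flat, LAPLACIAN)

# A single bright pixel produces a strong response around it.
impulse = [[0] * 5 for _ in range(5)]
impulse[2][2] = 1
impulse_response = convolve2d(impulse, LAPLACIAN)
print(flat_response, impulse_response)
```

In a convolution layer the kernel values are not fixed in advance like this; they are the weights the network learns.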
CONVOLUTION LAYER
MICROSOFT MSRA PROJECT
GENERATING
HOUSE NUMBERS
WITH RNN
Credits: Fei-Fei Li & Andrej Karpathy & Justin Johnson, Stanford University
LONG SHORT-TERM MEMORY (LSTM):
BEST RECURRENT ARCHITECTURE
LSTM: SIMPLIFICATION
[Diagram: simplified LSTM cell with the labels “Memory”, New data, Previous result, Output, “Forget and remember”, and Correct by “recalling”]
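One step of an LSTM cell can be sketched in plain Python with scalar values. This follows the commonly used LSTM equations; the mapping of the slide labels (“forget and remember”, “recalling”) to specific gates in the comments is an interpretation, and the function name `lstm_step`, the parameter layout, and the weight values are assumptions for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell.

    x is the new data, h_prev the previous result, c_prev the memory.
    params maps each gate name to a (w_x, w_h, bias) tuple.
    """
    def gate(name, activation):
        w_x, w_h, bias = params[name]
        return activation(w_x * x + w_h * h_prev + bias)

    forget = gate("forget", sigmoid)          # how much old memory to keep
    write = gate("input", sigmoid)            # how much new content to store
    candidate = gate("candidate", math.tanh)  # proposed new memory content
    read = gate("output", sigmoid)            # how much memory to expose

    c = forget * c_prev + write * candidate   # "forget and remember"
    h = read * math.tanh(c)                   # output "recalled" from memory
    return h, c


# Hypothetical weights; in practice they are learned during training.
gate_names = ("forget", "input", "candidate", "output")
params = {name: (0.5, 0.5, 0.0) for name in gate_names}

h, c = 0.0, 0.0
for x in [1.0, 0.5, -0.3]:
    h, c = lstm_step(x, h, c, params)
print(h, c)
```

The memory value `c` is carried from step to step, which is what lets the cell keep information across long or uneven gaps in a sequence.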
DESIGNING NEURAL NET WITH
YOUR DATA
1. Find a way to embed data
2. Understand what you’d like to receive from the network
3. Design a proper network architecture:
   1. Use recurrent networks for time-based data
   2. Use LSTM networks if time intervals between the data are large or uneven
   3. Select the number of layers according to data dimensionality
   4. Have a training set? Use supervised learning. Otherwise, use unsupervised or reinforcement learning.
4. Visualize and iterate hundreds of times
5. PROFIT!
FRAMEWORKS
• TensorFlow
• Theano
• Torch
• CNTK
• Caffe
CAFFE
• http://caffe.berkeleyvision.org
• From UC Berkeley
• Written in C++
• Create networks in Protocol Buffers: no
need to write code
• Has Python and MATLAB bindings
• Good for feedforward networks
• Good for fine-tuning existing networks
• Not good for recurrent networks
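layer {
  name: "ip1"
  type: "InnerProduct"
  # Learning-rate multipliers: first for the weights, second for the biases
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  inner_product_param {
    num_output: 500  # number of output neurons
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
  bottom: "pool2"  # input blob
  top: "ip1"       # output blob
}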
Credits: Fei-Fei Li & Andrej Karpathy & Justin Johnson, Stanford University
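What an inner product (fully connected) layer like "ip1" computes can be sketched in plain Python. This is an illustration only and not Caffe code: the function names, the input size of 800, and the exact Xavier-style range are assumptions, since the layer definition above specifies only the number of outputs and the filler types.

```python
import math
import random


def make_inner_product(num_input, num_output, seed=0):
    rng = random.Random(seed)
    # Assumption: Xavier-style uniform initialization in [-limit, limit]
    # with limit = sqrt(3 / fan_in); other variants of the formula exist.
    limit = math.sqrt(3.0 / num_input)
    weights = [[rng.uniform(-limit, limit) for _ in range(num_input)]
               for _ in range(num_output)]
    biases = [0.0] * num_output  # constant filler, assumed value 0
    return weights, biases


def inner_product(inputs, weights, biases):
    # Each output is a weighted sum of all inputs plus its own bias.
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


# 800 inputs is a hypothetical size for the flattened "pool2" blob.
weights, biases = make_inner_product(num_input=800, num_output=500)
outputs = inner_product([1.0] * 800, weights, biases)
print(len(outputs))  # 500
```

Scaling the initial weights by the number of inputs keeps the outputs in a reasonable range at the start of training, which is the motivation behind Xavier-style fillers.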
TORCH
• http://torch.ch
• From New York University
• Written in C and Lua
• Used a lot at Facebook, DeepMind
• Create networks in Lua
• You usually write your own training code
• Lots of modular pieces that are easy to combine
• Less plug-and-play than Caffe
• Easy to write your own layer types and run on
GPU
• Not good for recurrent networks
Credits: Fei-Fei Li & Andrej Karpathy & Justin Johnson, Stanford University
LUA
• High level scripting language,
easy to interface with C
• Similar to JavaScript:
  • One data structure:
    table == JS object
  • Prototypical inheritance:
    metatable == JS prototype
  • First-class functions
• Downsides:
  • 1-indexed – bad for tensors =(
  • Variables global by default =(
  • Small standard library
Credits: Fei-Fei Li & Andrej Karpathy & Justin Johnson, Stanford University
THEANO
• http://deeplearning.net/software/theano
• From University of Montreal
• Python + numpy
• Embracing computation graphs, symbolic
computation
• RNNs fit nicely in computational graph
• Raw Theano is somewhat low-level
• High level wrappers (Keras, Lasagne) ease the
pain
• Large models can have long compile times
• Much “fatter” than Torch; more magic
Credits: Fei-Fei Li & Andrej Karpathy & Justin Johnson, Stanford University
CNTK – THE MICROSOFT
COGNITIVE TOOLKIT
• https://www.cntk.ai
• From Microsoft
• Written in C++
• Programmed in Python and C++
• BrainScript: powerful abstraction
• Good for both recurrent and convolutional nets
TENSORFLOW
• https://www.tensorflow.org
• From Google
• Python + numpy
• Computational graph abstraction, like Theano;
great for RNNs
• Easy visualizations (TensorBoard)
• Multi-GPU and multi-node training
• Data AND model parallelism; best of all
frameworks
• Slower than other frameworks right now
• Much “fatter” than Torch; more magic
Credits: Fei-Fei Li & Andrej Karpathy & Justin Johnson, Stanford University
OVERVIEW
Credits: Fei-Fei Li & Andrej Karpathy & Justin Johnson, Stanford University
DATA SCIENCE IN NEURAL NETWORKS
Dimensionality reduction
T-DISTRIBUTED STOCHASTIC
NEIGHBOR EMBEDDING
COMPUTER TRANSLATION
DATA SCIENCE IN NEURAL NETWORKS
Inceptionism
INCEPTIONISM
INCEPTIONISM: GENERATING
INCEPTIONISM: ENHANCING
INCEPTIONISM: ITERATIONS
MICROSOFT AZURE MACHINE LEARNING
EXPLORING AZURE COGNITIVE SERVICES
Demo
USING AZURE ML TEXT ANALYTICS API
Demo
USING AZURE DEEP LEARNING INSTANCES
Demo
FURTHER READING
Christopher Olah
Former Google Brain project member
Andrej Karpathy
DeepMind, OpenAI, Stanford
Stanford CS231n
Neural networks & computer vision
OTHER MATERIALS (IN RUSSIAN)
AI and our future
Interview with «Platfor.ma»
Dangers of AI
Interview on Radio Aristocrats («Радио Аристократы»)
AI & Blockchain
Conference talk at
Blockchaincof
