Data Science, Machine Learning and Neural Networks

DATA SCIENCE,
MACHINE LEARNING,
NEURAL NETWORKS
Maxim Orlovsky, PhD, MD
CloudBusinessCity, Mentor (cloudbusinesscity.com)
GRPIIIQ, CEO (qoderoom.com, banqsystems.com)
BICA Labs, Head (bicalabs.org)

INTRODUCTION
#CloudBusinessCity #MSRoadShowDataScience

[Diagram: Computer Science, Data Science, Machine Learning, Cognitive Science, Artificial Intelligence]

BIG DATA
• Volume
• Velocity
• Variety
• Variability
• Veracity
• analysis
• capture
• data curation
• search
• sharing
• storage
• transfer
• visualization
• querying
• updating
• information privacy

DATA MINING
computational process of
discovering patterns in large data
sets involving methods at the
intersection of artificial intelligence,
machine learning, statistics, and
database systems
Pre-Data Science buzzword :)

DATA SCIENCE
• Part of Computer Science
• Interdisciplinary field
• Data -> Knowledge
• Predictive analytics

CLOUD
COMPUTING
provides shared computer processing resources
and data to computers and other
devices on demand
Cloud computing reduces the cost of
Data Science research and lowers
the entry barrier for startups

[Diagram: Computer Science, Data Science, Machine Learning, Cognitive Science]

MACHINE LEARNING
“GIVES COMPUTERS THE ABILITY TO LEARN WITHOUT
BEING EXPLICITLY PROGRAMMED”
Arthur Samuel, 1959

THE DIFFERENCE BETWEEN
ML AND PROGRAMMING
                     Programming      Machine Learning
Result of program    Deterministic    Non-deterministic
Program              Code             Architecture
Data storage         External         Embedded
Changeability        By human         By machine

MACHINE LEARNING IS MORE
THAN AI
• Clustering
• Regression
• Dimensionality reduction
• Decision trees
• Genetic and evolutionary algorithms
Machine learning is when a computer updates its own algorithm depending on
the data or its results
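As a minimal sketch of a program that "updates itself" from data, the regression item above can be illustrated with gradient descent on a straight line. The function name `fit_line`, the learning rate, the epoch count and the toy data are all assumptions made for illustration, not something taken from the slides.

```python
def fit_line(xs, ys, lr=0.01, epochs=5000):
    """Fit y ≈ w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # The parameters are changed by the data, not by a programmer.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy data (assumed for illustration), generated by y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(w, b)  # close to 2.0 and 1.0
```

The code never contains the rule "y = 2x + 1"; it only contains a procedure for adjusting `w` and `b` until the predictions match the data.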

TYPES OF MACHINE LEARNING
• Supervised:
when you know what’s right and wrong (i.e. have labelled training sets)
• Unsupervised:
when you don’t know the right answers (there are no labelled training sets)
• Reinforcement:
learning by trial and error from reward signals instead of labelled answers;
similar to how humans learn

K-METHODS
• k-Means Clustering: partition n
observations into k clusters
• k-Nearest Neighbors: assign a class
according to the classes of the k closest labelled points
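Both k-methods above can be sketched in a few lines of plain Python. This is an illustrative sketch, not a production implementation: the helper names, the toy points and the fixed initial centroids are assumptions (real k-means usually starts from random centroids).

```python
from collections import Counter

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, centroids, iterations=10):
    """Partition points into len(centroids) clusters (Lloyd's algorithm)."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist2(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its old centroid).
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters

def k_nearest_neighbors(labelled, query, k=3):
    """Assign the majority class among the k labelled points closest to query."""
    nearest = sorted(labelled, key=lambda item: dist2(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Toy data (assumed): two obvious groups of 2D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = k_means(points, centroids=[points[0], points[3]])

# Use the clusters found by k-means as labels for k-NN.
labelled = [(p, "low") for p in clusters[0]] + [(p, "high") for p in clusters[1]]
print(centroids)
print(k_nearest_neighbors(labelled, (2.0, 2.0)))  # prints "low"
```

k-means needs no labels (unsupervised), while k-NN relies on labelled points (supervised); here the first feeds the second only to keep the example short.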

GENETIC AND EVOLUTIONARY
ALGORITHMS
Classical Algorithm:
• Generates a single point at each iteration. The sequence of points approaches an optimal solution.
• Selects the next point in the sequence by a deterministic computation.
Genetic Algorithm:
• Generates a population of points at each iteration. The best point in the population approaches an optimal solution.
• Selects the next population by a computation which uses random number generators.
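The genetic side of this comparison can be sketched as follows. The objective function, population size, mutation strength and the averaging crossover are all assumptions chosen to keep the example small; real genetic algorithms vary widely in how they encode, select and recombine candidates.

```python
import random

def fitness(x):
    """Toy objective (an assumption for illustration): maximal at x = 3."""
    return -(x - 3.0) ** 2

def genetic_search(generations=200, population_size=20, seed=0):
    rng = random.Random(seed)  # random number generator drives the search
    # Start from a whole population of candidate points, not a single point.
    population = [rng.uniform(-10.0, 10.0) for _ in range(population_size)]
    for _ in range(generations):
        # Selection: keep the fitter half of the population as parents.
        population.sort(key=fitness, reverse=True)
        parents = population[: population_size // 2]
        # Crossover and mutation: average two random parents, add Gaussian noise.
        children = [
            (rng.choice(parents) + rng.choice(parents)) / 2 + rng.gauss(0.0, 0.3)
            for _ in range(population_size - len(parents))
        ]
        population = parents + children
    # The best point in the final population approximates the optimum.
    return max(population, key=fitness)

best = genetic_search()
print(best)
```

Because the parents survive into the next generation, the best candidate never gets worse from one iteration to the next.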

COGNITIVE
SCIENCE
examines the nature, the tasks, and the
functions of cognition
• language
• perception
• memory
• attention
• reasoning
• emotion

AI: TYPES
• Specialized:
performs only one task or subset of tasks, usually better than humans
(compare to dogs, which smell better than we do)
• Generic (human level and super-human)

BRAIN VS AI
Brain
• Massive parallelism:
100 000 000 000 “cores”
• Extreme “bandwidth”:
700 000 000 000 000 connections
between “cores”
• ~10^18 “transistors”
• Asynchronous
• Adaptive hardware: neuroplasticity
• “Analog”, but suitable for differential
and integral computations
Present-day computer
• Non-parallel architecture
• Low bandwidth
• ~10^9 transistors
• Synchronous (clock rate)
• Static hardware
• Digital, but linear computations

SPECIALIZED AI
CREATES MORE RISKS
THAN GENERIC

UNDERSTANDING NEURAL
NETWORKS #1

NETWORKS
Neural network as a graph of gateways
[Diagram: a gate multiplies its input by a weight w and adds a bias b]
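The gate pictured above (multiply by a weight w, add a bias b) can be sketched in code. The sigmoid activation, the function names and the concrete numbers are assumptions added for illustration; the slide itself only shows the multiply-and-add step.

```python
import math

def neuron(inputs, weights, bias):
    """One gate: multiply each input by its weight, sum, add the bias.

    The sigmoid squashing at the end is an assumed activation function.
    """
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weight_rows, biases):
    """A layer is several neurons reading the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

# Hypothetical weights and biases, chosen only for illustration.
hidden = layer([1.0, 0.0], weight_rows=[[0.5, -0.5], [-1.0, 1.0]], biases=[0.0, 0.5])
output = neuron(hidden, weights=[1.0, -1.0], bias=0.0)
print(hidden, output)
```

Connecting the output of one layer to the input of the next is what turns individual gates into a graph; training consists of adjusting the weights and biases.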

NETWORKS: HERE COME TENSORS

WEIGHTS AND BIASES: HOW DOES
THIS WORK

AI DISRUPTION 2016: KEY FACTORS
1. Machine Power and Cloud Computing
2. Big Data and its availability
3. Frameworks and ready-to-go cloud APIs

STARTUP TODO
Design a product with USP and then
1. Look for the source of data
2. Find what you can personalize
3. Use cloud computing power
4. Use ready-to-go APIs when available
5. Don’t be afraid of creating and training your own neural nets
6. Always use a proper ready-to-go framework for that purpose

STARTUP TODO
[Diagram: Product, Data, Added value, AI dev/training]
ARCHITECTURES
• Linear / recurrent
• Non-deep / deep
• Deterministic / probabilistic
• Supervised / unsupervised / reinforcement

APPLICATIONS
• Computer vision
• NLP
• Translation
• Text-to-speech and vice versa
• Generative methods
• Personalization and adaptive methods
• Complex solutions implementing different types of AI to obtain a cohesive
result

GENERATIVE METHODS AND
PRODUCTS

THE NEXT
REMBRANDT
https://www.nextrembrandt.com
“We now had a digital file true to
Rembrandt’s style in content, shapes,
and lighting. But paintings aren’t just 2D
— they have a remarkable three-
dimensionality that comes from
brushstrokes and layers of paint. To
recreate this texture, we had to study
3D scans of Rembrandt’s paintings and
analyze the intricate layers on top of
the canvas.”

CONTACTS
Maxim Orlovsky
About.me profile
(all social networks):
BICA Labs
Scientific enquiries:
Qoderoom
Business enquiries:

MODERN NEURAL NETWORK
ARCHITECTURES AND HOW THEY WORK

CONVOLUTION FILTER
(LAPLACIAN)
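A minimal sketch of applying a Laplacian filter to an image, assuming the common 4-neighbour 3x3 kernel and a tiny hand-made image (both are assumptions; the slide's own figure is not reproduced here). The filter responds where brightness changes and stays at zero in flat regions.

```python
# 4-neighbour discrete Laplacian kernel (a common choice, assumed here).
LAPLACIAN = [
    [0,  1, 0],
    [1, -4, 1],
    [0,  1, 0],
]

def convolve2d(image, kernel):
    """Slide a square kernel over a list-of-lists image without padding.

    Strictly speaking this is cross-correlation; for a symmetric kernel
    such as the Laplacian it gives the same result as convolution.
    """
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(k) for j in range(k))
            for c in range(cols)
        ]
        for r in range(rows)
    ]

# Toy image (assumed): dark on the left, bright on the right.
image = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]
print(convolve2d(image, LAPLACIAN))  # → [[1, -1, 0], [1, -1, 0]]
```

The non-zero values sit exactly on the vertical edge, and the flat bright area on the right produces zeros.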

HOW CONVOLUTION HAPPENS
INSIDE NEURAL NETWORK

GENERATING
HOUSE NUMBERS
WITH RNN
Credits: Fei-Fei Li & Andrej Karpathy & Justin Johnson, Stanford University

LONG SHORT-TERM MEMORY (LSTM):
BEST RECURRENT ARCHITECTURE

LSTM: SIMPLIFICATION
[Diagram labels: “Memory”, New data, Previous result, Output, “Forget and remember”, Correct by “recalling”]
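The simplified picture above can be related to the standard LSTM cell equations. The sketch below uses scalars instead of vectors to stay short; the mapping of the slide's labels onto the forget, input and output gates is an interpretation, and the parameter names and values are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell.

    x: new data, h_prev: previous result, c_prev: "memory" (cell state),
    p: dict of hypothetical parameters (w_*, u_*, b_*) for each gate.
    """
    f = sigmoid(p["w_f"] * x + p["u_f"] * h_prev + p["b_f"])    # forget gate
    i = sigmoid(p["w_i"] * x + p["u_i"] * h_prev + p["b_i"])    # input ("remember") gate
    g = math.tanh(p["w_g"] * x + p["u_g"] * h_prev + p["b_g"])  # candidate memory
    o = sigmoid(p["w_o"] * x + p["u_o"] * h_prev + p["b_o"])    # output ("recall") gate
    c = f * c_prev + i * g   # forget part of the old memory, remember new data
    h = o * math.tanh(c)     # the output is a gated read of the memory
    return h, c

# All parameters set to 0.5 purely for illustration (keys w_f, u_f, b_f, ...).
params = {f"{kind}_{gate}": 0.5 for kind in "wub" for gate in "figo"}
h, c = 0.0, 0.0
for x in [1.0, 0.5, -1.0]:
    h, c = lstm_step(x, h, c, params)
    print(round(h, 3), round(c, 3))
```

The memory `c` is carried from step to step and is only changed by multiplication with the forget gate and addition of new content, which is what lets an LSTM keep information over long sequences.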

DESIGNING NEURAL NET WITH
YOUR DATA
1. Find a way to embed data
2. Understand what you’d like to receive from the network
3. Design a proper network architecture:
   1. Use recurrent networks for time-based data
   2. Use LSTM networks if time intervals between the data are large or uneven
   3. Select the number of layers according to data dimensionality
   4. Have a training set? Use supervised learning. Otherwise, use reinforcement learning.
4. Visualize and re-iterate hundreds of times
5. PROFIT!

FRAMEWORKS
• TensorFlow
• Theano
• Torch
• CNTK
• Caffe

CAFFE
• http://caffe.berkeleyvision.org
• From UC Berkeley
• Written in C++
• Create networks in Protocol Buffers: no
need to write code
• Has Python and MATLAB bindings
• Good for feedforward networks
• Good for finetuning existing networks
• Not good for recurrent networks
layer {
  name: "ip1"
  type: "InnerProduct"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
  bottom: "pool2"
  top: "ip1"
}

TORCH
• http://torch.ch
• From New York University
• Written in C and Lua
• Used a lot at Facebook, DeepMind
• Create networks in Lua
• You usually write your own training code
• Lots of modular pieces that are easy to combine
• Less plug-and-play than Caffe
• Easy to write your own layer types and run on
GPU
• Not good for recurrent networks

LUA
• High level scripting language,
easy to interface with C
• Similar to JavaScript:
  • One data structure: table == JS object
  • Prototypal inheritance: metatable == JS prototype
  • First-class functions
• Downsides:
  • 1-indexed: bad for tensors =(
  • Variables global by default =(
  • Small standard library

THEANO
• http://deeplearning.net/software/theano
• From University of Montreal
• Python + numpy
• Embracing computation graphs, symbolic
computation
• RNNs fit nicely in computational graph
• Raw Theano is somewhat low-level
• High level wrappers (Keras, Lasagne) ease the
pain
• Large models can have long compile times
• Much “fatter” than Torch; more magic

CNTK – THE MICROSOFT
COGNITIVE TOOLKIT
• https://www.cntk.ai
• From Microsoft
• Written in C++
• Programmed in Python and C++
• BrainScript: powerful abstraction
• Good for both recurrent and convolutional nets

TENSORFLOW
• https://www.tensorflow.org
• From Google
• Python + numpy
• Computational graph abstraction, like Theano;
great for RNNs
• Easy visualizations (TensorBoard)
• Multi-GPU and multi-node training
• Data AND model parallelism; best of all
frameworks
• Slower than other frameworks right now
• Much “fatter” than Torch; more magic

OVERVIEW

DATA SCIENCE IN NEURAL NETWORKS
Dimensionality reduction

T-DISTRIBUTED STOCHASTIC
NEIGHBOR EMBEDDING

DATA SCIENCE IN NEURAL NETWORKS
Inceptionism

MICROSOFT AZURE MACHINE LEARNING

EXPLORING AZURE COGNITIVE SERVICES
Demo

USING AZURE ML TEXT ANALYTICS API
Demo

USING AZURE DEEP LEARNING INSTANCES
Demo

FURTHER READING
Christopher Olah
Ex Google Brain project member
Andrej Karpathy
DeepMind, OpenAI, Stanford
Stanford CS231n
Neural networks & computer vision

OTHER MATERIALS (IN RUSSIAN)
AI and our future
Interview with “Platfor.ma”
Dangers of AI
Interview with “Radio Aristocrats”
AI & Blockchain
Talk at the Blockchaincof conference
