Machine Learning
Outline
Introduction to Machine Learning
Supervised vs Unsupervised Learning
Machine Learning Pipeline
Machine Learning Algorithms
Introduction to Machine Learning
How to achieve AI?
Working of Rule based systems
An example Rule based system
Limitations of Rules-Based Systems
In other words: Why use machine learning?
• It takes a lot of time to list out all the rules.
• Changing rules is tedious.
• As the list of rules grows, it becomes too difficult to manage and accumulates redundancies.
• If the person who wrote the initial rules leaves, you have to spend time and resources catching up on the long list of rules.
Introduction to Machine Learning
*Machine Learning (ML) is the scientific study of algorithms and statistical
models that computer systems use in order to perform a specific task by
relying on patterns and inference instead of explicit instructions.
*ML algorithms are used in a wide variety of applications, such as
computer vision (CV), data mining, natural language processing (NLP),
etc.
Machine Learning (ML)
• ML refers to the ability of computers to learn without being explicitly programmed.
Traditional Programming vs. Machine Learning
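The contrast can be pictured with a small Python sketch. Everything in it is hypothetical: the temperature readings, the labels, the hand-picked threshold of 30, and the function names are assumptions made for illustration. In the traditional version a person writes the rule; in the learning version the rule (here, a single threshold) is chosen from labeled examples.

```python
# Traditional programming: a person writes the rule by hand.
def is_hot_rule(temp_c):
    return temp_c > 30.0  # threshold picked by the programmer (assumed value)


# Machine learning (toy version): the threshold is chosen from labeled data.
def learn_threshold(temps, labels):
    """Return the candidate threshold that misclassifies the fewest examples."""
    values = sorted(set(temps))
    # Candidate thresholds are midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(t):
        return sum((x > t) != y for x, y in zip(temps, labels))

    return min(candidates, key=errors)


# Hypothetical labeled examples: temperature in Celsius, and whether it felt hot.
temps = [18.0, 22.0, 25.0, 27.0, 29.0, 33.0]
labels = [False, False, False, True, True, True]

threshold = learn_threshold(temps, labels)


def is_hot_learned(temp_c):
    return temp_c > threshold
```

If the labeled examples change, rerunning `learn_threshold` updates the rule without anyone editing the code, which is the point of the comparison.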
Machine Learning (ML)
Most common tool for Data analytics
Data is the Fuel
Use features in the data to create a predictive model
Machine Learning Applications
https://youtu.be/tF4DML7FIWk
ML Pipeline
ML Pipeline: Sequential View
Types of Learning
Classification
• Classification:
• To predict categorical responses
• Ex: Face Recognition, Digit Recognition
• MATLAB: Classification Learner app
Regression
• Regression:
• To predict continuous responses
• Examples:
• Temperature prediction,
• Stock price prediction
Classification vs Regression vs Clustering
• Clustering:
• To divide given objects into groups of similar nature
• Ex: Given images of animals, group them based on type
Classification vs Regression vs Clustering
Task | Learning Type | Output Type | Examples
Classification | Supervised | Discrete class labels | Predicting whether an email is spam or not; predicting the type of flower based on its features
Regression | Supervised | Continuous values | Predicting the price of a house based on its features; predicting the temperature for tomorrow
Clustering | Unsupervised | Grouping of data | Grouping customers based on purchasing behaviour; identifying distinct groups in gene expression data
Standard Supervised Learning algorithms
• Classification Algorithms:
Logistic Regression
K-Nearest Neighbors
Decision Trees
Naïve Bayes Classifier
Support Vector Machines
Neural Network Models
• Regression Algorithms:
Linear Regression
Polynomial Regression
Decision Tree Regression
Support Vector Regression
Neural Network Models
Standard Clustering algorithms
Partitioning-based Clustering:
K-Means, K-Medoids
Density-based Clustering:
DBSCAN
Hierarchical Clustering:
Agglomerative, Divisive
Graph Clustering:
Spectral Clustering
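As an illustration of partitioning-based clustering, here is a minimal K-Means sketch (Lloyd's algorithm) on one-dimensional data. The points and the starting centers are made up for the example, and fixing the starting centers by hand is a simplification; practical implementations usually pick them randomly or with a seeding heuristic.

```python
def kmeans_1d(points, centers, n_iter=10):
    """Lloyd's algorithm in one dimension, starting from the given centers."""
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters


# Hypothetical data with two obvious groups.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, clusters = kmeans_1d(points, centers=[1.0, 3.0])
```

No labels are used anywhere, which is what makes this unsupervised: the groups come only from distances between the points.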
Machine Learning Algorithms
Supervised Learning Workflow
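A typical supervised workflow holds out part of the labeled data, trains on the rest, and measures accuracy on the held-out part. The sketch below assumes that common split/train/predict/evaluate sequence; the dataset, the 25% test fraction, and the nearest-centroid model are all hypothetical choices made to keep the example short.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    n_test = max(1, int(len(rows) * test_fraction))
    return rows[n_test:], rows[:n_test]


def fit_centroids(train):
    """Training step: compute the mean feature value of each class."""
    sums, counts = {}, {}
    for x, y in train:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(centroids, x):
    """Prediction step: choose the class whose centroid is closest to x."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))


def accuracy(centroids, rows):
    """Evaluation step: fraction of rows predicted correctly."""
    return sum(predict(centroids, x) == y for x, y in rows) / len(rows)


# Hypothetical 1-D dataset of (feature, label) pairs.
data = [(1.0, "a"), (1.5, "a"), (2.0, "a"), (2.5, "a"),
        (7.0, "b"), (7.5, "b"), (8.0, "b"), (8.5, "b")]

train, test = train_test_split(data)
model = fit_centroids(train)
test_accuracy = accuracy(model, test)
```

Evaluating on rows the model never saw during training is what gives an honest estimate of how it will behave on new data.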
When Should You Use Machine Learning?
• Consider machine learning for a complex task or problem involving:
  • A large amount of data and lots of variables
  • No existing formula or equation
• Machine learning is a good option when:
  • Hand-written rules and equations are too complex. Ex: face recognition and speech recognition
  • The rules of a task are constantly changing. Ex: fraud detection from transaction records
Popular Machine Learning Algorithms
Machine Learning Algorithms
Linear Regression
Models the relationship between the dependent variable and the independent variables using a linear equation.
Can be used to predict continuous numeric values.
Finds the best-fit regression line.
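For a single independent variable, the best-fit line in the least-squares sense has a closed form: the slope is the covariance of x and y divided by the variance of x, and the intercept makes the line pass through the means. The sketch below assumes ordinary least squares as the meaning of "best fit"; the data points are made up and lie exactly on y = 2x + 1.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data generated from y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept  # predicted y for a new x = 5
```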
Linear regression
Logistic Regression
Popular classification algorithm
Linear Classifier
Probabilistic approach
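A minimal sketch of the idea, assuming one feature and plain gradient descent on the log loss: the model computes a linear score w*x + b (the linear classifier part) and passes it through a sigmoid to get a probability (the probabilistic part). The data, learning rate, and epoch count are hypothetical values chosen for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit P(y=1 | x) = sigmoid(w*x + b) by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for each example under the current parameters.
        residuals = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(r * x for r, x in zip(residuals, xs)) / n
        grad_b = sum(residuals) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical data: hours studied and whether the exam was passed (1) or not (0).
xs = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = fit_logistic(xs, ys)
p_low = sigmoid(w * 1.0 + b)   # estimated probability of passing after 1 hour
p_high = sigmoid(w * 4.0 + b)  # estimated probability of passing after 4 hours
```

A class label is obtained by thresholding the probability, commonly at 0.5, which corresponds to the linear boundary w*x + b = 0.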
K-Nearest Neighbors (KNN) Approach
Suitable for both classification and regression
Simple non-parametric approach
Does not involve an explicit training phase; the training data is stored and used directly at prediction time
Identifies the K nearest neighbors (NN) of a query point
Majority voting among the NN for classification
Average of the NN for regression
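The two prediction rules can be sketched in a few lines of Python. This is a toy version that assumes a single numeric feature and absolute difference as the distance; the datasets and the default k = 3 are hypothetical.

```python
from collections import Counter


def k_nearest(train, query, k):
    """Return the k training pairs (x, y) whose x is closest to the query."""
    return sorted(train, key=lambda pair: abs(pair[0] - query))[:k]


def knn_classify(train, query, k=3):
    """Classification: majority vote among the k nearest labels."""
    labels = [y for _, y in k_nearest(train, query, k)]
    return Counter(labels).most_common(1)[0][0]


def knn_regress(train, query, k=3):
    """Regression: average of the k nearest target values."""
    values = [y for _, y in k_nearest(train, query, k)]
    return sum(values) / len(values)


# Hypothetical training data.
labeled = [(1.0, "small"), (2.0, "small"), (3.0, "small"),
           (8.0, "large"), (9.0, "large"), (10.0, "large")]
numeric = [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0), (8.0, 80.0), (9.0, 90.0)]

label = knn_classify(labeled, 4.0)
value = knn_regress(numeric, 2.5)
```

Note that there is no fitting function: all the work happens when a query arrives, which is why KNN is often called a lazy learner.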
Decision Trees
Used for both classification and regression tasks.
A tree-like structure is generated based on training data
Each internal node represents a feature or attribute
Each branch represents a decision rule
Each leaf represents a predicted value.
Decision Trees recursively partition data on feature values to make predictions.
Easy to interpret and visualize
Can handle both numerical and categorical features
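The recursive partitioning can be sketched as follows. This is a simplified classifier that assumes numeric features, Gini impurity as the split criterion, and thresholds taken directly from observed feature values; the flower measurements are invented for the example, and real implementations add pruning and other refinements.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0 means pure)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows):
    """Find (score, feature index, threshold) with the lowest weighted Gini."""
    best = None
    for f in range(len(rows[0][0])):
        for threshold in sorted({x[f] for x, _ in rows}):
            left = [y for x, y in rows if x[f] <= threshold]
            right = [y for x, y in rows if x[f] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, threshold)
    return best


def build_tree(rows, depth=0, max_depth=3):
    """Return a leaf label, or an internal node (feature, threshold, left, right)."""
    labels = [y for _, y in rows]
    pure_or_deep = len(set(labels)) == 1 or depth == max_depth
    split = None if pure_or_deep else best_split(rows)
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    _, f, threshold = split
    left = [r for r in rows if r[0][f] <= threshold]
    right = [r for r in rows if r[0][f] > threshold]
    return (f, threshold,
            build_tree(left, depth + 1, max_depth),
            build_tree(right, depth + 1, max_depth))


def predict(tree, x):
    """Follow decision rules from the root until a leaf label is reached."""
    while isinstance(tree, tuple):
        f, threshold, left, right = tree
        tree = left if x[f] <= threshold else right
    return tree


# Hypothetical flowers: (petal length, petal width) and species label.
rows = [((1.4, 0.2), "setosa"), ((1.3, 0.2), "setosa"),
        ((4.7, 1.4), "versicolor"), ((4.5, 1.5), "versicolor"),
        ((6.0, 2.5), "virginica"), ((5.9, 2.1), "virginica")]

tree = build_tree(rows)
```

The resulting nested tuple can be read directly as a set of if/else rules, which is the interpretability advantage mentioned above.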
Decision Trees
Support Vector Machines
Can be used for classification and regression
Support Vector Classification (SVC):
⮚ Used for binary classification tasks
⮚ Logistic regression finds one possible separating hyperplane
⮚ SVM finds the optimal hyperplane, the one with the maximum margin
⮚ Handles linearly separable as well as non-linearly separable data by using kernel functions
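The difference between "a possible" and "the optimal" hyperplane can be made concrete by measuring the margin, the distance from the hyperplane to the closest training point. The sketch below does not train an SVM; it only compares two hand-picked separating lines on made-up 2-D data, with labels encoded as -1 and +1 (an assumed convention).

```python
import math


def margin(w, b, points):
    """Smallest signed distance from the labeled points to the line w.x + b = 0.

    The result is positive only if every point lies on its correct side.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
               for x, y in points)


# Hypothetical 2-D points with labels -1 and +1.
points = [((1.0, 1.0), -1), ((2.0, 1.0), -1),
          ((4.0, 3.0), +1), ((5.0, 3.0), +1)]

# Two hand-picked candidate lines, each given as (w, b).
candidate_a = ((1.0, 0.0), -2.5)  # vertical line x1 = 2.5
candidate_b = ((1.0, 1.0), -5.0)  # line x1 + x2 = 5

margin_a = margin(*candidate_a, points)
margin_b = margin(*candidate_b, points)
```

Both lines classify every point correctly, but `candidate_b` leaves more room on each side. For this toy data it reaches the largest margin possible, because the closest opposite-class points, (2, 1) and (4, 3), are 2√2 apart and no separating line can be farther than √2 from both.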
Random Forest Algorithm
• Random forest is a supervised learning algorithm.
• Random forest is an ensemble learning technique.
• Ensemble techniques: a combination of multiple models is known as an ensemble.
  • Bagging (the approach random forests build on)
  • Boosting
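Bagging trains each model on a bootstrap sample (rows drawn with replacement) and combines the models by majority vote. The sketch below illustrates only that idea, using one-threshold "stumps" on a single feature as the weak learners. It is not a full random forest, which would use complete decision trees and also sample a random subset of features at each split. The data, the two class names, the number of models, and the seed are hypothetical.

```python
import random
from collections import Counter


def fit_stump(rows):
    """Weak learner: the threshold and label order with the fewest training errors."""
    best = None
    for threshold in sorted({x for x, _ in rows}):
        for left_label, right_label in (("a", "b"), ("b", "a")):
            errors = sum((left_label if x <= threshold else right_label) != y
                         for x, y in rows)
            if best is None or errors < best[0]:
                best = (errors, threshold, left_label, right_label)
    return best[1:]


def stump_predict(stump, x):
    threshold, left_label, right_label = stump
    return left_label if x <= threshold else right_label


def bagging_fit(rows, n_models=15, seed=0):
    """Train one stump per bootstrap sample of the training rows."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        sample = [rng.choice(rows) for _ in rows]  # drawn with replacement
        models.append(fit_stump(sample))
    return models


def bagging_predict(models, x):
    """Combine the individual predictions by majority vote."""
    votes = Counter(stump_predict(m, x) for m in models)
    return votes.most_common(1)[0][0]


# Hypothetical 1-D dataset with classes "a" and "b".
rows = [(1.0, "a"), (2.0, "a"), (3.0, "a"), (4.0, "a"),
        (6.0, "b"), (7.0, "b"), (8.0, "b"), (9.0, "b")]

models = bagging_fit(rows)
```

Because each stump sees a slightly different sample, individual stumps disagree near the class boundary, and the vote averages out those differences.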
Random forest Algorithm Working
ML notes from janvi to study ml in easy way
