Introduction to Machine Learning
Machine learning is a growing technology that enables computers to learn automatically from
past data. It uses various algorithms to build mathematical models and make predictions from
historical data or information. Currently, it is used for tasks such as image recognition,
speech recognition, email filtering, Facebook auto-tagging, recommender systems, and many
more.
Machine learning is a subset of artificial intelligence that is mainly concerned with the
development of algorithms which allow a computer to learn from data and past experience
on its own. The term machine learning was first introduced by Arthur Samuel in 1959. We can
define it in a summarized way as:
Machine learning enables a machine to automatically learn from data, improve performance
from experiences, and predict things without being explicitly programmed.
With the help of sample historical data, known as training data, machine learning
algorithms build a mathematical model that helps in making predictions or decisions without
being explicitly programmed. Machine learning brings computer science and statistics together
to create predictive models. It constructs or uses algorithms that learn from historical data.
In general, the more relevant data we provide, the better the performance.
A machine has the ability to learn if it can improve its performance by gaining more data.
How does Machine Learning work
A machine learning system learns from historical data, builds prediction models, and,
whenever it receives new data, predicts the output for it. The accuracy of the predicted output
depends on the amount and quality of the data, as a large amount of representative data helps
to build a better model, which predicts the output more accurately.
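The learn-then-predict cycle described above can be sketched with a very small model. The example below is only an illustration: the data (hours studied against exam score) and the names fit_line and predict are hypothetical, and a least-squares straight line is just one of many possible models.

```python
def fit_line(xs, ys):
    """Learn a slope and intercept from historical (x, y) pairs by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y divided by the variance of x gives the slope.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Predict the output for a new input using the learned line."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical historical data: hours studied -> exam score.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 56.0, 58.0, 60.0]

model = fit_line(hours, scores)       # learning step on historical data
new_prediction = predict(model, 6.0)  # prediction for an input not seen before
print(model, new_prediction)
```

No rule linking hours to scores is written by hand here; the slope and intercept come entirely from the data.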
Suppose we have a complex problem for which we need to make predictions. Instead of writing
code for it by hand, we feed the data to generic algorithms, and with the help of these
algorithms the machine builds the logic from the data and predicts the output. Machine
learning has changed our way of thinking about such problems. The block diagram below explains
the working of a machine learning algorithm:
Features of Machine Learning:
o Machine learning uses data to detect various patterns in a given dataset.
o It can learn from past data and improve automatically.
o It is a data-driven technology.
o Machine learning is very similar to data mining, as it also deals with huge amounts of
data.
Need for Machine Learning
The need for machine learning is increasing day by day. The reason is that it is capable of
doing tasks that are too complex for a person to implement directly. As humans, we have
limitations: we cannot process huge amounts of data manually, so we need computer systems,
and this is where machine learning makes things easier for us.
We can train machine learning algorithms by providing them with a huge amount of data and
letting them explore the data, construct the models, and predict the required output
automatically. The performance of a machine learning algorithm depends on the amount of data,
and it can be measured by a cost function. With the help of machine learning, we can save both
time and money.
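As an illustration of how a cost function measures performance, the sketch below computes the mean squared error, one common choice of cost function, for two sets of predictions. The numbers and the names model_a and model_b are hypothetical.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between true and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


actual = [3.0, 5.0, 7.0]   # true outputs
model_a = [2.5, 5.0, 7.5]  # predictions of a hypothetical model A
model_b = [1.0, 8.0, 4.0]  # predictions of a hypothetical model B

cost_a = mean_squared_error(actual, model_a)
cost_b = mean_squared_error(actual, model_b)

# The model with the lower cost fits these examples better.
better = "A" if cost_a < cost_b else "B"
print(cost_a, cost_b, better)
```

A lower cost means the predictions lie closer to the true values, so training can be seen as a search for the model with the lowest cost.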
The importance of machine learning can be easily understood from its use cases. Currently,
machine learning is used in self-driving cars, cyber fraud detection, face recognition, friend
suggestions by Facebook, etc. Top companies such as Netflix and Amazon have built
machine learning models that use vast amounts of data to analyze user interests and
recommend products accordingly.
Following are some key points which show the importance of Machine Learning:
o Rapid increase in the production of data
o Solving complex problems, which are difficult for a human
o Decision making in various sectors, including finance
o Finding hidden patterns and extracting useful information from data
ML is programming computers using data (past experience) to optimize a performance criterion.
ML relies on:
1. Statistics: making inferences from sample data.
2. Numerical algorithms (linear algebra, optimization): optimize criteria, manipulate models.
3. Computer science: data structures and programs that solve an ML problem efficiently.
A model:
o is a compressed version of a database;
o extracts knowledge from it;
o does not have perfect performance but is a useful approximation to the data.
Classification of Machine Learning
At a broad level, machine learning can be classified into three types:
1. Supervised learning
2. Unsupervised learning
3. Reinforcement learning
1) Supervised Learning
Supervised learning is a type of machine learning in which we provide sample labeled
data to the machine learning system in order to train it, and on that basis it predicts the
output. The system creates a model using labeled data to understand the dataset and learn
about each kind of data. Once training and processing are done, we test the model by providing
sample data to check whether it predicts the correct output.
The goal of supervised learning is to map input data to output data. Supervised
learning is based on supervision, much as a student learns under the supervision of a
teacher. An example of supervised learning is spam filtering.
How Supervised Learning Works?
In supervised learning, models are trained using a labelled dataset, from which the model learns
about each type of data. Once the training process is completed, the model is tested on test
data (a portion of the labelled data held out from training), and it predicts the output for it.
The working of Supervised learning can be easily understood by the below example and
diagram:
Suppose we have a dataset of different types of shapes, which includes squares, rectangles,
triangles, and other polygons such as hexagons. The first step is to train the model on each
shape.
o If the given shape has four sides, and all the sides are equal, then it will be labelled as
a square.
o If the given shape has three sides, then it will be labelled as a triangle.
o If the given shape has six equal sides, then it will be labelled as a hexagon.
Now, after training, we test our model using the test set, and the task of the model is to identify
the shape.
The machine is already trained on all types of shapes, and when it finds a new shape, it classifies
the shape on the basis of its number of sides and predicts the output.
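The shape example can be sketched in a few lines. This is only an illustration under stated assumptions: each shape is encoded as a hypothetical pair (number of sides, 1 if all sides are equal else 0), and the model is a simple nearest-neighbour lookup over the labelled examples, which is one possible choice and not the only one.

```python
# Labelled training data: (number_of_sides, all_sides_equal) -> label.
training_data = [
    ((4, 1), "square"),
    ((4, 0), "rectangle"),
    ((3, 0), "triangle"),
    ((3, 1), "triangle"),
    ((6, 1), "hexagon"),
]


def classify(shape):
    """Return the label of the training example closest to the given shape."""
    def distance(example):
        features, _ = example
        return sum((a - b) ** 2 for a, b in zip(features, shape))

    _, label = min(training_data, key=distance)
    return label


# Test set: shapes the model is asked to identify after training.
test_shapes = [(4, 1), (3, 0), (6, 1), (4, 0)]
predictions = [classify(shape) for shape in test_shapes]
print(predictions)
```

The labels are never hard-coded as rules in classify; they come from the labelled examples, which is what makes this supervised learning.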
Steps Involved in Supervised Learning:
o First, determine the type of training dataset.
o Collect the labelled training data.
o Split the labelled dataset into a training set, a validation set, and a test set.
o Determine the input features of the training dataset, which should carry enough
information for the model to predict the output accurately.
o Choose a suitable algorithm for the model, such as a support vector machine or a decision
tree.
o Execute the algorithm on the training dataset. Sometimes we also need a validation set,
held out from the training data, to tune control parameters (hyperparameters).
o Evaluate the accuracy of the model on the test set. If the model predicts the correct
output for most test examples, the model is accurate.
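The steps above can be sketched end to end. The example below rests on several assumptions: the dataset is synthetic (one feature x, label 1 when x >= 50), the algorithm is a small k-nearest-neighbours classifier written by hand, and the split is deterministic so that the result is reproducible. In practice the data would normally be shuffled randomly before splitting.

```python
def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    neighbours = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = sum(label for _, label in neighbours)
    return 1 if votes * 2 > k else 0


def accuracy(train, dataset, k):
    """Fraction of examples in dataset that are predicted correctly."""
    correct = sum(1 for x, y in dataset if knn_predict(train, x, k) == y)
    return correct / len(dataset)


# Hypothetical labelled dataset: one feature x, label 1 when x >= 50.
data = [(x, 1 if x >= 50 else 0) for x in range(100)]

# Deterministic 60/20/20 split into training, validation, and test sets.
train = [p for i, p in enumerate(data) if i % 5 in (0, 1, 2)]
validation = [p for i, p in enumerate(data) if i % 5 == 3]
test = [p for i, p in enumerate(data) if i % 5 == 4]

# Use the validation set to choose the control parameter k.
candidates = [1, 3, 5]
best_k = max(candidates, key=lambda k: accuracy(train, validation, k))

# Report the final accuracy on the untouched test set.
test_accuracy = accuracy(train, test, best_k)
print(best_k, test_accuracy)
```

Keeping the test set out of both training and parameter selection is what makes the final accuracy an honest estimate of how the model behaves on new data.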
Types of supervised Machine learning Algorithms:
Supervised learning can be further divided into two types of problems:
1. Regression
Regression algorithms are used when there is a relationship between the input variables and a
continuous output variable. They are used to predict continuous values, for example in weather
forecasting and market trend analysis. Below are some popular regression algorithms which come
under supervised learning:
o Linear Regression
o Regression Trees
o Non-Linear Regression
o Bayesian Linear Regression
o Polynomial Regression
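As a small illustration of one entry in the list above, the sketch below fits a regression tree of depth one (often called a stump): it picks the single split that minimizes the squared error and predicts the mean output on each side. The data and function names are hypothetical, and the code assumes all x values are distinct.

```python
def mean(values):
    return sum(values) / len(values)


def fit_stump(xs, ys):
    """Find the split threshold that minimizes the total squared error."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        # Candidate threshold halfway between two neighbouring x values.
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        left_mean, right_mean = mean(left), mean(right)
        error = (sum((y - left_mean) ** 2 for y in left)
                 + sum((y - right_mean) ** 2 for y in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return threshold, left_mean, right_mean


def predict_stump(stump, x):
    """Predict the mean output of the side of the split that x falls on."""
    threshold, left_mean, right_mean = stump
    return left_mean if x < threshold else right_mean


# Hypothetical data with a jump in the continuous output between x = 3 and x = 4.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [10.0, 11.0, 9.0, 20.0, 21.0, 19.0]

stump = fit_stump(xs, ys)
print(stump, predict_stump(stump, 2.5), predict_stump(stump, 5.5))
```

A full regression tree repeats this split search recursively on each side; a single split is enough to show that the output is a continuous value and not a class label.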
2. Classification
Classification algorithms are used when the output variable is categorical, which means it
takes one of two or more classes, such as Yes-No, Male-Female, True-False, etc. A typical
application is spam filtering. Some popular classification algorithms are:
o Random Forest
o Decision Trees
o Logistic Regression
o Support Vector Machines
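To make the idea of a categorical output concrete, here is a minimal sketch of logistic regression trained by plain gradient descent on one feature. The spam-style data (number of suspicious words in an email), the learning rate, and the number of epochs are all assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, learning_rate=0.1, epochs=3000):
    """Fit weight w and bias b by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the average log loss with respect to w and b.
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def classify(model, x):
    """Return class 1 when the predicted probability is at least 0.5."""
    w, b = model
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


# Hypothetical data: suspicious-word count -> 1 (spam) or 0 (not spam).
word_counts = [0, 1, 2, 3, 7, 8, 9, 10]
is_spam = [0, 0, 0, 0, 1, 1, 1, 1]

model = train_logistic(word_counts, is_spam)
print(classify(model, 2), classify(model, 9))
```

The model outputs a probability, and the 0.5 threshold turns it into one of the two classes. That is the difference from regression, where the number itself is the answer.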
Advantages of Supervised learning:
o With the help of supervised learning, the model can predict the output on the basis of
prior experiences.
o In supervised learning, we can have an exact idea about the classes of objects.
o Supervised learning models help us to solve various real-world problems such as fraud
detection, spam filtering, etc.
Disadvantages of supervised learning:
o Supervised learning models can struggle with very complex tasks.
o Supervised learning may not predict the correct output if the test data differs
substantially from the training data.
o Training can require a lot of computation time.
o In supervised learning, we need enough knowledge about the classes of objects.
2) Unsupervised Learning
Unsupervised learning is a learning method in which a machine learns without any supervision.
The machine is trained on a set of data that has not been labeled, classified, or categorized,
and the algorithm needs to act on that data without any supervision. The goal of
unsupervised learning is to restructure the input data into new features or groups of objects
with similar patterns.
In unsupervised learning, we don't have a predetermined result. The machine tries to find useful
insights from a huge amount of data.
Why use Unsupervised Learning?
Below are some main reasons which describe the importance of Unsupervised Learning:
o Unsupervised learning is helpful for finding useful insights from the data.
o Unsupervised learning is similar to how a human learns to think from their own
experiences, which makes it closer to real AI.
o Unsupervised learning works on unlabeled and uncategorized data, which makes it
more important.
o In the real world, we do not always have input data with the corresponding output, so to
solve such cases we need unsupervised learning.
Working of Unsupervised Learning
Working of unsupervised learning can be understood by the below diagram:
Here, we have taken unlabeled input data, which means it is not categorized and the
corresponding outputs are not given. This unlabeled input data is fed to the machine
learning model in order to train it. First, the model interprets the raw data to find hidden
patterns, and then a suitable algorithm such as k-means clustering or hierarchical clustering
is applied.
Once the suitable algorithm is applied, it divides the data objects into groups
according to the similarities and differences between the objects.
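The grouping step can be sketched with a very small k-means implementation. The version below works on one-dimensional points with hypothetical data and hand-picked starting centroids. Real uses normally involve more dimensions and several random initializations.

```python
def k_means_1d(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and updating centroids."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assignment step: index of the closest centroid.
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Unlabeled input data: no outputs are given, only the values themselves.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = k_means_1d(points, centroids=[1.0, 12.0])
print(centroids, clusters)
```

No labels are used anywhere: the two groups emerge only from how close the points are to each other.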
Types of Unsupervised Learning Algorithm:
The unsupervised learning algorithm can be further categorized into two types of problems:
o Clustering: Clustering is a method of grouping objects into clusters such that objects
with the most similarities remain in one group and have few or no similarities with the
objects of another group. Cluster analysis finds the commonalities between the data objects
and categorizes them according to the presence or absence of those commonalities.
o Association: Association rule learning is an unsupervised learning method used for
finding relationships between variables in a large database. It determines the sets of
items that occur together in the dataset. Association rules make marketing strategies more
effective. For example, people who buy item X (say, bread) also tend to purchase item Y
(butter or jam). A typical example of association rule learning is Market Basket Analysis.
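The bread and butter example can be quantified with the two standard measures used for association rules: support (how often the items appear together) and confidence (how often Y appears given that X does). The transactions below are hypothetical.

```python
# Hypothetical market baskets, one set of items per transaction.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "milk"},
    {"milk", "eggs"},
    {"bread", "butter", "milk"},
]


def count(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for basket in transactions if itemset <= basket)


def support(itemset):
    """Fraction of all transactions that contain the itemset."""
    return count(itemset) / len(transactions)


def confidence(antecedent, consequent):
    """Of the baskets containing the antecedent, the fraction that also contain the consequent."""
    return count(antecedent | consequent) / count(antecedent)


rule_support = support({"bread", "butter"})
rule_confidence = confidence({"bread"}, {"butter"})
print(rule_support, rule_confidence)
```

Algorithms such as Apriori search for all rules whose support and confidence exceed chosen thresholds. This sketch only evaluates a single rule.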
Unsupervised Learning algorithms:
Below is the list of some popular unsupervised learning algorithms:
o K-means clustering
o KNN (k-nearest neighbors; as a classifier it needs labeled data, so it is usually counted
as a supervised method)
o Hierarchical clustering
o Anomaly detection
o Neural Networks
o Principal Component Analysis
o Independent Component Analysis
o Apriori algorithm
o Singular value decomposition
Advantages of Unsupervised Learning
o Unsupervised learning can be used for more complex tasks than supervised learning
because it does not require labeled input data.
o Unsupervised learning is preferable when labels are scarce, as it is easier to get
unlabeled data than labeled data.
Disadvantages of Unsupervised Learning
o Unsupervised learning is intrinsically more difficult than supervised learning, as there is
no corresponding output to learn from.
o The result of an unsupervised learning algorithm might be less accurate, as the input data
is not labeled and the algorithm does not know the exact output in advance.
The main differences between Supervised and Unsupervised learning are given below:
Supervised Learning | Unsupervised Learning
Supervised learning algorithms are trained using labeled data. | Unsupervised learning algorithms are trained using unlabeled data.
A supervised learning model takes direct feedback to check whether it is predicting the correct output. | An unsupervised learning model does not take any feedback.
A supervised learning model predicts the output. | An unsupervised learning model finds the hidden patterns in data.
In supervised learning, input data is provided to the model along with the output. | In unsupervised learning, only input data is provided to the model.
The goal of supervised learning is to train the model so that it can predict the output when it is given new data. | The goal of unsupervised learning is to find the hidden patterns and useful insights in an unknown dataset.
Supervised learning needs supervision to train the model. | Unsupervised learning does not need any supervision to train the model.
Supervised learning can be categorized into Classification and Regression problems. | Unsupervised learning can be categorized into Clustering and Association problems.
Supervised learning can be used for cases where we know the inputs as well as the corresponding outputs. | Unsupervised learning can be used for cases where we have only input data and no corresponding output data.
A supervised learning model generally produces more accurate results. | An unsupervised learning model may give less accurate results than supervised learning.
Supervised learning is not close to true artificial intelligence, as we must first train the model on each kind of data before it can predict the correct output. | Unsupervised learning is closer to true artificial intelligence, as it learns much as a child learns daily routine things from experience.
It includes various algorithms such as Linear Regression, Logistic Regression, Support Vector Machine, Multi-class Classification, Decision tree, Bayesian Logic, etc. | It includes various algorithms such as Clustering, KNN, and the Apriori algorithm.
3) Reinforcement Learning
Reinforcement learning is a feedback-based learning method in which a learning agent gets a
reward for each right action and a penalty for each wrong action. The agent learns
automatically from this feedback and improves its performance. In reinforcement learning, the
agent interacts with the environment and explores it. The goal of the agent is to collect the
most reward points, and in pursuing that goal it improves its performance.
A robotic dog that automatically learns how to move its limbs is an example of
reinforcement learning.
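The reward and penalty loop can be sketched with tabular Q-learning, one common reinforcement learning algorithm. Everything specific here is an assumption made for illustration: the environment is a hypothetical corridor of five positions, the agent gets +1 for reaching the last position and -0.1 for every other step, and the learning rate, discount factor, and exploration rate are arbitrary choices.

```python
import random

N_STATES = 5         # positions 0..4 in a hypothetical corridor
GOAL = N_STATES - 1  # reaching the last position ends an episode
ACTIONS = [0, 1]     # 0 = step left, 1 = step right


def step(state, action):
    """Environment: return (next_state, reward) for the chosen action."""
    if action == 1:
        next_state = min(state + 1, GOAL)
    else:
        next_state = max(state - 1, 0)
    # Reward for reaching the goal, small penalty for every other move.
    reward = 1.0 if next_state == GOAL else -0.1
    return next_state, reward


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward = step(state, action)
            # Move the estimate toward reward + discounted best future value.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# Greedy policy after training: the preferred action in each non-goal state.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(GOAL)]
print(policy)
```

The agent is never told that stepping right is correct. It discovers this because those actions lead to higher accumulated reward.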
