A guide to the types of machine learning algorithms (2024)

By Katrina Wakefield, Marketing, SAS UK

A guide to machine learning algorithms and their applications

The term ‘machine learning’ is often, incorrectly, used interchangeably with artificial intelligence (AI), but machine learning is actually a subfield of AI. Machine learning is also often referred to as predictive analytics or predictive modelling.

Coined by American computer scientist Arthur Samuel in 1959, the term ‘machine learning’ is defined as a “computer’s ability to learn without being explicitly programmed”.

At its most basic, machine learning uses programmed algorithms that receive and analyse input data to predict output values within an acceptable range. As new data is fed to these algorithms, they learn and optimise their operations to improve performance, developing ‘intelligence’ over time.

There are four types of machine learning algorithms: supervised, semi-supervised, unsupervised and reinforcement.


In supervised learning, the machine is taught by example. The operator provides the machine learning algorithm with a known dataset that includes desired inputs and outputs, and the algorithm must find a method to determine how to arrive at those outputs from the inputs. While the operator knows the correct answers to the problem, the algorithm identifies patterns in data, learns from observations and makes predictions. The algorithm makes predictions and is corrected by the operator – and this process continues until the algorithm achieves a high level of accuracy or performance.

Under the umbrella of supervised learning fall: Classification, Regression and Forecasting.

  1. Classification: In classification tasks, the machine learning program must draw a conclusion from observed values and determine to what category new observations belong. For example, when filtering emails as ‘spam’ or ‘not spam’, the program must look at existing observational data and filter the emails accordingly.
  2. Regression: In regression tasks, the machine learning program must estimate – and understand – the relationships among variables. Regression analysis focuses on one dependent variable and a series of other changing variables – making it particularly useful for prediction and forecasting.
  3. Forecasting: Forecasting is the process of making predictions about the future based on past and present data, and is commonly used to analyse trends.
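To make the "taught by example" idea concrete, here is a minimal sketch of a regression task: fitting a straight line to labelled examples by ordinary least squares, then predicting an unseen input. The data (`hours_studied`, `exam_score`) and the function name are hypothetical, chosen only for illustration.

```python
from statistics import mean


def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Slope = covariance(x, y) / variance(x); the shared 1/n factor cancels.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical labelled examples: each input is paired with a known output.
hours_studied = [1.0, 2.0, 3.0, 4.0, 5.0]
exam_score = [52.0, 54.0, 56.0, 58.0, 60.0]

slope, intercept = fit_simple_linear_regression(hours_studied, exam_score)

# Use the learned relationship to predict the output for an unseen input.
predicted = slope * 6.0 + intercept
print(slope, intercept, predicted)
```

The known outputs play the role of the operator's "correct answers": the fit is whatever line disagrees with them the least.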


Semi-supervised learning is similar to supervised learning, but uses both labelled and unlabelled data. Labelled data is essentially information that has meaningful tags so that the algorithm can understand the data, whilst unlabelled data lacks that information. By using this combination, machine learning algorithms can learn to label unlabelled data.
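One simple way to combine the two kinds of data is pseudo-labelling (a basic form of self-training). The sketch below assumes one-dimensional data and a nearest-centroid classifier, neither of which comes from the article; all names and values are hypothetical.

```python
from statistics import mean


def nearest_label(x, centroids):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def self_train(labelled, unlabelled, rounds=3):
    """labelled: list of (value, label) pairs; unlabelled: list of values."""
    pseudo = []  # pseudo-labelled points produced by the previous round
    for _ in range(rounds):
        points = labelled + pseudo
        # Fit: one centroid per class, using true labels plus pseudo-labels.
        centroids = {
            label: mean(v for v, l in points if l == label)
            for label in {l for _, l in points}
        }
        # Label: assign every unlabelled point to its nearest centroid.
        pseudo = [(x, nearest_label(x, centroids)) for x in unlabelled]
    return centroids, pseudo


labelled = [(1.0, "low"), (9.0, "high")]
unlabelled = [2.0, 3.0, 7.0, 8.0]
centroids, pseudo = self_train(labelled, unlabelled)
print(centroids, pseudo)
```

Only two points carry real labels here; the unlabelled points still shift the class centroids once they have been given pseudo-labels.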


In unsupervised learning, the machine learning algorithm studies data to identify patterns. There is no answer key or human operator to provide instruction. Instead, the machine determines the correlations and relationships by analysing available data. In an unsupervised learning process, the machine learning algorithm is left to interpret large data sets and address that data accordingly. The algorithm tries to organise that data in some way to describe its structure. This might mean grouping the data into clusters or arranging it in a way that looks more organised.

As it assesses more data, its ability to make decisions on that data gradually improves and becomes more refined.

Under the umbrella of unsupervised learning fall:

  1. Clustering: Clustering involves grouping sets of similar data (based on defined criteria). It’s useful for segmenting data into several groups and performing analysis on each data set to find patterns.
  2. Dimension reduction: Dimension reduction reduces the number of variables being considered to find the exact information required.
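Clustering can be sketched with k-means, one common approach that alternates between assigning points to their nearest centroid and moving each centroid to the mean of its group. The points and starting centroids below are hypothetical; real implementations usually choose the starting centroids randomly or with a seeding heuristic.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Alternate assignment and update steps until the centroids stop moving."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for old, cluster in zip(centroids, clusters):
            if cluster:
                new_centroids.append(
                    tuple(sum(axis) / len(cluster) for axis in zip(*cluster)))
            else:
                new_centroids.append(old)  # leave an empty cluster's centroid alone
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, clusters


points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(centroids)
print(clusters)
```

No labels are supplied at any point: the two groups emerge purely from the distances between the points.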


Reinforcement learning focuses on regimented learning processes, where a machine learning algorithm is provided with a set of actions, parameters and end values. Once the rules are defined, the machine learning algorithm explores different options and possibilities, monitoring and evaluating each result to determine which one is optimal. Reinforcement learning teaches the machine through trial and error. It learns from past experiences and begins to adapt its approach in response to the situation to achieve the best possible result.
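The trial-and-error loop can be illustrated with an epsilon-greedy strategy on a multi-armed bandit, which is about the simplest reinforcement learning setting. This is a sketch under assumptions: the payoffs are hypothetical and deterministic, and each action is tried once up front so that every estimate is defined.

```python
import random


def epsilon_greedy(reward_fn, n_actions, steps=200, epsilon=0.1, seed=0):
    """Estimate each action's average reward by trial and error."""
    rng = random.Random(seed)
    totals = [0.0] * n_actions
    counts = [0] * n_actions
    for step in range(steps):
        if step < n_actions:
            action = step  # try every action once before comparing them
        elif rng.random() < epsilon:
            action = rng.randrange(n_actions)  # explore: pick a random action
        else:
            # Exploit: pick the action with the best average reward so far.
            action = max(range(n_actions), key=lambda a: totals[a] / counts[a])
        reward = reward_fn(action)
        totals[action] += reward
        counts[action] += 1
    estimates = [totals[a] / counts[a] for a in range(n_actions)]
    return estimates, counts


PAYOFFS = [1.0, 5.0, 2.0]  # hypothetical reward for each action
estimates, counts = epsilon_greedy(lambda a: PAYOFFS[a], n_actions=3)
best_action = max(range(3), key=lambda a: estimates[a])
print(estimates, counts, best_action)
```

The `epsilon` parameter controls the balance between exploring other options and exploiting the best one found so far.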

What machine learning algorithms can you use?

Choosing the right machine learning algorithm depends on several factors, including, but not limited to: data size, quality and diversity, as well as what answers businesses want to derive from that data. Additional considerations include accuracy, training time, parameters, data points and much more. Therefore, choosing the right algorithm comes down to a combination of business need, specification, experimentation and time available. Even the most experienced data scientists cannot tell you which algorithm will perform the best before experimenting with others. We have, however, compiled a machine learning algorithm ‘cheat sheet’ which will help you find the most appropriate one for your specific challenges.

What are the most common and popular machine learning algorithms?

  • Naïve Bayes Classifier Algorithm (Supervised Learning - Classification)
    The Naïve Bayes classifier is based on Bayes’ theorem and treats every feature as independent of every other feature, given the class. It allows us to predict a class/category, based on a given set of features, using probability.

    Despite its simplicity, the classifier does surprisingly well and is often used because it can hold its own against more sophisticated classification methods.

  • K Means Clustering Algorithm (Unsupervised Learning - Clustering)
    The K Means Clustering algorithm is a type of unsupervised learning, which is used to categorise unlabelled data, i.e. data without defined categories or groups. The algorithm works by finding groups within the data, with the number of groups represented by the variable K. It then works iteratively to assign each data point to one of K groups based on the features provided.
  • Support Vector Machine Algorithm (Supervised Learning - Classification)
    Support Vector Machine algorithms are supervised learning models that analyse data for classification and regression analysis. For classification, they essentially sort data into categories: given a set of training examples, each marked as belonging to one or the other of two categories, the algorithm builds a model that assigns new values to one category or the other.
  • Linear Regression (Supervised Learning - Regression)
    Linear regression is the most basic type of regression. Simple linear regression allows us to understand the relationships between two continuous variables.
  • Logistic Regression (Supervised learning – Classification)
    Logistic regression focuses on estimating the probability of an event occurring based on the previous data provided. It is used to model a binary dependent variable, that is, one where only two values, 0 and 1, represent the outcomes.
  • Artificial Neural Networks (Supervised, Unsupervised or Reinforcement Learning)
    An artificial neural network (ANN) comprises ‘units’ arranged in a series of layers, each of which connects to layers on either side. ANNs are inspired by biological systems, such as the brain, and how they process information. ANNs are essentially a large number of interconnected processing elements, working in unison to solve specific problems.

    ANNs also learn by example and through experience, and they are extremely useful for modelling non-linear relationships in high-dimensional data or where the relationship amongst the input variables is difficult to understand.

  • Decision Trees (Supervised Learning – Classification/Regression)
    A decision tree is a flow-chart-like tree structure that uses a branching method to illustrate every possible outcome of a decision. Each node within the tree represents a test on a specific variable – and each branch is the outcome of that test.
  • Random Forests (Supervised Learning – Classification/Regression)
    Random forests, or ‘random decision forests’, are an ensemble learning method that combines multiple decision trees to generate better results for classification, regression and other tasks. Each individual tree is a relatively weak learner, but when combined with others, can produce excellent results. Each tree in the forest is a ‘decision tree’ (a tree-like graph or model of decisions): an input is entered at the top and travels down the tree, with data being segmented into smaller and smaller sets, based on specific variables.
  • Nearest Neighbours (Supervised Learning)
    The K-Nearest-Neighbour algorithm estimates how likely a data point is to be a member of one group or another. It essentially looks at the data points around a single data point to determine what group it is actually in. For example, if one point is on a grid and the algorithm is trying to determine what group that data point is in (Group A or Group B, for example) it would look at the data points near it to see what group the majority of the points are in.
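The Group A / Group B example for nearest neighbours can be sketched as a majority vote among the k closest training points. The coordinates and the choice of k = 3 below are hypothetical.

```python
import math
from collections import Counter


def knn_predict(training, query, k=3):
    """training: list of ((x, y), label) pairs. Return the majority label of the k nearest points."""
    nearest = sorted(training, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


training = [
    ((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"),
    ((6, 6), "B"), ((7, 6), "B"), ((6, 7), "B"),
]

group_near_a = knn_predict(training, (2, 2))
group_near_b = knn_predict(training, (6, 5))
print(group_near_a, group_near_b)
```

There is no training step beyond storing the labelled points; all the work happens when a new point is classified.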

Clearly, there are a lot of things to consider when it comes to choosing the right machine learning algorithms for your business’ analytics. However, you don’t need to be a data scientist or expert statistician to use these models for your business. At SAS, our products and solutions utilise a comprehensive selection of machine learning algorithms, helping you to develop a process that can continuously deliver value from your data.



What are the 4 types of machine learning algorithms?

There are four types of machine learning algorithms: supervised, semi-supervised, unsupervised and reinforcement.

What algorithm does ChatGPT use?

The GPT in ChatGPT is mostly two related algorithms: GPT-3.5 Turbo and GPT-4, though the latter is only available in ChatGPT for ChatGPT Plus subscribers. The GPT bit stands for Generative Pre-trained Transformer, and the number is just the version of the algorithm.

How do I find the best machine learning algorithm?

Selecting the optimal machine learning algorithm involves considering data type, complexity, and goals. Start with simple models for small datasets, like linear regression. For structured data, decision trees or random forests work well.

What are the 5 types of machine learning?

Machine learning algorithms fall into five broad categories: supervised learning, unsupervised learning, semi-supervised learning, self-supervised and reinforcement learning.

What are 5 examples of algorithms?

Examples of Algorithms in Everyday Life
  • Tying Your Shoes. Any step-by-step process that is completed the same way every time is an algorithm. ...
  • Following a Recipe. ...
  • Classifying Objects. ...
  • Bedtime Routines. ...
  • Finding a Library Book in the Library. ...
  • Driving to or from Somewhere. ...
  • Deciding What to Eat.

Is GPT really AI?

Though it's accurate to describe the GPT models as artificial intelligence (AI), this is a broad description. More specifically, the GPT models are neural network-based language prediction models built on the Transformer architecture.

Do we have real AI yet?

At some point in the future there might be, but despite all of the hype, it's not imminent, and it certainly doesn't exist yet. We don't even have any good evidence that it's possible to create true AI, though, equally, there's no reason to believe that it isn't.

What does GPT stand for?

GPT, standing for Generative Pre-trained Transformer, is a powerful language model tool used to decipher and generate human-like text. Let's explore the nuts and bolts of how GPT is revolutionizing language processing.

Which ML algorithm is faster?

Figure 5 displays the fastest ML classification algorithms of this investigation, where we can see that "Naïve Bayesian" and "Decision Tree" are the quickest in training time, making them the most suitable selection for a real-time task.

Which algorithm is best for prediction?

In this article, we will explore some of the top machine learning algorithms used to predict future probabilities, and how they work.
  1. Logistic regression. ...
  2. Naive Bayes. ...
  3. K-nearest neighbors. ...
  4. Decision trees. ...
  5. Random forests. ...
  6. Neural networks. ...

Which algorithm is most widely used in machine learning?

The decision tree is one of the most popular machine learning algorithms in use today; it is a supervised learning algorithm used for classification problems. It works well with both categorical and continuous dependent variables.

How do I choose the right model?

To choose the right model, you need to define the problem, consider the data, evaluate different models, consider model complexity, evaluate performance metrics, use cross-validation, consider regularization techniques, consider ensemble methods, and consider interpretability.

What is best first search in machine learning?

The best-first search algorithm is used to find a path from a starting node to a goal node in a graph. It prioritizes nodes using a heuristic estimate of how promising each one is (typically its estimated distance to the goal) and expands them in that order.

Which is the best first search algorithm in machine learning?

Best first search (BFS) is a search algorithm that follows a particular rule, using a priority queue ordered by a heuristic to decide which node to expand next. It helps a computer find a suitable path through a maze of possibilities. Suppose you get stuck in a big maze and do not know how and where to exit quickly.
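A greedy best-first search can be sketched with a priority queue keyed on a heuristic. The graph and heuristic values below are hypothetical. Note that this greedy variant returns a path to the goal, but not necessarily the shortest one.

```python
import heapq


def best_first_search(graph, heuristic, start, goal):
    """Always expand the frontier node whose heuristic value is lowest."""
    frontier = [(heuristic[start], start, [start])]  # (priority, node, path so far)
    visited = set()
    while frontier:
        _, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for neighbour in graph[node]:
            if neighbour not in visited:
                heapq.heappush(
                    frontier, (heuristic[neighbour], neighbour, path + [neighbour]))
    return None  # the goal cannot be reached from start


# Hypothetical graph (adjacency lists) and heuristic (estimated distance to "G").
graph = {"S": ["A", "B"], "A": ["G"], "B": ["C"], "C": ["G"], "G": []}
heuristic = {"S": 3, "A": 1, "B": 2, "C": 1, "G": 0}

path = best_first_search(graph, heuristic, "S", "G")
print(path)
```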

What are the 3 main categories of AI algorithms?

There are three major categories of AI algorithms: supervised learning, unsupervised learning, and reinforcement learning. The key differences between these algorithms are in how they're trained, and how they function.

What are 3 types of machine learning?

Machine learning involves showing a large volume of data to a machine to learn, make predictions, find patterns, or classify data. The three machine learning types are supervised, unsupervised, and reinforcement learning.

What is the C4.5 algorithm in machine learning?

C4.5 builds decision trees from a set of training data in the same way as ID3, using the concept of information entropy. The training data is a set.
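Information entropy, and the information gain that ID3 uses to compare candidate splits, can be sketched as follows. The labels and splits are hypothetical. C4.5 itself ranks splits by gain ratio, a normalised form of information gain, which this sketch does not implement.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return sum(-(n / total) * math.log2(n / total)
               for n in Counter(labels).values())


def information_gain(labels, groups):
    """Entropy of the parent minus the size-weighted entropy of its child groups."""
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g) for g in groups)
    return entropy(labels) - remainder


labels = ["yes", "yes", "no", "no"]
perfect_split = [["yes", "yes"], ["no", "no"]]  # each child is pure
useless_split = [["yes", "no"], ["yes", "no"]]  # children are as mixed as the parent

gain_perfect = information_gain(labels, perfect_split)
gain_useless = information_gain(labels, useless_split)
print(entropy(labels), gain_perfect, gain_useless)
```

A tree builder would evaluate every candidate test at a node this way and keep the one with the highest score.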

What are the 3 broad categories for machine learning algorithms?

These can be divided into three main categories: supervised learning, unsupervised learning and reinforcement learning. Romain Huet, Senior Data Scientist at TMC, explains these different categories and when they can be used.
