Machine Learning Systems: An Overview

Machine learning (ML) is a thriving subfield of artificial intelligence (AI) that empowers computers to learn from data without explicit programming. It focuses on algorithms that can "learn" the patterns within training data and subsequently make accurate inferences about new data. Over the past decade, machine learning has become the dominant approach, arguably the most important one, to most areas of AI. With its growing ubiquity, everyone in business is likely to encounter machine learning and will need some working knowledge of the field. A 2020 Deloitte survey found that 67% of companies were using machine learning, and 97% were using or planning to use it within the next year. From manufacturing to retail and banking to bakeries, even legacy companies are using machine learning to unlock new value or boost efficiency.

Machine Learning vs. Artificial Intelligence

Though “machine learning” and “artificial intelligence” are often used interchangeably, they are not quite synonymous. AI is a broader concept, defined as the capability of a machine to imitate intelligent human behavior. The goal of AI is to create computer models that exhibit “intelligent behaviors” like humans, according to Boris Katz, a principal research scientist and head of the InfoLab Group at CSAIL. This means machines that can recognize a visual scene, understand a text written in natural language, or perform an action in the physical world. Machine learning is one way to use AI.

The most elementary AI systems are a series of if-then-else statements, with rules and logic programmed explicitly by a human expert. Unlike in such expert systems, the logic by which a machine learning model operates isn't explicitly programmed; it's learned through experience. As the tasks an AI system must perform become more complex, rules-based models grow increasingly brittle: it's often impossible to explicitly define every pattern and variable a model must consider.

The Mechanics of Machine Learning

Machine learning works through mathematical logic. Data points in machine learning are usually represented in vector form, in which each element (or dimension) of a data point’s vector embedding corresponds to its numerical value for a specific feature. For data modalities that are inherently numerical, such as financial data or geospatial coordinates, this is relatively straightforward.

The (often manual) process of choosing which aspects of data to use in machine learning algorithms is called feature selection. Feature extraction techniques refine data down to only its most relevant, meaningful dimensions. Both are subsets of feature engineering, the broader discipline of preprocessing raw data for use in machine learning.
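As a minimal sketch of this vectorization step, here is how a raw record might be reduced to a fixed-order feature vector after feature selection. The field names and values are purely illustrative:

```python
# Reduce a raw record to a fixed-order numerical feature vector.
# Only the selected features survive; everything else is dropped.
FEATURES = ["sqft", "age_years", "bedrooms"]  # the result of feature selection

def to_vector(record):
    return [float(record[name]) for name in FEATURES]

house = {"sqft": 1450, "age_years": 32, "bedrooms": 3, "listing_id": "A-101"}
vec = to_vector(house)  # → [1450.0, 32.0, 3.0]
```

Note how non-predictive fields like the listing ID are simply discarded, and each remaining feature occupies a fixed dimension of the vector.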


For a practical example, consider a simple linear regression algorithm for predicting home sale prices based on a weighted combination of three variables: price = A × (square footage) + B × (age of house) + C × (number of bedrooms). Here, A, B and C are the model parameters: adjusting them will adjust how heavily the model weighs each variable. The goal of machine learning is to find the optimal values for such model parameters: in other words, the parameter values that result in the overall function outputting the most accurate results.
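To make this concrete, here is a minimal sketch of fitting such a three-variable linear model by gradient descent. The listings, learning rate and feature scaling are invented for illustration, not drawn from any real dataset:

```python
# Fit price = A*sqft + B*age + C*beds + bias by minimizing mean squared error.
def predict(params, x):
    a, b, c, bias = params
    sqft, age, beds = x
    return a * sqft + b * age + c * beds + bias

def train(data, lr=0.01, epochs=20000):
    params = [0.0, 0.0, 0.0, 0.0]
    n = len(data)
    for _ in range(epochs):
        grads = [0.0, 0.0, 0.0, 0.0]
        for x, y in data:
            err = predict(params, x) - y              # signed prediction error
            for i, xi in enumerate(list(x) + [1.0]):  # the 1.0 pairs with the bias
                grads[i] += 2 * err * xi / n          # d(MSE)/d(param i)
        params = [p - lr * g for p, g in zip(params, grads)]
    return params

# Hypothetical listings: (sqft in thousands, age in decades, bedrooms) -> price in $1000s
data = [((1.0, 3.0, 2.0), 200.0), ((1.5, 1.0, 3.0), 320.0),
        ((2.0, 2.0, 3.0), 390.0), ((0.8, 4.0, 2.0), 150.0)]
params = train(data)
```

Each epoch nudges A, B, C and the bias in the direction that reduces the average squared prediction error, which is exactly the parameter search the paragraph above describes.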

Types of Machine Learning

There are three main subcategories of machine learning: supervised learning, unsupervised learning, and reinforcement learning. The end-to-end training process for a given model can, and often does, involve hybrid approaches that leverage more than one of these learning paradigms. For instance, unsupervised learning is often used to preprocess data for use in supervised or reinforcement learning.

Supervised Learning

Supervised machine learning models are trained with labeled data sets, which allow the models to learn and grow more accurate over time. Supervised learning trains a model to predict the “correct” output for a given input. It applies to tasks that require some degree of accuracy relative to some external “ground truth,” such as classification or regression.

Essential to supervised learning is the use of a loss function that measures the divergence (“loss”) between the model’s output and the ground truth across a batch of training inputs. Because this process traditionally requires a human in the loop to provide ground truth in the form of data annotations, it’s called “supervised” learning. Accordingly, the use of labeled data was historically considered the defining characteristic of supervised learning.
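One common such loss function for regression is mean squared error; a minimal sketch (with illustrative numbers) looks like:

```python
# Mean squared error: average squared divergence between model outputs
# and ground-truth targets across a batch.
def mse(predictions, targets):
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

mse([2.5, 0.0, 2.0], [3.0, -0.5, 2.0])  # → (0.25 + 0.25 + 0.0) / 3 ≈ 0.1667
```

Training then amounts to adjusting parameters to drive this number down.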

Regression Models

Regression models predict continuous values, such as price, duration, temperature or size. Examples of traditional regression algorithms include linear regression, polynomial regression and state space models. Linear regression, one of the simplest, fits a straight line that captures the relationship between input and output variables.


Classification Models

Classification models predict discrete values, such as the category (or class) a data point belongs to, a binary decision or a specific action to be taken. Examples of traditional classification algorithms include support vector machines (SVMs), Naïve Bayes and logistic regression; many supervised ML algorithms can be adapted to either task. Logistic regression is used when the output is a binary, "yes or no" answer, such as pass/fail or spam/not spam. Naïve Bayes classifies data points based on probability and works especially well for text classification and spam detection.
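As a sketch of the logistic regression decision rule described above, the model squashes a weighted sum through the sigmoid function and thresholds the result. The weights, features and labels here are invented, not fitted to real data:

```python
import math

def sigmoid(z):
    # Squashes any real number into (0, 1), interpreted as a probability.
    return 1.0 / (1.0 + math.exp(-z))

def classify(weights, bias, features, threshold=0.5):
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return "spam" if sigmoid(z) >= threshold else "not spam"

# Hypothetical features: (count of suspicious words, sender reputation score)
classify([2.0, -1.0], -0.5, [1.0, 0.2])  # z = 1.3, sigmoid(z) ≈ 0.79 → "spam"
```

In practice the weights and bias would themselves be learned from labeled examples, just as in the regression case.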

Unsupervised Learning

In unsupervised machine learning, a program looks for patterns in unlabeled data, training a model to discern intrinsic patterns, dependencies and correlations. Unlike supervised learning, unsupervised learning tasks don’t involve any external ground truth against which a model’s outputs can be compared. They’re most useful in scenarios where such patterns aren’t necessarily apparent to human observers.

Clustering Algorithms

Clustering algorithms partition unlabeled data points into “clusters,” or groupings, based on their proximity or similarity to one another. They’re typically used for tasks like market segmentation or fraud detection. Prominent clustering algorithms include K-means clustering, Gaussian mixture models (GMMs) and density-based methods such as DBSCAN.
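A bare-bones sketch of the K-means idea on one-dimensional data follows; the points and the naive initialization are illustrative (real implementations use smarter seeding such as k-means++ and handle higher dimensions):

```python
# K-means alternates between assigning points to the nearest center and
# recomputing each center as the mean of its assigned points.
def kmeans_1d(points, k=2, iters=20):
    centers = points[:k]  # naive initialization: first k points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:  # assignment step
            nearest = min(range(k), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # update step (keep the old center if a cluster goes empty)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

kmeans_1d([1.0, 1.2, 0.9, 10.0, 10.5, 9.8])  # two centers, near 1.03 and 10.1
```

With well-separated data like this, the assignments stabilize after a couple of iterations.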

Association Algorithms

Association algorithms discern correlations, such as between a particular action and certain conditions. For instance, e-commerce businesses such as Amazon use unsupervised association models to power recommendation engines.

Dimensionality Reduction Algorithms

Dimensionality reduction algorithms reduce the complexity of data points by representing them with a smaller number of features (that is, in fewer dimensions) while preserving their meaningful characteristics. They’re often used for preprocessing data, as well as for tasks such as data compression or data visualization. One popular method of dimensionality reduction is principal component analysis (PCA).


Reinforcement Learning

Reinforcement learning (RL) trains a model through trial and error: by establishing a reward system, it teaches the model to evaluate its environment and take the action expected to garner the greatest reward. RL is used prominently in robotics, video games, reasoning models and other use cases in which the space of possible solutions and approaches is particularly large, open-ended or difficult to define. Rather than the independent pairs of input-output data used in supervised learning, RL operates on interdependent state-action-reward tuples.

Key Elements of Reinforcement Learning

  • State Space: The state space contains all available information relevant to decisions that the model might make. The state typically changes with each action that the model takes.
  • Action Space: The action space contains all the decisions that the model is permitted to make at a moment. In a board game, for instance, the action space comprises all legal moves available at a given time. In text generation, the action space comprises the entire “vocabulary” of tokens available to an LLM.
  • Reward Signal: The reward signal is the feedback (positive or negative, typically expressed as a scalar value) provided to the agent as a result of each action. The value of the reward signal could be determined by explicit rules, by a reward function, or by a separately trained reward model.
  • Policy: A policy is the “thought process” that drives an RL agent’s behavior. In policy-based RL methods like proximal policy optimization (PPO), the model learns a policy directly. In value-based methods like Q-learning, the agent learns a value function that computes a score for how “good” each state is, then chooses actions that lead to higher-value states.
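These elements can be seen together in a toy Q-learning sketch: a five-state corridor where the agent steps left or right and is rewarded only for reaching the right end. The environment, reward and hyperparameters are invented for illustration:

```python
import random

random.seed(0)
N_STATES, ACTIONS = 5, (-1, +1)  # state space: 0..4; action space: step left/right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.2  # learning rate, discount, exploration rate

for _ in range(500):  # episodes
    s = 0
    while s != N_STATES - 1:
        # epsilon-greedy: explore sometimes, otherwise act on current Q-values
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2 = min(max(s + a, 0), N_STATES - 1)   # state transition
        r = 1.0 if s2 == N_STATES - 1 else 0.0  # reward signal
        best_next = max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])  # TD update
        s = s2

# The learned (value-based) policy: the highest-valued action in each state
policy = [max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)]
```

After training, the policy steps right in every state, since Q-values learned through trial and error favor actions that lead toward the reward.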

Deep Learning

Deep learning is a subset of machine learning that employs artificial neural networks with many layers (hence “deep”) rather than the explicitly designed algorithms of traditional machine learning. Inspired by the structure and function of the human brain, these multi-layered networks excel at image and speech recognition, natural language processing and many other tasks by automatically extracting features from raw data through multiple layers of abstraction. Deep learning can handle datasets on a massive scale, with high-dimensional inputs.

Loosely inspired by the human brain, neural networks comprise interconnected layers of “neurons” (or nodes), each of which performs its own mathematical operation (called an “activation function”). The output of each node’s activation function serves as input to each of the nodes of the following layer, and so on until the final layer, where the network’s final output is computed. Each connection between two neurons is assigned a unique weight: a multiplier that increases or decreases one neuron’s contribution to a neuron in the following layer. The backpropagation algorithm enables the computation of how each individual weight contributes to the overall loss, allowing even millions or billions of model weights to be individually optimized through gradient descent. That distributed structure affords deep learning models their incredible power and versatility.
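The chain-rule step that backpropagation repeats layer by layer can be illustrated on a single sigmoid neuron; all values here are arbitrary illustrations:

```python
import math

w, b = 0.5, -0.2      # one weight and one bias
x, target = 1.5, 1.0  # one training example

# Forward pass
z = w * x + b                   # weighted input
y = 1.0 / (1.0 + math.exp(-z))  # sigmoid activation
loss = (y - target) ** 2        # squared-error loss

# Backward pass: chain rule, outermost derivative first
dloss_dy = 2 * (y - target)
dy_dz = y * (1 - y)                # derivative of the sigmoid
dz_dw = x
grad_w = dloss_dy * dy_dz * dz_dw  # dloss/dw

w_new = w - 0.1 * grad_w           # one gradient-descent step
```

In a deep network the same bookkeeping is carried out for every weight in every layer, with gradients flowing backward from the loss.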

Convolutional neural networks (CNNs) add convolutional layers to neural networks. In mathematics, a convolution is an operation where one function modifies (or convolves) the shape of another.

Recurrent neural networks (RNNs) are designed to work on sequential data. Whereas conventional feedforward neural networks map a single input to a single output, RNNs map a sequence of inputs to an output by operating in a recurrent loop in which the output for a given step in the input sequence serves as input to the computation for the following step.

Transformer models, first introduced in 2017, are largely responsible for the advent of LLMs and other pillars of generative AI, achieving state-of-the-art results across most subdomains of machine learning. Like RNNs, transformers are ostensibly designed for sequential data, but clever workarounds have enabled most data modalities to be processed by them.

Mamba models are a relatively new neural network architecture, first introduced in 2023, based on a unique variation of state space models (SSMs). Like transformers, Mamba models provide an innovative means of selectively prioritizing the most relevant information at a given moment.
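Returning to the convolution operation behind CNNs, it can be sketched in one dimension. (As in most deep learning libraries, this is technically cross-correlation: the kernel is not flipped.)

```python
# Slide a kernel along a signal, taking a dot product at each position ("valid" mode).
def conv1d(signal, kernel):
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

# An edge-detecting kernel: responds to differences between neighboring values.
conv1d([1, 2, 3, 4], [1, 0, -1])  # → [1*1 + 3*(-1), 2*1 + 4*(-1)] = [-2, -2]
```

A convolutional layer learns the kernel values themselves, and in two dimensions the same sliding dot product picks out visual features like edges and textures.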

Deep learning requires a great deal of computing power, which raises concerns about its economic and environmental sustainability.

Applications of Machine Learning

Machine learning is behind chatbots and predictive text, language translation apps, the shows Netflix suggests to you, and how your social media feeds are presented. It powers autonomous vehicles and machines that can diagnose medical conditions based on images.

More specifically, machine learning has use in:

  • Computer Vision: The subdomain of AI concerned with image data, video data and other data modalities that require a model or machine to “see,” with applications from healthcare diagnostics to facial recognition to self-driving cars.
  • Natural Language Processing (NLP): Spans a diverse array of tasks concerning text, speech and other language data. Notable subdomains of NLP include chatbots, speech recognition, language translation, sentiment analysis, text generation, summarization and AI agents. Natural language processing enables familiar technology like chatbots and digital assistants like Siri or Alexa.
  • Time Series Models: Applied to anomaly detection, market analysis and related pattern recognition or prediction tasks.
  • Facial recognition: Machine learning can analyze images for different kinds of information, like learning to identify people and tell them apart, though facial recognition algorithms remain controversial. Business uses for this vary.
  • Chatbots: Many companies are deploying online chatbots, with which customers or clients don’t speak to humans but instead interact with a machine.
  • Self-driving cars: Much of the technology behind self-driving cars is based on machine learning, deep learning in particular.
  • Medical imaging and diagnostics: Machine learning programs can be trained to examine medical images or other information and look for certain markers of illness, like a tool that can predict cancer risk based on a mammogram.

Challenges and Considerations

While machine learning is fueling technology that can help workers or open new possibilities for businesses, there are several things business leaders should know about machine learning and its limits.

Explainability

One area of concern is what some experts call explainability: the ability to be clear about what machine learning models are doing and how they make decisions. Understanding why a model does what it does is a genuinely difficult question, and one practitioners must keep asking. This is especially important because systems can be fooled and undermined, or simply fail on certain tasks, even ones humans perform easily.

Bias and Unintended Outcomes

Machines are trained by humans, and human biases can be incorporated into algorithms: if biased information, or data that reflects existing inequities, is fed to a machine learning program, the program will learn to replicate those biases and perpetuate forms of discrimination.

Putting Machine Learning to Work

The way machine learning works at one company probably won’t translate directly to another. It’s also best to avoid treating machine learning as a solution in search of a problem.

Machine Learning Pipeline

Careful curation and preprocessing of training data, as well as appropriate model selection, are crucial steps in the MLOps pipeline. Following deployment, models must be monitored for model drift, inference efficiency issues and other adverse developments.

Benefits of Machine Learning

Machine learning offers a wide range of benefits across various industries and applications.

  • Personalization and recommendations: By analyzing user preferences and behavior, machine learning powers personalized experiences.
  • Data analysis and pattern recognition: Machine learning excels at analyzing large datasets to identify patterns and trends that may not be apparent through traditional methods.
  • Predictive analytics: Machine learning algorithms can make predictions based on historical data, anticipating future trends, customer behavior and market dynamics.
  • Medical diagnosis and healthcare: Machine learning helps predict patient outcomes and personalize treatment plans.
  • Optimized resource allocation: Machine learning predicts demand, manages inventory and streamlines supply chain processes.
  • Improves data mining: Machine learning is excellent at data mining, which involves extracting useful information from large datasets. It takes this a step further by continuously improving its abilities over time, leading to more accurate insights and improved decision-making.
  • Enhances customer experiences: Adaptive interfaces, targeted content, chatbots, and voice-powered virtual assistants are all examples of how machine learning helps improve customer experiences. By analyzing customer behavior and preferences, machine learning personalizes interactions, provides timely and relevant information, and streamlines customer service.
  • Reduces risk: By continuously learning from new data, machine learning enhances its ability to detect and prevent fraud, providing robust protection against evolving threats. As fraud tactics evolve, machine learning adapts by detecting new patterns and preventing attempts before they succeed.
  • Anticipates customer behavior: Machine learning mines customer-related data to identify patterns and behaviors, helping sales teams optimize product recommendations and provide the best customer experiences possible. By continuously learning from new interactions, machine learning predicts future customer needs and preferences to support proactive and personalized engagement.
  • Reduces costs: Machine learning reduces costs by automating repetitive and time-consuming processes, allowing employees to focus on more strategic and higher-value tasks. Additionally, machine learning algorithms optimize resource allocation and minimize operational inefficiencies by analyzing large data sets and identifying areas for improvement. This leads to significant cost savings for businesses.

History of Machine Learning

To fully answer the question “what is machine learning?”, we must retrace our steps. ML can trace its origins back to the early 1950s, when Arthur Samuel took one of the first steps in artificial intelligence and machine learning. His work demonstrated that computers were capable of learning when he taught a program to play checkers. Rather than being explicitly designed to carry out specific commands, the program could learn from its past mistakes and moves to improve its performance. In 1958, Frank Rosenblatt introduced the Perceptron, a simplified model of an artificial neuron. This algorithm could learn to recognize patterns in data and was the first iteration of an artificial neural network. Soviet researchers would complement these innovations in the 1960s through early work on multilayer learning algorithms and the theory of machine learning.

Marvin Minsky and Seymour Papert’s Perceptrons, published in 1969, shone a bright light on the limitations of neural networks and helped usher in an “AI winter” for the field. John Hopfield would help end it with the introduction of his recurrent neural network, the Hopfield network, in 1982. This encouraged David Rumelhart, Geoffrey Hinton, Ronald Williams and others to revive the study of backpropagation and multi-layered neural networks.

From there, a series of landmark breakthroughs followed. In 2014, Ian Goodfellow’s generative adversarial networks (GANs) empowered researchers to generate realistic synthetic data. In 2016, DeepMind’s AlphaGo system defeated the world champion of the board game Go. Since then, the field has continued to develop new deep learning architectures and to expand the applications of machine learning to industries like healthcare, finance and even entertainment. Amid all this fast-paced progress, there is today a growing emphasis on the responsible use of machine learning systems.
