
backpropagation algorithm

What is a backpropagation algorithm?

Backpropagation, or backward propagation of errors, is an algorithm that works backward from a network's output nodes to its input nodes to determine how much each connection contributed to an error in the output. It's an important mathematical tool for improving the accuracy of predictions in data mining and machine learning. Essentially, backpropagation is an efficient way to calculate derivatives in a neural network -- the gradient of the error with respect to each weight, which describes how the output changes as the weights are adjusted.
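Concretely (a standard textbook identity, not something stated in this article), backpropagation repeatedly applies the chain rule of calculus. For a single neuron with output a = σ(z), where z = wx + b, the derivative of the error E with respect to the weight w factors as:

    \frac{\partial E}{\partial w} = \frac{\partial E}{\partial a} \cdot \frac{\partial a}{\partial z} \cdot \frac{\partial z}{\partial w}

Chaining these factors backward through every layer is what lets the algorithm compute all the derivatives in a single backward sweep.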

There are two leading types of backpropagation networks:

  • Static backpropagation. Static backpropagation is used in networks that map static inputs to static outputs. Such networks can solve static classification problems, such as optical character recognition (OCR).
  • Recurrent backpropagation. The recurrent backpropagation network is used for fixed-point learning. During training, the weights -- numerical values that determine how much nodes, also referred to as neurons, influence output values -- are adjusted repeatedly until the network settles into a stable, fixed state.

The key difference here is that static backpropagation offers instant mapping, while recurrent backpropagation does not.

Figure: How machine learning, deep learning and neural networks compare.

What is a backpropagation algorithm in a neural network?

Artificial neural networks (ANNs) and deep neural networks use backpropagation as a learning algorithm to compute the gradients needed for gradient descent, an optimization algorithm that iteratively moves toward the minimum (or maximum) of a function.

In a machine learning context, gradient descent helps the system minimize the gap between desired outputs and the outputs the system actually produces. The algorithm tunes the system by adjusting the weight values for various inputs to narrow that difference, also known as the error, between the two.

More specifically, a gradient descent algorithm adjusts a network's parameters gradually, step by step, to reduce the disparity between the desired and achieved outputs. An evaluation metric called a cost function guides this process: it is a mathematical function that measures the error. The algorithm's goal is to determine how the parameters must be adjusted to reduce the cost function and improve overall accuracy.
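As a minimal illustration of that update step (a sketch, not from the original article; the learning rate and the toy quadratic cost are assumptions), the core of gradient descent fits in a few lines of Python:

    import numpy as np

    def gradient_descent_step(weights, gradient, learning_rate=0.1):
        # Move the weights a small step against the gradient of the cost.
        return weights - learning_rate * gradient

    # Toy cost: C(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
    w = np.array([0.0])
    for _ in range(50):
        grad = 2 * (w - 3)
        w = gradient_descent_step(w, grad)
    print(w)  # approaches 3.0, the minimum of the cost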

In backpropagation, this error is propagated backward from the output layer or output neuron through the hidden layers toward the input layer, so that each neuron's weights and biases can be adjusted in proportion to its role in producing the error. Activation functions -- the nonlinear functions that let neurons learn complex patterns -- shape how these adjustments flow backward, because their derivatives appear in the gradient calculation.
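In the standard textbook notation (a reconstruction, not notation used in this article), the error signal δ at each layer and the resulting weight gradients are:

    \delta^{L} = \nabla_a C \odot \sigma'(z^{L})
    \delta^{l} = \left( (W^{l+1})^{\top} \delta^{l+1} \right) \odot \sigma'(z^{l})
    \frac{\partial C}{\partial w^{l}_{jk}} = a^{l-1}_{k} \, \delta^{l}_{j}

Here C is the cost, z^l and a^l are the weighted inputs and activations at layer l, and σ' is the derivative of the activation function -- which is why the choice of activation function affects how adjustments propagate backward.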

The backpropagation algorithm gets its name because the error gradients are propagated backward through the network, from output to input.

What is the objective of a backpropagation algorithm?

Backpropagation algorithms are used extensively to train feedforward neural networks, such as convolutional neural networks, in areas such as deep learning. A backpropagation algorithm is pragmatic because it computes all the gradients needed to adjust a network's weights in one pass, which is far more efficient than computing the gradient with respect to each individual weight separately. It enables the use of gradient methods, such as gradient descent and stochastic gradient descent, to train multilayer networks and update weights to minimize errors.

It's not easy to understand exactly how changing weights and biases affects the overall behavior of an ANN. That was one factor that held back more comprehensive use of neural network applications until the early 2000s, when increases in computing power made it practical to train larger networks and observe these effects.

Today, backpropagation algorithms have practical applications in many areas of artificial intelligence, including OCR, natural language processing and image processing.

Advantages and disadvantages of backpropagation algorithms

There are several advantages to using a backpropagation algorithm, but there are also challenges.

Advantages of backpropagation algorithms

  • Apart from the number of inputs, the algorithm itself has few parameters to tune.
  • They're highly adaptable and efficient, and don't require prior knowledge about the network.
  • They use a standard process that usually works well.
  • They're user-friendly, fast and easy to program.
  • Users don't need to learn any special functions.

Disadvantages of backpropagation algorithms

  • Efficient training depends on how the computation is organized; a matrix-based, mini-batch implementation is typically needed to keep training times reasonable.
  • Data mining is sensitive to noisy data and other irregularities. Unclean data can affect the backpropagation algorithm when training a neural network used for data mining.
  • Performance is highly dependent on input data.
  • Training is time- and resource-intensive.

What is a backpropagation algorithm in machine learning?

Backpropagation is a type of supervised learning, since it requires a known, desired output for each input value in order to calculate the gradient of the loss function, which measures how far the actual output is from the desired one. Supervised learning, the most common training approach in machine learning, uses a training data set with clearly labeled data and specified desired outputs.
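As a small illustrative sketch (mean squared error is an assumption here; it's just one common choice of loss), measuring how actual outputs differ from desired ones might look like this in Python:

    import numpy as np

    def mean_squared_error(desired, actual):
        # Average squared gap between desired and actual outputs.
        return np.mean((desired - actual) ** 2)

    desired = np.array([1.0, 0.0, 1.0])
    actual = np.array([0.8, 0.2, 0.9])
    print(mean_squared_error(desired, actual))  # 0.03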

Along with classifier algorithms such as naive Bayesian filters, K-nearest neighbors and support vector machines, the backpropagation training algorithm has emerged as an important part of machine learning applications that involve predictive analytics. While backpropagation techniques are mainly applied to neural networks, they can also be applied to both classification and regression problems in machine learning. In real-world applications, developers and machine learning experts implement backpropagation algorithms for neural networks using programming languages such as Python.

What is the time complexity of a backpropagation algorithm?

The time complexity of each iteration -- how the algorithm's running time scales with the size of its input -- depends on the network's structure. In the early days of deep learning, a multilayer perceptron was a basic form of neural network, consisting of an input layer, hidden units and an output unit. Its time complexity was low compared with today's networks, which can have exponentially more parameters. The sheer size of a neural network is therefore the primary factor affecting time complexity, but other factors matter as well, such as the size of the training data set.

Essentially, the number of neurons and parameters directly affects how long backpropagation takes. During the forward pass, in which input data moves from the input layer through each successive layer, the computation grows with the number of neurons involved. During the subsequent backward pass, in which parameters are adjusted to reduce the error, more parameters likewise mean more computation.
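As a rough rule of thumb (a standard estimate, not a figure from this article), one combined forward-and-backward pass costs time proportional to the number of weights W in the network, so an epoch over N training examples costs roughly:

    T(\text{epoch}) = O(W \cdot N)

This is why both network size and training-set size dominate the overall training time.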

What is a backpropagation momentum algorithm?

Using gradient descent optimization algorithms to tune weights and reduce an error can be time-consuming. That's why the concept of momentum is used in backpropagation to speed up the process. The idea is that previous weight changes should influence the present direction of movement in weight space. Simply put, an aggregate of past weight changes is used to shape the current one.

During optimization, gradients can change direction from step to step, which slows progress. The momentum technique smooths out these oscillations so optimization keeps moving in a consistent direction and the performance of the neural network improves.
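A minimal sketch of that idea in Python (the learning rate and momentum coefficient values are illustrative assumptions, not values from this article):

    import numpy as np

    def momentum_step(weights, velocity, gradient,
                      learning_rate=0.1, momentum=0.9):
        # Blend the previous update (velocity) with the current gradient.
        velocity = momentum * velocity - learning_rate * gradient
        return weights + velocity, velocity

    w = np.zeros(3)
    v = np.zeros(3)
    grad = np.array([0.5, -0.2, 0.1])
    w, v = momentum_step(w, v, grad)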

What is a backpropagation algorithm pseudocode?

The backpropagation algorithm pseudocode is a basic blueprint that developers and researchers can use to implement the backpropagation process. It's a high-level overview with plain-language instructions and code-like steps for the most essential tasks in the process.

While this overview covers the essentials, an actual implementation is typically far more complex. The pseudocode lays out the steps that need to be performed; it typically reads like a sequential series of actions containing all the core components of the backpropagation process. Each pseudocode instance is written for a specific context and can be translated into any common programming language, such as Python or other object-oriented languages.
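As a hedged illustration of such pseudocode turned into real code (a minimal sketch, not a production implementation; the network size, sigmoid activation, learning rate and XOR data set are all assumptions), here is a small two-layer network trained with backpropagation:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def sigmoid_prime(z):
        s = sigmoid(z)
        return s * (1.0 - s)

    rng = np.random.default_rng(0)

    # Toy data set: XOR inputs and desired outputs.
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)

    # One hidden layer of 4 neurons, one output neuron.
    W1 = rng.normal(size=(2, 4))
    b1 = np.zeros((1, 4))
    W2 = rng.normal(size=(4, 1))
    b2 = np.zeros((1, 1))
    lr = 1.0

    for epoch in range(5000):
        # Forward pass: compute activations layer by layer.
        z1 = X @ W1 + b1
        a1 = sigmoid(z1)
        z2 = a1 @ W2 + b2
        a2 = sigmoid(z2)

        # Backward pass: propagate the error from output to input.
        delta2 = (a2 - y) * sigmoid_prime(z2)         # output-layer error
        delta1 = (delta2 @ W2.T) * sigmoid_prime(z1)  # hidden-layer error

        # Gradient descent step on weights and biases.
        W2 -= lr * a1.T @ delta2
        b2 -= lr * delta2.sum(axis=0, keepdims=True)
        W1 -= lr * X.T @ delta1
        b1 -= lr * delta1.sum(axis=0, keepdims=True)

    print(np.round(a2, 2))  # should approach [[0], [1], [1], [0]]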

What is the Levenberg-Marquardt backpropagation algorithm?

The Levenberg-Marquardt algorithm is another technique that helps adjust neural network weights and biases during training. However, within the context of training neural networks, it's not an alternative or replacement for a backpropagation algorithm, but rather an optimization technique used within backpropagation-based training.

To reduce neural network errors, Levenberg-Marquardt blends gradient information from the gradient descent method with curvature information from the Gauss-Newton algorithm -- in which the shape of the error surface is approximated using matrices built from first derivatives -- to guide weight updates and converge faster than traditional gradient descent alone.
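In standard notation (a reconstruction of the well-known formula, not taken from this article), the Levenberg-Marquardt weight update solves:

    \Delta w = \left( J^{\top} J + \lambda I \right)^{-1} J^{\top} e

where J is the Jacobian of the network errors with respect to the weights, e is the vector of errors and the damping factor λ interpolates between Gauss-Newton behavior (small λ) and gradient descent behavior (large λ).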

Many other machine learning algorithms are also considered supervised machine learning -- for example, decision trees, which classify input data into smaller groups according to their characteristics.

This was last updated in October 2023
