What is the difference between SGD and backpropagation?

Can you tell me about the difference between SGD and backpropagation?

4 answers

Backpropagation is an efficient method for computing gradients in directed computational graphs such as neural networks. It is not a learning method in itself, but rather a clever computational trick that is often used inside learning methods. It is essentially an implementation of the chain rule of derivatives, which lets you compute all the necessary partial derivatives in time linear in the size of the graph (whereas a naive gradient calculation would scale exponentially with depth).
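To make the chain-rule point concrete, here is a hand-written sketch (illustrative only, no library involved) of a forward pass followed by a backward sweep for a tiny expression; the function name and values are made up for the example.

```python
# Minimal sketch of reverse-mode chain rule for y = (w * x + b)^2,
# computing dy/dw and dy/db in one backward sweep.

def forward_backward(w, b, x):
    # forward pass: store intermediate values
    z = w * x + b          # z = w*x + b
    y = z ** 2             # y = z^2

    # backward pass: apply the chain rule from the output back to the inputs
    dy_dz = 2 * z          # d(z^2)/dz
    dy_dw = dy_dz * x      # dz/dw = x
    dy_db = dy_dz * 1.0    # dz/db = 1
    return y, dy_dw, dy_db

print(forward_backward(w=2.0, b=1.0, x=3.0))  # (49.0, 42.0, 14.0)
```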

SGD is one of many optimization methods, namely a first-order optimizer, which means it is based on the gradient of the objective. Consequently, in the context of neural networks it is usually applied together with backprop to make the updates efficient. You can also apply SGD to gradients obtained in other ways (from sampling, numerical approximation, etc.). Symmetrically, you can use other optimization methods together with backprop, as can anything else that makes use of the gradient/Jacobian.

This common confusion comes from the fact that, for simplicity, people sometimes say "trained with backprop", which in fact means (if they do not specify an optimizer) "trained with SGD, using backprop as the method for computing the gradient". In addition, in old tutorials you can find things like the "delta rule" and other somewhat confusing terms that describe exactly the same thing (since the neural network community was for a long time somewhat independent of the general optimization community).

So you have two levels of abstraction:

  • gradient computation level - this is where backprop comes into play
  • optimization level - this is where methods such as SGD, Adam, Rprop, BFGS, etc. are used, which (if they are first order or higher) make use of the gradient computed above; a small sketch of this separation follows below.
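As a rough illustration of these two levels, the sketch below keeps them in separate routines. All names are hypothetical, and the gradient is written analytically for a linear least-squares model; in a deep network, backprop would produce that same quantity instead.

```python
import numpy as np

def gradient(w, X, y):
    # analytic gradient of the MSE loss mean((X @ w - y)^2);
    # this is the "gradient computation" level (backprop's role in a deep net)
    return 2 * X.T @ (X @ w - y) / len(y)

def sgd_step(w, g, lr=0.1):
    return w - lr * g                 # plain gradient-descent update

def momentum_step(w, g, v, lr=0.1, beta=0.9):
    v = beta * v + g                  # any other first-order rule works too
    return w - lr * v, v

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w

w = np.zeros(3)
for _ in range(200):
    g = gradient(w, X, y)             # level 1: gradient computation
    w = sgd_step(w, g)                # level 2: optimization (swap in momentum_step, Adam, ...)
print(w)                              # close to true_w
```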

Gradient Descent is an optimization method for minimizing a model's loss function.

In Stochastic Gradient Descent (SGD), you use 1 example per iteration to update the weights of your model, based on the error on that example, instead of using the average error over ALL examples at each iteration.

And to perform Stochastic Gradient Descent, you need to compute the gradient of your model.

And here Backpropagation is an efficient method for computing that gradient.

Thus, Backpropagation is often used together with SGD to train your model efficiently.

Summary

Backpropagation is used to get the gradient (which gives you the direction in which to minimize the error), but it does not tell you how to use these gradients.

Gradient Descent uses these gradients to update the weights of your model (for example, by multiplying the gradient by a constant, the learning rate, and subtracting the result from the weights); a sketch follows below.
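A minimal sketch of that summary, assuming a toy one-parameter linear model where a hand-written derivative stands in for backprop and each update uses a single example:

```python
# SGD on a tiny model y_hat = w * x with squared error,
# one example per update, "gradient * constant, subtracted from the weight".

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]   # pairs (x, y) with y = 2*x

w, lr = 0.0, 0.05
for epoch in range(50):
    for x, y in data:                  # SGD: one example per update
        grad = 2 * (w * x - y) * x     # d/dw of (w*x - y)^2
        w -= lr * grad                 # multiply by the learning rate and subtract
print(w)                               # close to 2.0
```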


To add to the many great answers: BackProp is just a trick for computing derivatives of a function of multiple variables, while SGD is a method for finding the minimum of your loss/cost function.
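As a small illustration of that separation, the gradient fed to the descent loop below is written by hand for a simple function, with no backprop (and no network) involved at all:

```python
# Gradient descent needs a gradient, but the gradient does not have to
# come from backprop. Here it is written by hand for f(w) = (w - 3)^2.

def f_grad(w):
    return 2 * (w - 3)        # derivative of (w - 3)^2

w, lr = 0.0, 0.1
for _ in range(100):
    w -= lr * f_grad(w)       # gradient descent step
print(w)                      # converges to 3.0
```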


Stochastic gradient descent is a method for minimizing an objective function; it is often used for classification. The goal is to adjust the parameters so that the function's output on the training cases moves toward the correct values as quickly as possible.

Backpropagation is a method for adjusting the weights in a neural network. The error term is computed at the NN's output, and the weights are then adjusted layer by layer in reverse order.

These two methods are used together to train a standard multilayer neural network, as in the sketch below.
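A hedged sketch of the combination, assuming a tiny one-hidden-layer network with sigmoid activations and a squared-error loss (architecture and hyperparameters are chosen only for illustration): the error is measured at the output, backpropagation carries it backwards through the layers, and gradient descent applies the weight updates.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])          # XOR target

W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)
lr = 0.5

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(20000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # backward pass: output error first, then the hidden layer (reverse order)
    d_out = (out - y) * out * (1 - out)         # error term at the output
    d_h = (d_out @ W2.T) * h * (1 - h)          # error pushed back to the hidden layer

    # gradient-descent update on every weight and bias
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

print(out.round(2).ravel())                     # typically approaches [0, 1, 1, 0]
```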
