What is the potential problem of using the gradient obtained in the previous question with the stochastic gradient descent?
Added by Chad H.
Step 1
It updates the model parameters iteratively by taking small steps in the direction of the steepest descent, as defined by the negative gradient of the loss function. The key characteristic of SGD is that it uses a random subset of the training data (a mini-batch) Show more…
Show all steps
Your feedback will help us improve your experience
Sri K and 98 other AP CS educators are ready to help you.
Ask a new question
Labs
Want to see this concept in action?
Explore this concept interactively to see how it behaves as you change inputs.
Key Concepts
Recommended Videos
Interpret the significance of the gradient vector.
Sri K.
What are the consequences of a proton gradient and how could a gradient be used in the mitochondria?
Adi S.
Recommended Textbooks
Computer Science and Information Technology
Introduction to Programming Using Python
Computer Science - An Overview
Transcript
18,000,000+
Students on Numerade
Trusted by students at 8,000+ universities
Watch the video solution with this free unlock.
EMAIL
PASSWORD