Mini-batch SGD
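Mini-batch SGD combines the benefits of Gradient Descent and Stochastic Gradient Descent (SGD): each update is cheaper than a full pass over the dataset, and the gradient is less noisy than one computed from a single sample.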
- The model is shown a small portion of the data (for example 64, 128, or 256 samples) at once
- This small portion is called a mini-batch, and the number of samples in it is the batch size
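
As a rough illustration of the idea, here is a minimal sketch of mini-batch SGD in plain Python, fitting a one-variable linear model. The function name `minibatch_sgd`, the toy data, and the learning rate and epoch values are all assumptions chosen for this example, not something prescribed by the notes above. In practice a numerical library would usually vectorise the inner loop.

```python
import random

def minibatch_sgd(xs, ys, batch_size=64, lr=0.1, epochs=30, seed=0):
    """Fit y ~ w * x + b with mini-batch SGD on mean squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the samples in a new order every epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]  # last batch may be smaller
            # Accumulate the gradient over this mini-batch only
            grad_w = 0.0
            grad_b = 0.0
            for i in batch:
                error = w * xs[i] + b - ys[i]
                grad_w += 2.0 * error * xs[i]
                grad_b += 2.0 * error
            # One parameter update per mini-batch, using the averaged gradient
            w -= lr * grad_w / len(batch)
            b -= lr * grad_b / len(batch)
    return w, b

# Hypothetical toy data: y = 2x + 1 plus a little Gaussian noise
data_rng = random.Random(42)
xs = [data_rng.uniform(-1.0, 1.0) for _ in range(1000)]
ys = [2.0 * x + 1.0 + data_rng.gauss(0.0, 0.01) for x in xs]

w, b = minibatch_sgd(xs, ys, batch_size=64)
```

In this sketch, setting `batch_size=1` gives plain SGD (one sample per update), while `batch_size=len(xs)` gives full-batch Gradient Descent (one update per pass over the data). Values in between are the mini-batch case.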