Mini-Batch SGD

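Mini-batch SGD combines the benefits of both Gradient Descent and Stochastic Gradient Descent (SGD): each update averages the gradient over several samples, so it is less noisy than SGD, but it uses far fewer samples than the full dataset, so it is cheaper than Gradient Descent.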

  • We show the model a small portion of the data (e.g. 64, 128, or 256 samples) at once
  • This small portion is called a mini-batch, and the number of samples in it is the Batch Size
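
The idea can be sketched in a few lines of plain Python. This is only an illustrative sketch, not a reference implementation: the helper names (iterate_minibatches, minibatch_sgd), the toy line y = 3x + 1, the mean-squared-error loss, and the learning rate are all assumptions chosen for the example.

```python
import random


def iterate_minibatches(data, batch_size, rng):
    """Yield shuffled mini-batches; the last one may be smaller than batch_size."""
    indices = list(range(len(data)))
    rng.shuffle(indices)
    for start in range(0, len(indices), batch_size):
        yield [data[i] for i in indices[start:start + batch_size]]


def minibatch_sgd(data, batch_size=64, lr=0.1, epochs=50, seed=0):
    """Fit y ~ w*x + b under mean-squared error using mini-batch SGD (toy setup)."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for batch in iterate_minibatches(data, batch_size, rng):
            # The gradient is averaged over the mini-batch only,
            # not over one sample (SGD) or the whole dataset (Gradient Descent).
            grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
            grad_b = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b


# Hypothetical toy data: 1000 noise-free points on the line y = 3x + 1.
data = [(i / 1000, 3 * (i / 1000) + 1) for i in range(1000)]
w, b = minibatch_sgd(data, batch_size=64)
print(w, b)
```

With 1000 samples and a batch size of 64, each pass over the data (epoch) performs 16 parameter updates: 15 full mini-batches and one final mini-batch of 40 samples. Setting batch_size to 1 would turn this into plain SGD, and setting it to len(data) would turn it into full-batch Gradient Descent.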