• XGBoost stands for eXtreme Gradient Boosting


  1. Set the initial guess; unlike Gradient Boosting, it is always 0.5, regardless of the data
  2. Start with a single node holding the residuals of all data points
  3. Compute the similarity score for that node; for classification, $similarity = \frac{(\sum residuals)^2}{\sum p_i (1 - p_i) + \lambda}$
  4. Split the node
    1. Compute the similarity score for each resulting leaf
    2. Calculate the gain for that split, $gain_{split} = similarity_{left} + similarity_{right} - similarity_{root}$
    3. Return to step 4 and continue splitting until a predetermined maximum depth is reached (6 by default)
    4. Prune the tree
      1. Calculate $gain - \gamma$ for the lowest branch
      2. If it is negative, remove the branch
      3. Continue up the tree until a branch with a positive value is reached
  5. Update the prediction in log-odds: $current\_guess = previous\_guess + lr \cdot current\_prediction$, where $previous\_guess$ starts at $log\_odds(0.5) = 0$ and $current\_prediction$ is the tree's output
  6. Convert back to a probability: $probability = \frac{\exp(current\_guess)}{1 + \exp(current\_guess)}$
  7. Go to step 2 and repeat until a predetermined number of estimators (trees) is reached; both the split scoring and the prediction update are sketched in code below
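
The split scoring and pruning in steps 3-4 can be written out directly. Below is a minimal sketch for binary classification, assuming the standard XGBoost similarity and gain formulas given above; the values of `LAMBDA`, `GAMMA`, the toy data, and the candidate split mask are illustrative choices, not values from these notes.

```python
import numpy as np

LAMBDA = 1.0  # L2 regularization term in the similarity denominator
GAMMA = 0.0   # pruning threshold: a split is removed if gain - GAMMA < 0

def similarity(residuals, probs):
    """Similarity score of a node holding these residuals (classification)."""
    return residuals.sum() ** 2 / ((probs * (1 - probs)).sum() + LAMBDA)

def split_gain(residuals, probs, mask):
    """Gain of splitting a node into a left (mask) and right (~mask) leaf."""
    root = similarity(residuals, probs)
    left = similarity(residuals[mask], probs[mask])
    right = similarity(residuals[~mask], probs[~mask])
    return left + right - root

# Toy data: four points, every prediction starts at the initial guess 0.5
y = np.array([1.0, 1.0, 0.0, 0.0])
p = np.full(4, 0.5)        # current predicted probabilities
residuals = y - p          # residuals the first tree is fit on

mask = np.array([True, True, True, False])   # one candidate split
gain = split_gain(residuals, p, mask)
print(f"gain = {gain:.3f}, keep split: {gain - GAMMA > 0}")
```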
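Steps 5-6 then accumulate each tree's output in log-odds space and squash the result back into a probability. Another minimal sketch; the `tree_outputs` values are made up for illustration (in XGBoost a leaf's output would be $\frac{\sum residuals}{\sum p_i (1 - p_i) + \lambda}$).

```python
import numpy as np

def sigmoid(z):
    """Step 6: convert log-odds back to a probability."""
    return np.exp(z) / (1 + np.exp(z))

lr = 0.3                            # learning rate (eta); 0.3 is XGBoost's default
log_odds = np.log(0.5 / (1 - 0.5))  # initial guess 0.5 -> log-odds of 0
tree_outputs = [1.2, 0.8, 0.5]      # pretend leaf outputs of three fitted trees

for out in tree_outputs:
    log_odds += lr * out            # step 5: current_guess = previous + lr * output
    print(f"probability = {sigmoid(log_odds):.3f}")
```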


• Pros
  1. Handles missing data well: each split learns a default direction for rows with missing values (see the sketch below)
  2. Performs well on datasets from small to large, including complicated ones
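
Because each split learns a default direction for missing values, rows containing NaN can be passed straight to the library with no imputation step. A small sketch using the `xgboost` package; the toy data and hyperparameter values are illustrative.

```python
import numpy as np
from xgboost import XGBClassifier

# NaN entries are routed down each split's learned default direction,
# so the data can be fit as-is.
X = np.array([[1.0, 2.0],
              [np.nan, 3.0],   # missing feature value
              [4.0, np.nan],
              [5.0, 6.0]])
y = np.array([0, 0, 1, 1])

model = XGBClassifier(n_estimators=10, max_depth=6, learning_rate=0.3)
model.fit(X, y)
print(model.predict_proba(X))
```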


• Cons
  1. Handles outliers poorly: each tree fits the residuals, so extreme points produce large residuals that later trees chase