Handling Outliers

[!question] How to find Outliers?

  1. Using standard deviation (if the data is normally distributed)
  2. Using Z-score
  3. Using Box Plot
  4. Using Interquartile Range (IQR)
  5. Using Quantile or Percentile
  6. Algorithms to detect outliers
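
A minimal sketch of two of the methods listed above, Z-score and IQR, using only the Python standard library. The function names, the sample data, and the cutoffs (`threshold=3.0` for the Z-score, `k=1.5` for the IQR fences) are assumptions chosen for illustration; they are common conventions, not values fixed by these notes.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Return values whose absolute Z-score exceeds `threshold`.

    Assumes roughly normal data; threshold=3.0 is a conventional choice.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # all values identical, nothing can be an outlier
    return [x for x in values if abs((x - mean) / std) > threshold]


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is the usual box-plot whisker rule (an assumption here).
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in values if x < lower or x > upper]


# Hypothetical sample: eleven values near 11-12 and one extreme value.
data = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12, 95]

print(zscore_outliers(data))  # [95]
print(iqr_outliers(data))     # [95]
```

The IQR rule does not assume normality, so it is often the safer default for skewed data. On small samples the Z-score of a single extreme point is bounded, so a threshold of 3 can miss outliers that the IQR rule catches.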

[!def] Machine Learning algorithms sensitive to outliers

  1. Linear Regression
  2. Logistic Regression
  3. Support Vector Machine (SVM) (Hard Margin)
  4. K-Nearest Neighbors (KNN)
  5. K-means Clustering
  6. Hierarchical Clustering
  7. Principal Component Analysis (PCA)
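
To see why a method such as linear regression is listed as sensitive, here is a small sketch that fits an ordinary least-squares line by hand, once on clean data and once with a single corrupted point. The helper `fit_line` and the data are hypothetical, made up for illustration only.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


xs = [1, 2, 3, 4, 5]
ys_clean = [2, 4, 6, 8, 10]    # exactly y = 2x
ys_outlier = [2, 4, 6, 8, 50]  # last point replaced by an outlier

clean_slope, _ = fit_line(xs, ys_clean)
outlier_slope, _ = fit_line(xs, ys_outlier)

print(clean_slope)    # 2.0
print(outlier_slope)  # 10.0
```

One bad point out of five moves the slope from 2 to 10. Squared-error loss penalises large residuals heavily, so the fit bends toward the outlier.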

[!def] Machine Learning algorithms NOT sensitive to outliers

  1. Decision Tree
  2. Random Forest
  3. Support Vector Machine (SVM) (Soft Margin)
  4. XGBoost
  5. AdaBoost
  6. Naive Bayes


  1. How the algorithms handle outliers