ML Interview
- #math
- #statistics
- Histogram
- Distribution ⭐️
- Uniform Distribution
- Normal Distribution
- Multivariate Normal Distribution
- Multinomial Distribution
- Gaussian Distribution
- Exponential Distribution
- Binomial Distribution
- Poisson Distribution
- Population
- Mean
- Mode
- Median
- Variance
- Standard deviation
- Covariance
- Finding correlation between two datasets or distributions
- Pearson Correlation
- R-squared Value
- Mutual Information
- Cosine Similarity ⭐️
- Covariance
- Jaccard index
- Chi Squared Test
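A quick pure-Python sketch of Pearson correlation, the most commonly asked of the measures above (toy data, stdlib only):

```python
import math

def pearson(xs, ys):
    """Pearson correlation: covariance of x and y divided by the
    product of their standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# A perfectly linear relationship gives r = 1; note that R-squared
# for simple linear regression is just r squared.
print(round(pearson([1, 2, 3, 4], [2, 4, 6, 8]), 6))
```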
- Distance Metric
- Manhattan Distance
- Euclidean Distance
- Cosine Similarity
- Mahalanobis Distance
- Hamming Distance
- Chebyshev Distance
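The distance metrics above (minus Mahalanobis, which needs a covariance matrix) are easy to sketch in pure Python on toy vectors:

```python
import math

def manhattan(a, b):
    # Sum of absolute coordinate differences (L1 norm).
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    # Straight-line distance (L2 norm).
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def chebyshev(a, b):
    # Largest single coordinate difference (L-infinity norm).
    return max(abs(x - y) for x, y in zip(a, b))

def hamming(a, b):
    # Number of positions at which two equal-length sequences differ.
    return sum(x != y for x, y in zip(a, b))

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors (not a true distance).
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

p, q = (0, 0), (3, 4)
print(manhattan(p, q))  # → 7
print(euclidean(p, q))  # → 5.0
print(chebyshev(p, q))  # → 4
```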
- Hypothesis Testing
- Null Hypothesis
- Statistical Test
- p-value
- Odds
- Log(Odds)
- Odds Ratio
- Central Limit Theorem
- Quantile or Percentile
- Log Scale
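The Central Limit Theorem from the list above can be demonstrated in a few lines: means of larger samples from a uniform distribution cluster ever more tightly around the population mean (spread shrinks like 1/sqrt(n)). Sample counts here are arbitrary choices for the demo:

```python
import random
import statistics

random.seed(0)

def sample_mean(n):
    # Mean of n draws from Uniform(0, 1); population mean is 0.5.
    return sum(random.random() for _ in range(n)) / n

# Standard deviation of the sample mean for growing sample sizes.
spreads = {}
for n in (1, 10, 100):
    means = [sample_mean(n) for _ in range(2000)]
    spreads[n] = statistics.stdev(means)
print(spreads)
```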
- #probability
- #visualization
- #shallow-learning
- Supervised Learning
- Linear Regression ⭐️
- Polynomial Regression
- Bayesian Regression
- Logistic Regression ⭐️
- Multinomial Logistic Regression
- Perceptron ⭐️
- Multi Layer Perceptron ⭐️
- GLM
- LDA
- UMAP
- t-SNE
- Support Vector Machine (SVM) ⭐️
- SVR ⭐️
- SVC
- Kernel in SVM
- Polynomial Kernel
- Radial Basis Kernel
- Sigmoid Kernel
- K-nearest Neighbor (KNN)
- Decision Tree ⭐️
- GBM
- Adaboost
- XGBoost
- LightGBM
- CatBoost
- Pruning in Decision Tree
- Ensemble Learning
- Naive Bayes
- Gaussian ⭐️
- Multinomial ⭐️
- Bernoulli
- Complement
- Categorical
- Markov Chain
- Unsupervised Learning
- Clustering
- K-means Clustering ⭐️
- Hierarchical Clustering
- DBSCAN Clustering
- HDBSCAN Clustering
- K-means vs. Hierarchical
- Spectral Clustering
- Gaussian Mixture Model
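K-means, the first clustering algorithm above, fits in a short pure-Python sketch of Lloyd's algorithm (toy 2-D points, fixed seed; a real run would also handle empty clusters and multiple restarts):

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal Lloyd's algorithm on 2-D points."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[i].append(p)
        # Update step: each center moves to its cluster's mean.
        for i, cl in enumerate(clusters):
            if cl:
                centers[i] = tuple(sum(d) / len(cl) for d in zip(*cl))
    return centers, clusters

# Two well-separated blobs; k-means recovers their means.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, clusters = kmeans(points, k=2)
print(sorted(centers))
```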
- Dimensionality Reduction
- Principal Component Analysis (PCA) ⭐️
- UMAP
- HeatMap
- t-SNE plots
- Autoencoder
- Association
- Apriori
- Expectation Maximization
- Clustering
- Semi-supervised Learning
- Recommendation
- Content-Based Filtering ⭐️
- Collaborative Filtering ⭐️
- Metric Learning
- Learning to Rank
- Pointwise Learning to Rank
- Pairwise Learning to Rank
- Listwise Learning to Rank
- Probabilistic Graphical Model
- Conditional Random Field
- Bayesian Network
- #deep-learning
- CNN
- RNN ⭐️
- LSTM ⭐️
- Bidirectional RNN or LSTM ⭐️
- GRU ⭐️
- Autoencoder
- Standard ⭐️
- Variational ⭐️
- PCA vs. Autoencoder
- Overcomplete Autoencoder
- Undercomplete Autoencoder
- Uses: ⭐️
- Anomaly Detection
- Autoencoder for Denoising Images
- Representation Learning
- Attention Reference
- Self Attention ⭐️
- Masked Self Attention ⭐️
- Multihead Self Attention ⭐️
- Encoder-Decoder Attention
- Factorized Self Attention
- Transformer
- Encoder-decoder ⭐️
- Encoder Only ⭐️
- Decoder Only ⭐️
- Contrastive Learning ⭐️
- Graph Convolutional Network (GCN) ⭐️
- Relational GCN
- Graph Attention Network
- Word Embedding
- Activation Function
- Optimizers Ref
- Gradient Descent ⭐️
- Stochastic Gradient Descent or SGD ⭐️
- Mini Batch SGD ⭐️
- Momentum
- Nesterov Momentum
- Adaptive Methods
- Adagrad
- Adadelta (avoids a manually tuned learning rate)
- RMSProp
- Adam
- Adamax
- AMSGrad
- NADAM
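The update rules for a few of the optimizers above, sketched on a toy 1-D quadratic loss (the learning rates and betas are illustrative defaults, not tuned values):

```python
def grad(w):
    # Gradient of the toy loss f(w) = (w - 3)^2, minimized at w = 3.
    return 2 * (w - 3)

# Plain gradient descent: step against the gradient.
w_gd = 0.0
for _ in range(200):
    w_gd -= 0.1 * grad(w_gd)

# SGD with momentum: a velocity term accumulates past gradients.
w_mom, v = 0.0, 0.0
for _ in range(200):
    v = 0.9 * v + grad(w_mom)
    w_mom -= 0.1 * v

# Adam: steps scaled by bias-corrected first and second moment
# estimates of the gradient.
w_adam, m, s = 0.0, 0.0, 0.0
b1, b2, lr, eps = 0.9, 0.999, 0.1, 1e-8
for t in range(1, 501):
    g = grad(w_adam)
    m = b1 * m + (1 - b1) * g
    s = b2 * s + (1 - b2) * g * g
    m_hat = m / (1 - b1 ** t)
    s_hat = s / (1 - b2 ** t)
    w_adam -= lr * m_hat / (s_hat ** 0.5 + eps)

print(w_gd, w_mom, w_adam)  # all three approach the minimum at w = 3
```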
- Generative Adversarial Network
- Genetic Algorithms
- Reinforcement Learning
- #loss-in-ml
- #evaluation
- Extrinsic Evaluation
- Intrinsic Evaluation
- Perplexity ⭐️
- Precision
- Recall
- Accuracy
- F1 Score ⭐️
- Sensitivity ⭐️
- Specificity ⭐️
- True Positive Rate
- False Positive Rate
- Confusion Matrix ⭐️
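All of the classification metrics above fall out of the four confusion-matrix cells; a minimal sketch (the tp/fp/fn/tn counts are made-up example values):

```python
def classification_metrics(tp, fp, fn, tn):
    """Core metrics from the four confusion-matrix cells."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # a.k.a. sensitivity, TPR
    specificity = tn / (tn + fp)       # equals 1 - FPR
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, specificity, accuracy, f1

# Toy counts: 8 true positives, 2 false positives,
# 2 false negatives, 88 true negatives.
p, r, spec, acc, f1 = classification_metrics(tp=8, fp=2, fn=2, tn=88)
print(round(p, 3), round(r, 3), round(f1, 3), round(acc, 3))
```

Note how accuracy (0.96) looks far better than F1 (0.8) here: with 90 negatives to 10 positives, accuracy is dominated by the majority class.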
- Bias & Variance ⭐️
- AUC Score
- ROC Curve
- BLEU Score ⭐️
- ROUGE-N Score ⭐️
- ROUGE-L Score ⭐️
- Meteor Score
- BERTScore
- Mean Squared Error (MSE)
- Mean Absolute Error (MAE)
- Root Mean Squared Error (RMSE)
- Mean Absolute Percentage Error (MAPE)
- R-squared Value
- Root Mean Squared Logarithmic Error (RMSLE)
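The regression metrics above, sketched in one function (toy targets and predictions; note MAPE is undefined when a target is zero):

```python
import math

def regression_metrics(y_true, y_pred):
    n = len(y_true)
    errs = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errs) / n
    mae = sum(abs(e) for e in errs) / n
    rmse = math.sqrt(mse)
    # MAPE is undefined if any true value is zero.
    mape = sum(abs(e / t) for e, t in zip(errs, y_true)) / n
    # R-squared: 1 minus residual sum of squares over total sum of squares.
    mean_t = sum(y_true) / n
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    r2 = 1 - sum(e * e for e in errs) / ss_tot
    return {"mse": mse, "mae": mae, "rmse": rmse, "mape": mape, "r2": r2}

m = regression_metrics([3, 5, 7], [2, 5, 9])
print({k: round(v, 3) for k, v in m.items()})
```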
- Regularization
- Misc.
- Machine Learning vs. Deep Learning
- Cross Validation
- Multi Class Classification
- Internal Covariate Shift
- Discriminative vs. Generative Models
- Kernel Regression
- One Class Classification
- One Class Gaussian
- One Class K-means
- One Class KNN
- One Class SVM
- Gumbel Softmax ⭐️
- Normalization
- Data Normalization
- Batch Normalization
- Layer Normalization
- Generation
- Greedy Decoding ⭐️
- Beam Search ⭐️
- Random Sampling ⭐️
- Minimum Bayes Risk
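Greedy decoding and random sampling from the list above can be contrasted on a tiny hypothetical bigram model (the vocabulary and probabilities are invented for the demo; beam search would instead keep the top-k partial sequences at each step):

```python
import random

# Toy next-token distributions, conditioned only on the previous token.
BIGRAMS = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.3, "end": 0.2},
    "a":   {"cat": 0.4, "dog": 0.4, "end": 0.2},
    "cat": {"end": 1.0},
    "dog": {"end": 1.0},
}

def greedy_decode(start="<s>"):
    # Greedy: always pick the single most likely next token.
    tok, out = start, []
    while tok != "end":
        tok = max(BIGRAMS[tok], key=BIGRAMS[tok].get)
        if tok != "end":
            out.append(tok)
    return out

def sample_decode(start="<s>", seed=0):
    # Random sampling: draw the next token from the distribution,
    # so different seeds give different sequences.
    rng = random.Random(seed)
    tok, out = start, []
    while tok != "end":
        probs = BIGRAMS[tok]
        tok = rng.choices(list(probs), weights=list(probs.values()))[0]
        if tok != "end":
            out.append(tok)
    return out

print(greedy_decode())  # → ['the', 'cat']
```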
- Handling Missing Data ⭐️
- Overfitting ⭐️
- Handling Imbalanced Dataset ⭐️
- SMOTE
- ADASYN
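The core idea of SMOTE from the list above is simple enough to sketch by hand: synthesize new minority-class points by interpolating between a minority point and one of its k nearest minority-class neighbours (toy data, stdlib only; real SMOTE also feature-scales and handles larger k):

```python
import math
import random

def smote(minority, n_new, k=2, seed=0):
    """Minimal SMOTE sketch over 2-D minority-class points."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        p = rng.choice(minority)
        # The k nearest neighbours of p among the other minority points.
        neighbors = sorted((q for q in minority if q != p),
                           key=lambda q: math.dist(p, q))[:k]
        q = rng.choice(neighbors)
        t = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(p, q)))
    return synthetic

minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = smote(minority, n_new=4)
print(new_points)
```

ADASYN extends this by generating more synthetic points near minority examples that are harder to learn (those with many majority-class neighbours).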
- Handling Outliers ⭐️
- Tokenizer
- BytePairEncoding ⭐️
- WordPiece
- SentencePiece
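Byte-pair encoding, the first tokenizer above, reduces to one loop: count adjacent symbol pairs across the corpus, merge the most frequent pair, repeat. A minimal sketch on a classic toy corpus:

```python
from collections import Counter

def bpe_merges(words, num_merges):
    """Minimal BPE sketch: repeatedly merge the most frequent
    adjacent symbol pair across the corpus."""
    # Each word starts as a tuple of single characters.
    corpus = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for word, freq in corpus.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Rewrite every word with the new merged symbol.
        merged = best[0] + best[1]
        new_corpus = Counter()
        for word, freq in corpus.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(merged)
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            new_corpus[tuple(out)] += freq
        corpus = new_corpus
    return merges, corpus

merges, corpus = bpe_merges(["low", "low", "lower", "lowest"], 2)
print(merges)  # → [('l', 'o'), ('lo', 'w')]
```

WordPiece differs mainly in the merge criterion (likelihood gain rather than raw frequency), and SentencePiece works on raw text without pre-tokenization.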
- Parametric vs Non Parametric ⭐️
- Model Based vs. Instance Based Learning ⭐️
- Shallow vs. Deep Learning ⭐️
- Parameter vs. Hyperparameter ⭐️
- Exploding Gradient ⭐️
- Vanishing Gradient ⭐️
- Hyperparameters
- Loss vs. Cost
- Gradient Clipping
- Gradient Accumulation
- Stemming
- Lemmatization
- Causality vs. Correlation
- Negative Sampling
- Data Augmentation
- Data Imputation
- Hinge Loss
- Feature Selection
- Framenet
- Wordnet
- Verbnet
- AMR Graph
- Transfer Learning
- Teacher Forcing ⭐️
- Student Forcing ⭐️
- Curriculum Learning ⭐️
- Weight Initialization
- Xavier
- Normal
- Learning Rate Scheduler ⭐️
- Fine Tuning Speedup
- LoRA ⭐️
- Adapter ⭐️
- Hyperparameter Finding
- Grid Search Hyperparameter Finding
- Random Search
- Bayesian Optimization Hyperparameter Finding
- Genetic Algorithm Hyperparameter Finding
- Gradient based techniques
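Grid search, the simplest strategy above, is just exhaustive evaluation over the Cartesian product of hyperparameter values; here the validation-loss function is a made-up stand-in for training and evaluating a model:

```python
import itertools

def toy_val_loss(lr, batch_size):
    # Hypothetical validation loss; in practice this would train a
    # model with the given hyperparameters and score it on a
    # held-out set. Minimized at lr=0.01, batch_size=32 by design.
    return (lr - 0.01) ** 2 + (batch_size - 32) ** 2 / 1e4

grid = {"lr": [0.001, 0.01, 0.1], "batch_size": [16, 32, 64]}

# Grid search: evaluate every combination, keep the best.
best = min(
    (dict(zip(grid, combo)) for combo in itertools.product(*grid.values())),
    key=lambda cfg: toy_val_loss(**cfg),
)
print(best)  # → {'lr': 0.01, 'batch_size': 32}
```

Random search samples configurations instead of enumerating them, which usually finds good settings faster when only a few hyperparameters matter; Bayesian optimization goes further and uses past evaluations to choose where to look next.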
- Different types of Learning
- Zero Shot Learning
- One Shot Learning
- Few Shot Learning
- Transfer Learning
- Active Learning
- Idea about SOTA Research
- ELBO
- End to End Machine Learning Pipeline
- Convex vs. Non-Convex
- Convex vs. Non-Convex Optimization
- One Hot Encoding
- Label Encoding
- One Hot Encoding vs. Label Encoding
- Inductive Bias
- Selection Bias
- Type 1 error vs. Type 2 error