A tour through classical machine learning — implemented and explained from the ground up. Each notebook tackles a real dataset, and this README walks you through what each method actually does, in plain language.
- Author: Mohammad Javad Maheronnaghsh
📘 New to ML? Every method below starts with a "💡 In plain English" box — no formulas, no jargon — so you can build intuition before touching the code.
Machine learning is about learning patterns from data instead of being explicitly programmed. The two main families:
- Supervised learning — you have labeled examples (input → correct answer) and the model learns to predict the answer for new inputs. Splits into:
- Regression → predict a number (e.g. house price).
- Classification → predict a category (e.g. spam / not-spam).
- Unsupervised learning — no labels; the model finds structure on its own (e.g. grouping similar items, compressing data).
| Folder | Methods | Type |
|---|---|---|
| Regression/ | Linear · Lasso/Ridge · Polynomial & Splines | Supervised (number) |
| Classification/ | Logistic Regression · Decision Tree · Bagging · AdaBoost · SVM · Neural Nets | Supervised (category) |
| Classification/ (PCA, K-Means) | PCA · K-Means | Unsupervised |
| LDA and QDA/ | Linear & Quadratic Discriminant Analysis | Supervised (category) |
| NLP/ | Naive Bayes text classification | Supervised (text) |
📚 The classic textbook An Introduction to Statistical Learning (ISLR) is included as a reference.
💡 In plain English: Regression draws the "best-fit" relationship between inputs and a numeric output, so you can predict that output for new inputs. Linear regression fits a straight line; if the trend in the data bends, you need a curve instead.

Insurance data — how charges relate to age. Regression learns the trend through points like these.
- Linear Regression — the foundational straight-line fit.
- Lasso & Ridge — linear regression with regularization: a penalty that keeps the model simple to avoid overfitting (memorizing noise instead of learning the trend). Lasso can even switch off useless features entirely.
- Polynomial Regression & Splines — fit curves instead of lines for non-linear data. But more flexibility isn't always better:

Choosing complexity: test error vs. polynomial degree. Too simple underfits; too complex overfits — the sweet spot is in between.
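A minimal sketch of the experiment behind a plot like this one, assuming synthetic data (a noisy sine curve) rather than the notebook's dataset. The degrees 1, 3, and 15, the 50/50 split, and all variable names are illustrative choices, not taken from the notebooks.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Assumption: synthetic data, one period of a sine wave plus Gaussian noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(60, 1))
y = np.sin(2 * np.pi * X[:, 0]) + rng.normal(scale=0.2, size=60)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# Fit one polynomial model per degree and record its error on held-out data.
test_mse = {}
for degree in (1, 3, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    test_mse[degree] = mean_squared_error(y_test, model.predict(X_test))
    print(f"degree={degree:2d}  test MSE={test_mse[degree]:.3f}")
```

On data like this, the straight line (degree 1) underfits, so the cubic should score a clearly lower test error. A very high degree fitted to only 30 training points will often do worse again, which is the overfitting side of the curve.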
💡 In plain English: Classification sorts inputs into buckets. Logistic regression — despite the name — is a classifier: it outputs the probability that something belongs to a class (e.g. "85% likely diabetic").
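A short sketch of what "outputs the probability" looks like in scikit-learn. The Pima diabetes data is not bundled with scikit-learn, so this example assumes the built-in breast cancer dataset as a stand-in; the scaling step and `max_iter=1000` are illustrative choices.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: a bundled binary dataset stands in for the diabetes data.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)

# One row per sample, one column per class; each row sums to 1.
proba = clf.predict_proba(X_test[:3])
# The predicted label is simply the class with the higher probability.
labels = clf.predict(X_test[:3])

print(np.round(proba, 3))
print(labels)
```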

Before modeling, explore the data — here a correlation heatmap of the Pima diabetes features shows which ones move together.
- Logistic Regression — probabilistic linear classifier (on the diabetes dataset).
- Decision Tree — a flowchart of yes/no questions that splits data into classes.
- Bagging — train many trees on random subsets and average them to reduce variance (this is the idea behind Random Forests).
- AdaBoost — train models in sequence, each one focusing on the mistakes of the last (boosting).
💡 Why ensembles (Bagging / Boosting)? One model can be wrong; a committee of models that vote is usually more accurate and stable.
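The committee idea can be tried with a small comparison like the one below. It is a sketch under assumptions: the bundled breast cancer dataset stands in for the notebook's data, and 100 estimators with 5-fold cross-validation are arbitrary illustrative settings. `BaggingClassifier` uses decision trees by default and trains each one on a bootstrap sample (a random subset drawn with replacement).

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Assumption: a bundled dataset, not the one used in the notebooks.
X, y = load_breast_cancer(return_X_y=True)

models = {
    "single tree": DecisionTreeClassifier(random_state=0),
    # Bagging: many trees, each fit on a bootstrap sample, combined by voting.
    "bagging": BaggingClassifier(n_estimators=100, random_state=0),
    # AdaBoost: weak learners fit in sequence, reweighting misclassified points.
    "adaboost": AdaBoostClassifier(n_estimators=100, random_state=0),
}

# Mean 5-fold cross-validated accuracy for each model.
scores = {name: cross_val_score(m, X, y, cv=5).mean() for name, m in models.items()}
for name, score in scores.items():
    print(f"{name:12s} accuracy={score:.3f}")
```

The two ensembles usually score a few points higher than the single tree here, though the exact numbers depend on the scikit-learn version and the random seed.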
💡 Support Vector Machine (SVM): finds the boundary that separates two classes with the widest possible margin — the cleanest dividing line.
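To make "widest possible margin" concrete, here is a sketch on six hand-made 2D points (toy data invented for illustration). It assumes a linear kernel and a very large `C` to approximate a hard margin; for a linear SVM with weight vector `w`, the margin width is `2 / ||w||`.

```python
import numpy as np
from sklearn.svm import SVC

# Toy data (an assumption): two small, clearly separated groups.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
              [3.0, 3.0], [4.0, 3.0], [3.0, 4.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# A very large C leaves almost no tolerance for margin violations.
clf = SVC(kernel="linear", C=1e6)
clf.fit(X, y)

w = clf.coef_[0]
margin_width = 2.0 / np.linalg.norm(w)

print("support vectors:\n", clf.support_vectors_)
print("margin width:", round(margin_width, 3))
```

For these points the closest gap between the two groups is 5/√2 ≈ 3.54, so the reported margin width should be very close to that value.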
💡 PCA (Principal Component Analysis): a way to compress data by keeping only its most important directions of variation. Applied to face images, the "principal components" look like ghostly template faces (eigenfaces) that can be combined to approximately reconstruct any face in the dataset.

Left: an original face. Right: a PCA component ("eigenface") — a reusable building block PCA learns from the dataset.
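The compress-then-reconstruct idea can be sketched as follows. This assumes scikit-learn's bundled 8×8 digit images instead of a face dataset (so the components are "eigendigits" rather than eigenfaces), and 16 components is an arbitrary choice.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

# Assumption: bundled digit images stand in for face images.
X = load_digits().data                      # shape (1797, 64)

pca = PCA(n_components=16, svd_solver="full").fit(X)
codes = pca.transform(X)                    # compressed: 16 numbers per image
X_hat = pca.inverse_transform(codes)        # rebuilt from the 16 components

# Fraction of the total variance kept by the 16 components.
explained = pca.explained_variance_ratio_.sum()
# Reconstruction error relative to the spread of the data around its mean.
rel_error = np.linalg.norm(X - X_hat) / np.linalg.norm(X - X.mean(axis=0))

print("components shape:", pca.components_.shape)
print(f"variance kept: {explained:.3f}, relative error: {rel_error:.3f}")
```

Each row of `pca.components_` can be reshaped to 8×8 and plotted as an image, which is the digit analogue of the eigenface shown above. The squared relative error equals the share of variance that was thrown away.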
💡 K-Means (clustering): an unsupervised method that groups data into K clusters by repeatedly assigning points to the nearest cluster center and updating the centers.
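The assign-then-update loop is short enough to write out in NumPy. This is a bare-bones sketch, not the notebook's implementation: the function names `assign` and `kmeans` are hypothetical, the starting centers are passed in explicitly, and the six data points are made up.

```python
import numpy as np

def assign(X, centers):
    """Return, for each point, the index of its nearest center."""
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(X, init_centers, n_iters=100):
    centers = np.asarray(init_centers, dtype=float).copy()
    for _ in range(n_iters):
        # Assignment step: attach every point to its nearest center.
        labels = assign(X, centers)
        # Update step: move each center to the mean of its points.
        new_centers = centers.copy()
        for j in range(len(centers)):
            members = X[labels == j]
            if len(members) > 0:          # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break                          # centers stopped moving: converged
        centers = new_centers
    return centers, assign(X, centers)

# Toy data (an assumption): two obvious groups of three points.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [9.0, 9.0], [9.0, 10.0], [10.0, 9.0]])

# Deliberately poor start: both initial centers sit in the first group.
centers, labels = kmeans(X, init_centers=X[:2])
print(centers)
print(labels)
```

Even from that poor start, the loop separates the two groups after a few iterations. On harder data the result depends on the starting centers, which is why library implementations typically run several initializations.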
💡 Neural Networks: layers of simple units that together learn complex patterns — the foundation of deep learning.
📂 Classification/SVM - PCA - KMeans - Neural Netwroks.ipynb
💡 In plain English: LDA and QDA classify by modeling what each class's data looks like (its distribution) and asking "which class most likely produced this point?" LDA assumes every class has the same spread (covariance), which yields a straight boundary between classes; QDA lets each class have its own spread, which allows a curved boundary.
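A sketch of a case where the difference matters, assuming synthetic data: two classes centered on the same point, one tightly packed and one widely spread. No straight line separates them well, but a curved (roughly circular) boundary does. Accuracy is measured on the training data only, to keep the example short.

```python
import numpy as np
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)

# Assumption: synthetic classes with the same mean but very different spread.
rng = np.random.default_rng(0)
X_tight = rng.normal(0.0, 0.5, size=(300, 2))   # class 0: small blob
X_wide = rng.normal(0.0, 3.0, size=(300, 2))    # class 1: large blob around it
X = np.vstack([X_tight, X_wide])
y = np.array([0] * 300 + [1] * 300)

lda_acc = LinearDiscriminantAnalysis().fit(X, y).score(X, y)
qda_acc = QuadraticDiscriminantAnalysis().fit(X, y).score(X, y)

print(f"LDA accuracy: {lda_acc:.3f}")
print(f"QDA accuracy: {qda_acc:.3f}")
```

LDA should land near chance level here, while QDA should do much better because it models each class's covariance separately.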
💡 In plain English: To classify text (e.g. positive vs. negative review), Naive Bayes counts how typical each word is for each class and multiplies the evidence together. It's "naive" because it pretends words are independent — yet it works remarkably well. Often paired with TF-IDF, which weights words by how informative they are.
📂 NLP/
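A minimal sketch of the TF-IDF plus Naive Bayes combination described above, using scikit-learn's `TfidfVectorizer` and `MultinomialNB`. The six "reviews" are made up for illustration and are far too few for a real model.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy training data (an assumption): invented one-line reviews.
train_texts = [
    "great movie loved it",
    "wonderful acting and a great story",
    "loved the wonderful soundtrack",
    "terrible movie hated it",
    "boring plot and awful acting",
    "hated the awful ending",
]
train_labels = ["pos", "pos", "pos", "neg", "neg", "neg"]

# TF-IDF turns each text into weighted word counts; Naive Bayes learns
# how typical each word is for each class.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)

pred = model.predict(["loved the great story", "awful boring movie"])
print(pred)
```

The first test sentence only shares its distinctive words with the positive reviews and the second only with the negative ones, so the model should label them `pos` and `neg` respectively.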
A model improves by minimizing a loss (a score of how wrong it is). Which loss you use depends on the task:
| Loss | Used for | Intuition |
|---|---|---|
| MSE (Mean Squared Error) | Regression | Penalizes big errors quadratically |
| Cross-Entropy (binary / categorical) | Classification | Punishes confident wrong answers |
| Hinge Loss | SVMs | Penalizes points that are misclassified or fall inside the margin |
| Logistic Loss | Logistic Regression | Binary cross-entropy under another name; a smooth, probabilistic stand-in for classification error |
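The losses in the table fit in a few lines of NumPy each. This is a sketch with hypothetical function names; it assumes labels in {0, 1} for cross-entropy and labels in {-1, +1} with raw scores for the hinge and logistic losses.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error for regression."""
    return np.mean((y_true - y_pred) ** 2)

def binary_cross_entropy(y01, p, eps=1e-12):
    """Cross-entropy for labels in {0, 1} and predicted probabilities p."""
    p = np.clip(p, eps, 1 - eps)          # avoid log(0)
    return -np.mean(y01 * np.log(p) + (1 - y01) * np.log(1 - p))

def hinge(y_pm1, score):
    """Hinge loss for labels in {-1, +1} and raw scores."""
    return np.mean(np.maximum(0.0, 1.0 - y_pm1 * score))

def logistic_loss(y_pm1, score):
    """Logistic loss for labels in {-1, +1} and raw scores."""
    return np.mean(np.log1p(np.exp(-y_pm1 * score)))

# MSE: errors of 1 and 2 give (1 + 4) / 2.
print(mse(np.array([3.0, 5.0]), np.array([2.0, 7.0])))

# Cross-entropy: a confident wrong answer costs far more than a confident right one.
right = binary_cross_entropy(np.array([1.0]), np.array([0.9]))
wrong = binary_cross_entropy(np.array([1.0]), np.array([0.1]))
print(right, wrong)

# Logistic loss on scores equals cross-entropy on sigmoid(scores).
y_pm1 = np.array([1.0, -1.0, 1.0])
scores = np.array([2.0, 0.5, -1.0])
p = 1.0 / (1.0 + np.exp(-scores))
print(logistic_loss(y_pm1, scores), binary_cross_entropy((y_pm1 + 1) / 2, p))
print(hinge(y_pm1, scores))
```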
💡 Good to know: many common loss functions come from Maximum Likelihood Estimation (MLE). For example, Cross-Entropy arises from the Bernoulli distribution, and MSE from the Normal (Gaussian) distribution.
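For readers who want the one formula behind this note, here is a sketch of the two derivations. The notation is an assumption introduced here: $\hat{p}_i$ is the predicted probability of class 1, $\hat{y}_i$ is the predicted number, and $\sigma^2$ is a fixed noise variance. In both cases, maximizing the likelihood is the same as minimizing the negative log-likelihood.

```latex
% Bernoulli likelihood -> binary cross-entropy
p(y_i \mid x_i) = \hat{p}_i^{\,y_i}\,(1-\hat{p}_i)^{1-y_i}
\quad\Longrightarrow\quad
-\log \prod_{i=1}^{n} p(y_i \mid x_i)
  = -\sum_{i=1}^{n} \Big[\, y_i \log \hat{p}_i + (1-y_i)\log(1-\hat{p}_i) \Big]

% Gaussian likelihood -> squared error
p(y_i \mid x_i) = \frac{1}{\sqrt{2\pi\sigma^2}}
  \exp\!\left(-\frac{(y_i-\hat{y}_i)^2}{2\sigma^2}\right)
\quad\Longrightarrow\quad
-\log \prod_{i=1}^{n} p(y_i \mid x_i)
  = \frac{1}{2\sigma^2}\sum_{i=1}^{n} (y_i-\hat{y}_i)^2 + \frac{n}{2}\log(2\pi\sigma^2)
```

The first right-hand side is the cross-entropy summed over samples. In the second, the last term does not depend on the predictions, so minimizing it over $\hat{y}_i$ is the same as minimizing the MSE.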
- Logistic Regression is classification, not regression — the name is historical.
- The three classical linear regression variants covered here are plain Linear (ordinary least squares), Lasso, and Ridge.
- Regularization (Lasso/Ridge) is one of the main tools against overfitting.
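The practical difference between the two penalties shows up in the fitted coefficients. The sketch below assumes synthetic data in which only the first 2 of 10 features matter; the `alpha` values are illustrative and would normally be tuned by cross-validation.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Assumption: synthetic data where 8 of the 10 features are pure noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
true_coef = np.array([3.0, -2.0, 0, 0, 0, 0, 0, 0, 0, 0])
y = X @ true_coef + rng.normal(scale=0.5, size=200)

lasso = Lasso(alpha=0.2).fit(X, y)   # L1 penalty: can set coefficients to exactly 0
ridge = Ridge(alpha=1.0).fit(X, y)   # L2 penalty: shrinks coefficients toward 0

n_zero_lasso = int(np.sum(lasso.coef_ == 0))
n_zero_ridge = int(np.sum(ridge.coef_ == 0))

print("lasso:", np.round(lasso.coef_, 2))
print("ridge:", np.round(ridge.coef_, 2))
print("exact zeros, lasso vs ridge:", n_zero_lasso, n_zero_ridge)
```

Lasso should zero out most or all of the noise features while keeping the two real ones, whereas Ridge keeps every coefficient small but nonzero.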
- What is ML, in simple words (English)
- Papers With Code — methods catalog
- ML in simple words (Persian)
- Dr. Sharifi Zarchi & Mr. Azarkhalili's ML course (Sharif University)
```bash
git clone https://github.com/mjmaher987/Machine-Learning.git
cd Machine-Learning
pip install numpy pandas scikit-learn matplotlib seaborn notebook
# open any notebook, e.g.:
jupyter notebook "Regression/Linear Regression.ipynb"
```

If you're new, follow the folders in order: Regression → Classification → SVM/PCA/K-Means → LDA/QDA → NLP.
- Upload lecture notes & useful slides
- Add curated courses (videos) with assignments and solutions
- Grow into a question bank for teaching assistants
