This notebook presents a step-by-step introduction to data mining using the Adult Income dataset from the UCI Machine Learning Repository. It was originally created as a project assignment for the Artificial Intelligence course taught by Prof. Mehdi Ghatee, under my supervision, in the Faculty of Computer Science at Amirkabir University.
The notebook consists of the following parts:
- Dataset Loading: Loading the Adult Income dataset directly from the UCI Machine Learning Repository.
- Exploratory Data Analysis (EDA): Performing numerical analysis and visual exploration to understand the data's structure, distributions, and correlations.
- Data Preprocessing: Handling missing values, removing duplicates, detecting outliers, and preparing features for modeling.
- Classifiers: Introducing and running several machine learning classifiers, such as:
  - Support Vector Machines (SVM)
  - K-Nearest Neighbors (KNN)
- Model Evaluation: Assessing model performance using several evaluation metrics, including:
  - F1-Score
  - Precision & Recall
  - Confusion Matrix
  - ROC curves and AUC
and much more!
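As a small preview of the evaluation part, the sketch below shows how precision, recall, and F1-score follow from the four cells of a binary confusion matrix. It is a minimal illustration in plain Python, not the code used in the notebook: the helper names (`ConfusionMatrix`, `confusion_matrix`, `precision`, `recall`, `f1_score`) are hypothetical, the six labels are made up for the example rather than taken from the dataset, and treating `>50K` as the positive class is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ConfusionMatrix:
    """Cell counts for a binary task (hypothetical helper for illustration)."""
    tp: int  # predicted positive, actually positive
    fp: int  # predicted positive, actually negative
    fn: int  # predicted negative, actually positive
    tn: int  # predicted negative, actually negative


def confusion_matrix(y_true, y_pred, positive=">50K"):
    """Count the four outcomes; `positive` is assumed to be the >50K class."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        else:
            if actual == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionMatrix(tp, fp, fn, tn)


def precision(cm):
    """Share of positive predictions that are correct."""
    denom = cm.tp + cm.fp
    return cm.tp / denom if denom else 0.0


def recall(cm):
    """Share of actual positives that were found."""
    denom = cm.tp + cm.fn
    return cm.tp / denom if denom else 0.0


def f1_score(cm):
    """Harmonic mean of precision and recall."""
    p, r = precision(cm), recall(cm)
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Made-up labels, purely illustrative (not rows from the Adult Income data).
y_true = [">50K", "<=50K", ">50K", "<=50K", "<=50K", ">50K"]
y_pred = [">50K", "<=50K", "<=50K", ">50K", "<=50K", ">50K"]

cm = confusion_matrix(y_true, y_pred)
print(cm)
print(f"precision={precision(cm):.3f} recall={recall(cm):.3f} f1={f1_score(cm):.3f}")
```

Returning 0.0 when a denominator is zero is one possible convention for the undefined case; other tools may warn or return a different value.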
Feedback, suggestions, and contributions are always welcome! Feel free to open an issue, leave a comment, or reach out if you're interested in collaborating or suggesting improvements.