MetroPT IsolationForest Simple App

This is a small, Streamlit-friendly project for training and serving an IsolationForest model on local MetroPT-3 data.

The project intentionally keeps the moving parts simple:

config.py holds shared paths, features, and thresholds.
setup_project.py checks folders, splits the dataset, and trains the first model if needed.
model.py trains, saves, loads, and predicts with an IsolationForest.
drift.py compares a current dataframe with the initial training dataframe.
api.py serves the saved model with FastAPI.
simulation.py runs the whole workflow and simulates one day, one month, or three months of sensor readings.
app/streamlit_app.py gives you a visual UI for the same flow.

Setup

Install dependencies:

python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt

This project is built specifically for the MetroPT-3 dataset in CSV format. Put the file at:

dataset/MetroPT3.csv

If your file is somewhere else, edit DATASET_PATH in src/metropt_app/config.py.

The timestamp column name must be configured in src/metropt_app/config.py:

TIMESTAMP_COLUMN = "timestamp"

The project assumes the MetroPT-3 CSV contains the configured feature columns in FEATURES. It does not perform generic dataset-format detection, timestamp-column discovery, or missing-feature validation.
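Because no validation is performed, it may be worth checking the CSV header yourself before running setup. The following is a minimal standalone sketch using only the standard library. The function name missing_columns and the column names in the demo are hypothetical placeholders; the real list is whatever FEATURES and TIMESTAMP_COLUMN contain in config.py.

```python
import csv
import io


def missing_columns(csv_file, required):
    """Return the required column names that are absent from the CSV header."""
    reader = csv.reader(csv_file)
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


# Demo on an in-memory CSV. These column names are placeholders,
# not the project's actual FEATURES list.
sample = io.StringIO("timestamp,sensor_a,sensor_b\n2020-01-01 00:00:00,1.0,2.0\n")
print(missing_columns(sample, ["timestamp", "sensor_a", "sensor_c"]))  # prints ['sensor_c']
```

For the real file, the same function could be called on open("dataset/MetroPT3.csv", newline="") with the timestamp column name plus the configured feature names.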

Run With Streamlit

streamlit run app/streamlit_app.py

Use the tabs in order:

Setup: split the dataset and check files.
Train: train and save the model.
Predict: start the API or run a local prediction.
Simulate: simulate one day, one month, or three months.
Logs: inspect simulation summaries.

The Train tab also shows recent MLflow runs. You can start the MLflow UI, inspect model history, and restore a previous logged model into models/isolation_forest.joblib.

Run From The Command Line

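The commands below use Windows cmd.exe syntax (set PYTHONPATH=src). In PowerShell, which the activation step in Setup uses, run $env:PYTHONPATH = "src" instead, because set does not change environment variables there.

Run a one-day local simulation: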

set PYTHONPATH=src
python -m metropt_app.simulation --mode day

Run a one-month local simulation:

set PYTHONPATH=src
python -m metropt_app.simulation --mode month

Run a three-month local simulation:

set PYTHONPATH=src
python -m metropt_app.simulation --mode three_months

Run known failure-period simulations:

set PYTHONPATH=src
python -m metropt_app.simulation --mode failure_2020_04_18
python -m metropt_app.simulation --mode failure_2020_05_29_30
python -m metropt_app.simulation --mode failure_2020_07_15

Start the API manually:

set PYTHONPATH=src
uvicorn metropt_app.api:app --host 127.0.0.1 --port 8000

Run a one-month simulation through an already running API:

set PYTHONPATH=src
python -m metropt_app.simulation --mode month --use-api

API docs will be at:

http://127.0.0.1:8000/docs

MLflow Model Tracking

Every model training run is logged to MLflow under:

mlruns/

The project logs:

model parameters
training row count
selected feature list
training anomaly ratio
training score summary statistics
the trained sklearn model artifact
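To make the anomaly ratio and score summary concrete, here is a small sketch of how such values could be derived from IsolationForest output. It assumes scikit-learn's label convention (1 for normal, -1 for anomaly). The function name, the metric names, and the choice of statistics are illustrative assumptions; the README does not say which statistics the project actually logs.

```python
import statistics


def training_summary(predictions, scores):
    """Summarize model output on the training set.

    predictions: labels in scikit-learn's convention (1 = normal, -1 = anomaly).
    scores: one anomaly score per row (for example from decision_function).
    """
    if not predictions or len(predictions) != len(scores):
        raise ValueError("predictions and scores must be non-empty and the same length")
    anomalies = sum(1 for label in predictions if label == -1)
    return {
        # Metric names below are hypothetical, not the project's MLflow keys.
        "train_rows": len(predictions),
        "train_anomaly_ratio": anomalies / len(predictions),
        "score_min": min(scores),
        "score_max": max(scores),
        "score_mean": statistics.fmean(scores),
        "score_median": statistics.median(scores),
    }


summary = training_summary([1, 1, -1, 1], [0.12, 0.08, -0.05, 0.10])
print(summary["train_anomaly_ratio"])  # prints 0.25
```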

Start the MLflow UI with:

set PYTHONPATH=src
set MLFLOW_ALLOW_FILE_STORE=true
mlflow ui --backend-store-uri ./mlruns --port 5000

Older logged models can be restored from the Streamlit Train tab by selecting a run and clicking Restore selected MLflow model.

Drift Behavior

The simulation checks drift in non-overlapping windows of 21,600 rows. Each window is compared with the full first month of training data, and the window is reported as drifted if any single feature drifts.
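The windowing described above can be sketched as follows. This is an illustration under stated assumptions, not the code in drift.py: the per-feature test used here (a shift in the mean, measured in reference standard deviations) is a stand-in because the README does not name the drift metric, the function names and the threshold are hypothetical, and a trailing partial window is simply skipped.

```python
import statistics

WINDOW_ROWS = 21_600  # window size stated in the README


def iter_windows(rows, size=WINDOW_ROWS):
    """Yield consecutive, non-overlapping slices of `size` rows.

    Assumption: a trailing partial window is skipped.
    """
    for start in range(0, len(rows) - size + 1, size):
        yield rows[start:start + size]


def drifted_features(reference, window, features, threshold=0.5):
    """Return the features whose window mean moved more than `threshold`
    reference standard deviations. Stand-in test, not the project's metric."""
    drifted = []
    for name in features:
        ref_values = [row[name] for row in reference]
        win_values = [row[name] for row in window]
        ref_std = statistics.pstdev(ref_values) or 1.0  # guard against zero spread
        shift = abs(statistics.fmean(win_values) - statistics.fmean(ref_values)) / ref_std
        if shift > threshold:
            drifted.append(name)
    return drifted


# Tiny demo with a window size of 3 instead of 21,600.
reference = [{"pressure": v} for v in (1.0, 1.1, 0.9, 1.0)]
stream = [{"pressure": v} for v in (1.0, 1.1, 0.9, 3.0, 3.1, 2.9, 1.0)]
for window in iter_windows(stream, size=3):
    # A window counts as drifted if any feature drifts.
    print(bool(drifted_features(reference, window, ["pressure"])))
# prints False, then True (the seventh row does not fill a window)
```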

Retraining is calendar-month based: drift checks during a simulated month contribute to one monthly retrain decision. If at least one window in that month says should_retrain, the model may retrain once at the end of the simulated month.

Retraining is blocked if any window in that month is marked as possible_failure, because failure-like data should not be learned as normal behavior. When retraining is allowed, the model trains only on rows from the last month that the current model predicted as normal.

Each drift window also logs its anomaly ratio. If more than 99% of the rows in a window are anomalies (an anomaly ratio above 0.99), the window is marked as possible_failure.
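The monthly retraining rules above can be summarized in a short sketch. Only should_retrain, possible_failure, and the 0.99 threshold come from this README; the class and function names are hypothetical, and the real logic in simulation.py may be organized differently.

```python
from dataclasses import dataclass

FAILURE_ANOMALY_RATIO = 0.99  # threshold stated in the README


@dataclass
class WindowResult:
    """Outcome of one drift window (hypothetical container)."""
    should_retrain: bool
    anomaly_ratio: float

    @property
    def possible_failure(self) -> bool:
        # Strictly "more than 0.99", as described above.
        return self.anomaly_ratio > FAILURE_ANOMALY_RATIO


def monthly_retrain_allowed(windows):
    """Decide once per simulated month whether the model may retrain."""
    if any(w.possible_failure for w in windows):
        return False  # failure-like data should not be learned as normal
    return any(w.should_retrain for w in windows)


def rows_for_retraining(rows, predictions):
    """Keep only rows the current model labelled normal
    (1 in scikit-learn's IsolationForest convention)."""
    return [row for row, label in zip(rows, predictions) if label == 1]


month = [WindowResult(False, 0.02), WindowResult(True, 0.10)]
print(monthly_retrain_allowed(month))  # prints True
month.append(WindowResult(True, 0.995))
print(monthly_retrain_allowed(month))  # prints False
```

Checking for possible_failure first reflects the rule that a single failure-like window blocks retraining for the whole month, regardless of how many windows asked for it.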

