MohamedASAK/Heart-Disease-Prediction-ML-Pipeline

Heart Disease Prediction ML Pipeline

This project implements an end-to-end machine learning pipeline for heart disease prediction on the UCI Heart Disease dataset. It covers data preprocessing, feature selection, model training, and evaluation, and includes a Streamlit web app for real-time predictions.


Directory Structure

Heart Disease Prediction ML Pipeline/
│
├── ui/
│   └── src/
│       └── app.py                # Streamlit web app
│
├── models/
│   ├── final_model.pkl           # Trained ML model (Random Forest)
│   └── scaler.pkl                # Scaler used for preprocessing
│
├── deployment/
│   └── ngrok_setup.txt           # Instructions for exposing app via ngrok
│
├── results/
│   └── evaluation_metrics.txt    # Model performance metrics
│
├── Notebook.ipynb                # Main Jupyter notebook (ML pipeline)
├── requirements.txt              # Python dependencies
├── .gitignore                    # Files/folders to ignore in git
└── README.md                     # Project documentation

Getting Started

1. Clone the Repository

git clone <repo-url>
cd Heart-Disease-Prediction-ML-Pipeline

2. Install Dependencies

pip install -r requirements.txt

3. Run the Streamlit App

streamlit run ui/src/app.py

By default, Streamlit serves the app at http://localhost:8501 and opens it in your browser.


Streamlit App

  • Location: ui/src/app.py
  • Features:
    • Enter patient details to predict heart disease risk.
    • Uses the trained Random Forest model and scaler.
    • Displays prediction and probability.

Model Training & Evaluation

  • Notebook: Notebook.ipynb

  • Pipeline Steps:

    • Data loading and preprocessing
    • Feature selection (RFE, Chi-Square)
    • Model training (Logistic Regression, Decision Tree, Random Forest, SVM)
    • Evaluation (Accuracy, Precision, Recall, F1, ROC AUC)
    • Hyperparameter tuning
    • Saving best model and scaler
  • Results:
    See results/evaluation_metrics.txt for detailed metrics.
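
For readers unfamiliar with the evaluation metrics listed above, the sketch below computes them for a binary classifier in plain Python. It is illustrative only: the notebook most likely uses library implementations, and the function name `binary_metrics` and its arguments are hypothetical.

```python
from typing import Dict, Sequence


def binary_metrics(
    y_true: Sequence[int], y_pred: Sequence[int], y_score: Sequence[float]
) -> Dict[str, float]:
    """Accuracy, precision, recall, F1, and ROC AUC for labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )

    # ROC AUC as the probability that a randomly chosen positive example
    # receives a higher score than a randomly chosen negative one
    # (ties count as half a win).
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    auc = wins / (len(pos) * len(neg)) if pos and neg else float("nan")

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "roc_auc": auc,
    }
```

Here `y_pred` holds hard class predictions and `y_score` holds the predicted probability of the positive class. The pairwise AUC computation is quadratic in the number of examples, which is fine for a dataset of this size but not for large ones.
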


Deployment

To expose the Streamlit app for external access with ngrok, follow the instructions in deployment/ngrok_setup.txt.


Requirements

See requirements.txt for all dependencies.
