Binary classification: Heart Disease
- Technologies used: Python, scikit-learn, NumPy, pandas, Matplotlib, Seaborn, XGBoost
- GitHub URL: Project Link
In this project, a binary classification model is trained on the Heart Disease dataset to
predict the likelihood of heart disease from a set of key indicators.
Four models were trained: a basic logistic regression model, a decision tree, a random forest, and a
gradient boosting model. Of the four, the gradient boosting model (built with the XGBoost library)
performed best. The trained model is provided in the file
xgb_model_eta=0.1_max_depth=6_min_child_weight=1.bin, which can be loaded with pickle.
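Loading the saved file might look like the minimal sketch below. The helper name `load_model` is hypothetical and does not come from the project. The sketch also makes no assumption about what was pickled (a bare model, or a model bundled with a preprocessing object), so it returns whatever the file contains.

```python
import pickle


def load_model(path):
    """Return whatever object was pickled into `path`.

    `load_model` is an illustrative helper, not part of the project.
    """
    with open(path, "rb") as f_in:
        return pickle.load(f_in)


# Usage (assumes the file sits in the current working directory):
# model = load_model("xgb_model_eta=0.1_max_depth=6_min_child_weight=1.bin")
```

Unpickling an XGBoost model requires the xgboost package to be installed in the environment doing the loading, ideally at the version used for training. Since pickle can execute arbitrary code while loading, only files from a trusted source should be opened this way.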
The exploratory data analysis and model selection were done in a Jupyter notebook,
notebook.ipynb.
The model training code was then exported to the script train.py.
A Flask app was created in predict.py, which can be deployed with any WSGI server. This project has
been developed and tested with Gunicorn.
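As a rough illustration of how such a prediction service can be structured, here is a minimal Flask sketch. It is not the contents of predict.py: the `/predict` route, the `create_app` and `dummy_predict` names, the response fields, the 0.5 decision threshold, and the `age` feature are all assumptions made for the example, and a placeholder scoring function stands in for the trained model.

```python
from flask import Flask, jsonify, request


def create_app(predict_proba):
    """Build a Flask app around a callable mapping a feature dict to a probability.

    The route name and response fields are illustrative assumptions.
    """
    app = Flask("heart-disease")

    @app.route("/predict", methods=["POST"])
    def predict():
        patient = request.get_json()
        probability = float(predict_proba(patient))
        return jsonify({
            "heart_disease_probability": probability,
            # 0.5 is an assumed decision threshold, not taken from the project
            "heart_disease": probability >= 0.5,
        })

    return app


def dummy_predict(patient):
    """Placeholder scorer standing in for the trained model (hypothetical rule)."""
    return 0.8 if patient.get("age", 0) > 60 else 0.2


# Module-level WSGI application object that a server such as Gunicorn can import.
app = create_app(dummy_predict)
```

If predict.py exposes its Flask object as a module-level variable named `app` (an assumption here), Gunicorn can serve it with the command `gunicorn predict:app`.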