Diabetes Insight

Diabetes Insight is a demonstration project that generates a well-structured PDF report based on user input.
A pretrained machine learning model predicts whether a user is diabetic, or estimates their diabetes risk score. SHAP values are calculated and, together with the prediction, passed to gpt-5-mini, which produces a user-friendly, medically-styled explanation. This response is then formatted into a downloadable PDF report.

This is a practice project that explores how traditional machine learning, model explainability, and LLM engineering can work together in a healthcare-style application.
It is not intended for real clinical use.

Technologies Used

Frontend: Dash
Backend: FastAPI
Machine Learning: Custom-trained tree-based models (CatBoost + variants)
LLM Integration: gpt-5-mini with prompt engineering
PDF Generation: Python libraries for layout and export
Containerization: Docker

Features

Generate a diabetes diagnosis report based on user input
Generate a diabetes risk score report
Automatic SHAP explainability
Automatic LLM-generated narrative for users
Cleanly formatted PDF download
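
The step that hands the prediction and SHAP values to the LLM could look roughly like the sketch below. It is not the repository's code: the function name `build_explanation_prompt`, the feature names, and the prompt wording are all hypothetical, and the call to the model itself is left out.

```python
from typing import Mapping


def build_explanation_prompt(
    prediction: str,
    probability: float,
    shap_values: Mapping[str, float],
    top_k: int = 3,
) -> str:
    """Turn a prediction and per-feature SHAP values into an LLM prompt."""
    # Ignore features with no contribution, then rank by absolute impact.
    nonzero = [(name, value) for name, value in shap_values.items() if value != 0]
    ranked = sorted(nonzero, key=lambda kv: abs(kv[1]), reverse=True)
    lines = []
    for name, value in ranked[:top_k]:
        direction = "raises" if value > 0 else "lowers"
        lines.append(f"- {name}: {direction} the predicted risk (SHAP {value:+.3f})")
    return (
        f"The model predicted '{prediction}' with probability {probability:.2f}.\n"
        "The most influential features were:\n"
        + "\n".join(lines)
        + "\nExplain this result to the user in plain, non-alarming language."
    )


# Hypothetical feature names and values, for illustration only.
prompt = build_explanation_prompt(
    prediction="diabetic",
    probability=0.82,
    shap_values={"hba1c": 0.41, "bmi": 0.12, "age": -0.05, "physical_activity": -0.20},
)
print(prompt)
```

Passing only the top few features keeps the prompt short and steers the narrative toward the factors that actually drove the prediction.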

Model Training

The training pipeline includes data cleaning, exploratory analysis, feature processing, model training, and SHAP explainability.
The following models were evaluated:

KNN
Logistic Regression
Decision Tree
Voting Classifier
Random Forest
Gradient Boosting
AdaBoost
Extra Trees
XGBoost
LightGBM
CatBoost

CatBoost showed slightly better performance and was selected for both final models.
Other tree-based methods performed similarly.

Note: Some training notebooks take considerable time to run, even on modern machines (late 2025).

Project Structure

Backend (`backend/`)

models/ — Serialized CatBoost classification models
routers/reports.py — Two API endpoints for generating reports
limiter.py — Simple rate limiter to prevent abuse
main.py — FastAPI initialization + CORS middleware
models.py — Pydantic request validation schemas
utils.py — Utility functions used in the endpoints
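
The contents of limiter.py are not shown here, so the following is only a sketch of what a simple rate limiter can look like, using a sliding window per client. The class name `SlidingWindowLimiter`, its parameters, and the client identifier are assumptions made for illustration.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` per client within any `window_seconds` span."""

    def __init__(self, max_calls: int, window_seconds: float, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._calls = defaultdict(deque)  # client id -> timestamps of recent calls

    def allow(self, client_id: str) -> bool:
        now = self._clock()
        recent = self._calls[client_id]
        # Drop timestamps that have fallen out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_calls:
            return False
        recent.append(now)
        return True


# Demonstration with a fake clock: two calls pass, the third is rejected,
# and a call after the window has elapsed passes again.
fake_now = [0.0]
limiter = SlidingWindowLimiter(max_calls=2, window_seconds=60, clock=lambda: fake_now[0])
results = [limiter.allow("1.2.3.4") for _ in range(3)]
fake_now[0] = 61.0
results.append(limiter.allow("1.2.3.4"))
print(results)  # [True, True, False, True]
```

In an endpoint, a rejected call would typically be answered with HTTP status 429 (Too Many Requests).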

Frontend (`frontend/`)

frontend.ipynb — Dash application (UI built with Dash Bootstrap Components & Templates)

Training (`training/`)

Jupyter notebooks with EDA, preprocessing, and model training
models/ — Final models (mirrors backend/models)
parquet files — Processed datasets
diabetes_dataset.csv — Original Kaggle dataset
utils.py — Common helper functions
Final_training.ipynb — Re-training and SHAP generation workflow

Environment Variables

Include a .env file with:

OPENAI_API_KEY=your_api_key
API_URL=http://frontend:8050
BACKEND_URL=http://backend:8000/reports
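
One way to read these variables at startup is sketched below. It assumes the values from .env end up in the process environment (for example via Docker Compose); the `Settings` class and `load_settings` function are hypothetical names, and using the example values above as fallbacks is a design choice of this sketch, not something the project is known to do.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    api_url: str
    backend_url: str


def load_settings(env=os.environ) -> Settings:
    """Read the three variables, failing early if the API key is missing."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file")
    return Settings(
        openai_api_key=key,
        # Fallbacks mirror the example values listed above (an assumption).
        api_url=env.get("API_URL", "http://frontend:8050"),
        backend_url=env.get("BACKEND_URL", "http://backend:8000/reports").rstrip("/"),
    )


# A plain dict stands in for the real environment in this demonstration.
settings = load_settings({"OPENAI_API_KEY": "your_api_key"})
print(settings.backend_url)  # http://backend:8000/reports
```

Failing at startup when the key is missing gives a clearer error than a failed LLM request later on.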

Notes

If you change API_URL, update this line in frontend.py to match (updating frontend.ipynb as well keeps the two consistent, but is not required):

if __name__ == '__main__':
    app.run(host="0.0.0.0", port=8050)
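
To avoid keeping the port in two places, it could instead be derived from API_URL. This is only a sketch: it assumes the frontend process can read API_URL, and `port_from_url` is a hypothetical helper, not part of the project.

```python
from urllib.parse import urlsplit


def port_from_url(url: str, default: int = 8050) -> int:
    """Return the explicit port in `url`, or `default` if none is given."""
    port = urlsplit(url).port
    return port if port is not None else default


print(port_from_url("http://frontend:8050"))  # 8050
print(port_from_url("http://frontend:9000"))  # 9000
```

The result could then be passed as the `port` argument of `app.run`, while the host stays "0.0.0.0" so the server remains reachable from outside the container.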

BACKEND_URL currently includes the /reports prefix because that is the only router the frontend calls in the current version of the app. If you expand the app, remove the prefix from BACKEND_URL and update the relevant parts of the frontend code, namely these two calls:

response = requests.post(f"{BACKEND_URL}/diagnosis", json=input_data)

response = requests.post(f"{BACKEND_URL}/risk_score", json=input_data)
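
If the prefix is moved out of BACKEND_URL, the endpoint URLs could be assembled with a small helper such as the one below. The helper `endpoint_url` and the prefix-free base URL are assumptions of this sketch.

```python
def endpoint_url(base_url: str, *parts: str) -> str:
    """Join a base URL and path segments with exactly one slash between them."""
    segments = [base_url.rstrip("/")]
    # Skip empty segments so stray slashes never produce "//" in the path.
    segments += [p.strip("/") for p in parts if p.strip("/")]
    return "/".join(segments)


BACKEND_URL = "http://backend:8000"  # assumed value once the /reports prefix is removed

print(endpoint_url(BACKEND_URL, "reports", "diagnosis"))   # http://backend:8000/reports/diagnosis
print(endpoint_url(BACKEND_URL, "reports", "risk_score"))  # http://backend:8000/reports/risk_score
```

The two `requests.post` calls would then take these URLs in place of the f-strings, and further routers could reuse the same base.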

Installation

Requirements

Docker
Python 3.13 (optional)

Notes

This app uses some Python packages, such as weasyprint, that tend to behave differently on different machines. Running it with Docker is recommended to avoid potential issues.
Some of the packages used during training, such as cupy, depend heavily on both the OS and the hardware. For example, cupy only works on Nvidia GPUs with CUDA support. For this reason, there is no requirements.txt listing the packages used in training, but the notebooks are still available to read.

1. Clone the Repository

git clone https://github.com/Sebastijan-Dominis/diabetes-insight
cd diabetes-insight

2. Build the Docker image

docker compose build --no-cache

3. Run Docker

docker compose up

4. Use the app

If you used the default ports, the frontend is now available at localhost:8050 and the backend's interactive API docs at localhost:8000/docs.

Screenshots

Below are examples of how the app looks and what the generated reports contain.

App use

Reports

Notes

SHAP values are computed on the backend for inference-time explainability.
The included dataset is synthetic and the project is for learning/demonstration only.
The frontend is intentionally simplified; in production, a React or Vue SPA would be preferable.

License

This repository includes a LICENSE file — please review it for terms of reuse.

Contributing

Improvements and bug fixes are welcome. Open an issue or submit a pull request with a clear description of the change.

Author / Contact

Author: Sebastijan Dominis
Contact: sebastijan.dominis99@gmail.com
