AnnaToi01/machine_learning_for_healthcare

Machine Learning for Healthcare

This repository contains the coursework for Machine Learning for Healthcare, taught at ETH Zurich in the Spring Semester 2025.

Group members: Anna Toidze, Gonzalo Cardenal Antolin, Tae Kim

Project 1 — ICU Time-Series Mortality Prediction

Project 1 centers on modeling ICU patient trajectories using the PhysioNet 2012 Challenge dataset. The goal is to predict in-hospital mortality based on the first 48 hours of multivariate time-series data. The project is structured as follows:

  • Data preparation & exploration: converting irregular clinical measurements into a consistent temporal representation, handling missingness, and examining variable distributions.
  • Supervised modeling: training a range of models including classic ML approaches, LSTMs, bidirectional RNNs, and Transformer-based architectures.
  • Representation learning: experimenting with self-supervised techniques, contrastive objectives, and evaluating embeddings via linear probes and visualization tools.
  • Foundation models: applying both small LLMs (via text-based summaries) and time-series foundation models such as Chronos for downstream prediction and embedding generation.
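As an illustration of the data preparation step above, the following is a minimal sketch of one common way to turn irregularly timed measurements of a single variable into a fixed 48-hour grid: average within each hour, forward-fill gaps, and keep an observation mask. The function name `to_hourly_grid`, the hourly resolution, and the forward-fill strategy are assumptions chosen for illustration; the repository's own preprocessing may differ.

```python
import numpy as np

def to_hourly_grid(times_h, values, n_hours=48):
    """Bin one variable's irregular measurements into hourly means, then forward-fill.

    Returns (filled, observed): filled[h] is the mean of the measurements in
    hour h, carried forward over gaps (NaN before the first measurement);
    observed[h] is 1 where hour h had at least one real measurement.
    Hypothetical helper, not taken from the repository.
    """
    times_h = np.asarray(times_h, dtype=float)
    values = np.asarray(values, dtype=float)
    # Clip so that a measurement at exactly t = n_hours lands in the last bin.
    hours = np.clip(np.floor(times_h).astype(int), 0, n_hours - 1)

    sums = np.bincount(hours, weights=values, minlength=n_hours)
    counts = np.bincount(hours, minlength=n_hours)
    observed = (counts > 0).astype(int)

    grid = np.full(n_hours, np.nan)
    grid[counts > 0] = sums[counts > 0] / counts[counts > 0]

    # Forward-fill: each hour takes the value of the most recent observed hour.
    last_seen = np.where(counts > 0, np.arange(n_hours), -1)
    last_seen = np.maximum.accumulate(last_seen)
    filled = np.where(last_seen >= 0, grid[np.maximum(last_seen, 0)], np.nan)
    return filled, observed
```

Returning the mask alongside the filled values lets a downstream model distinguish a real measurement from a carried-forward one, which matters when missingness itself is clinically informative.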

The project concludes with a summary of model behavior, the trade-offs between methods, and insights gained from working with real ICU data.

Project 2 — Explainability & Interpretability in Medical ML

Project 2 examines interpretability techniques for both structured and imaging data in clinical settings. It consists of three main components:

  • Tabular data analysis: using the Heart Failure Prediction dataset to study logistic models with L1 regularization, MLPs paired with SHAP explanations, and Neural Additive Models (NAMs) for inherently interpretable nonlinear modeling.
  • Medical imaging classification: training a CNN on the Chest X-Ray Pneumonia dataset and applying post-hoc attribution methods such as Integrated Gradients and Grad-CAM to understand spatial decision patterns.
  • Synthesis & evaluation: comparing explanation methods, assessing their reliability (including via sanity checks), and discussing how well they align with clinical intuition and practical deployment considerations.
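To make the attribution idea concrete, here is a minimal sketch of Integrated Gradients on a toy logistic model rather than a CNN. The model, the weights, and the helper name `integrated_gradients` are assumptions for illustration only and are not taken from the repository. The sketch approximates the path integral from a baseline to the input with a midpoint Riemann sum, using the closed-form gradient of the sigmoid output.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def integrated_gradients(x, baseline, w, b, n_steps=200):
    """Integrated Gradients for the toy model F(x) = sigmoid(w . x + b).

    Hypothetical helper: attributions are (x - baseline) times the average
    gradient of F along the straight line from baseline to x.
    """
    x = np.asarray(x, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    w = np.asarray(w, dtype=float)

    # Midpoints of n_steps equal sub-intervals of [0, 1].
    alphas = (np.arange(n_steps) + 0.5) / n_steps
    path = baseline + alphas[:, None] * (x - baseline)   # (n_steps, d)

    # Analytic gradient of the sigmoid output: dF/dx = F * (1 - F) * w.
    p = sigmoid(path @ w + b)                            # (n_steps,)
    grads = (p * (1.0 - p))[:, None] * w                 # (n_steps, d)

    return (x - baseline) * grads.mean(axis=0)

w = np.array([1.0, -2.0, 0.5])
b = 0.1
x = np.array([1.0, 0.5, 2.0])
baseline = np.zeros(3)

attr = integrated_gradients(x, baseline, w, b)
# Completeness check: attributions should sum to F(x) - F(baseline).
gap = attr.sum() - (sigmoid(x @ w + b) - sigmoid(baseline @ w + b))
```

The completeness check (`gap` close to zero) is a quick way to confirm the integral is well approximated. It is a property of the method itself and is separate from the model- and data-randomization sanity checks used to assess the reliability of explanations.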
