This repository contains the research work from my summer internship at the LiGHT lab (EPFL), focused on scaling the MultiMeditron training pipeline to 70B-parameter models. MultiMeditron is a multimodal medical language model based on the LLaVA architecture. This project explores optimization techniques for efficiently training large-scale versions using distributed computing and DeepSpeed.
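To give a sense of the DeepSpeed setup involved, the snippet below builds a ZeRO stage-3 configuration of the kind typically used for 70B-scale training. All values here are illustrative placeholders, not the settings actually used in this project; see the report and `scripts/` for the real configuration.

```python
# Sketch of a DeepSpeed ZeRO-3 config for large-model training.
# All values are hypothetical and for illustration only.
import json

ds_config = {
    # ZeRO stage 3 partitions parameters, gradients, and optimizer state
    # across data-parallel ranks, which is what makes 70B training feasible.
    "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {"device": "cpu"},  # optional CPU offload
        "overlap_comm": True,                    # overlap comms with compute
    },
    "bf16": {"enabled": True},                   # mixed precision on recent GPUs
    "gradient_accumulation_steps": 8,
    "train_micro_batch_size_per_gpu": 1,
}

# Such a dict would normally be written to a JSON file and passed to
# `deepspeed --deepspeed_config ...` or `deepspeed.initialize(config=...)`.
print(json.dumps(ds_config, indent=2))
```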
- `MultiMeditron_Simon_Lefort_Summer25.typ` - Complete research report (Typst format)
- `MultiMeditron_Simon_Lefort_Summer25.pdf` - Generated PDF report
- `scripts/` - Training scripts for the RCP and CSCS clusters
- `assets/` - Supporting images and diagrams
Simon Lefort, supervised by Michael Zhang and Annie Hartley at the LiGHT lab, EPFL.