FedMRG: Federated Medical Report Generation via Text-Aware Learning Rate Adjustment and Multi-Level Prototype Collaboration

Hichem Metmer1, Xiaoshan Yang2
1,2 State Key Laboratory of Multimodal Artificial Intelligence Systems (MAIS),
Institute of Automation, Chinese Academy of Sciences (CASIA), Beijing, 100190, China,
School of Artificial Intelligence, University of Chinese Academy of Sciences (UCAS), Beijing, 101408, China.

Abstract

Medical report generation (MRG), which aims to automatically generate textual descriptions of medical images (e.g., chest X-rays), has attracted considerable research interest as a means of reducing the radiology reporting workload. However, existing MRG methods rely heavily on large-scale datasets, which raises serious privacy concerns. In this paper, we introduce FedMRG, a Federated Medical Report Generation task that enables collaborative learning across multiple hospitals while preserving privacy. FedMRG addresses two key challenges: (1) text richness imbalance and (2) feature contribution diversity. To tackle these challenges, we propose a novel two-step framework: (1) federated cross-modal pre-training and (2) fine-tuning with limited annotations. To address text richness imbalance, we introduce the Text-Aware Learning Rate Adjustment (TALRA) module, which ensures balanced participation from clients with varying levels of textual data richness. To tackle feature contribution diversity, we propose the Multi-Level Prototype Collaboration (MLPC) mechanism, which efficiently shares and integrates multi-level prototypes across clients with different data modalities. Extensive experiments on four benchmark datasets demonstrate the effectiveness of the proposed method for MRG in a decentralized yet collaborative learning environment.
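The abstract describes TALRA only at a high level and does not give its formula. As a purely illustrative sketch of the general idea (a per-client learning rate that depends on how much text a client holds), the snippet below scales a base learning rate by each client's richness relative to the mean and clamps the result. The function name, the richness score, the direction of the scaling, and the clamp bounds are all assumptions made for illustration. They are not taken from the paper and may differ from the actual TALRA rule.

```python
def text_aware_learning_rates(base_lr, text_richness, min_scale=0.5, max_scale=2.0):
    """Return one learning rate per client, scaled by relative text richness.

    text_richness maps a client id to a non-negative richness score.
    The score is hypothetical (e.g. the fraction of a client's images
    that come with a paired report); the paper may define it differently.
    """
    if not text_richness:
        return {}

    mean_richness = sum(text_richness.values()) / len(text_richness)

    rates = {}
    for client, richness in text_richness.items():
        # Richness relative to the average client; fall back to 1.0
        # when no client has any text at all.
        scale = richness / mean_richness if mean_richness > 0 else 1.0
        # Clamp (assumed bounds) so no client is silenced or dominates.
        scale = max(min_scale, min(max_scale, scale))
        rates[client] = base_lr * scale
    return rates


# Hypothetical clients: fully paired, half paired, and image-only.
rates = text_aware_learning_rates(
    1e-4, {"hospital_a": 1.0, "hospital_b": 0.5, "hospital_c": 0.0}
)
for client, lr in rates.items():
    print(client, lr)
```

Clamping is one simple way to keep an image-only client in the federation with a reduced step size instead of dropping it; other choices, including the opposite scaling direction, are equally plausible under the abstract's description.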

Citation

If you find our paper useful for your research, please cite it:

@article{FedMRG,
  title     = {FedMRG: Federated Medical Report Generation via Text-Aware Learning Rate Adjustment and Multi-Level Prototype Collaboration},
  author    = {Hichem Metmer and Xiaoshan Yang},
  journal   = {Multimedia Systems},
  volume    = {31},
  pages     = {170},
  year      = {2025},
  publisher = {Springer},
  doi       = {10.1007/s00530-025-01725-5}
}