Contrastive Learning of Medical Visual Representations from Paired Images and Text
Philip Müller, Georgios Kaissis, Congyu Zou, and Daniel Rückert. Joint learning of localized
representations from medical images and reports. arXiv preprint arXiv:2112.02889, 2021.
Aaron van den Oord, Yazhe Li, and Oriol Vinyals. Representation learning with contrastive
predictive coding. arXiv preprint arXiv:1807.03748, 2018.
Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini
Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning
transferable visual models from natural language supervision. In International Conference
on Machine Learning, 2021.
Maithra Raghu, Chiyuan Zhang, Jon Kleinberg, and Samy Bengio. Transfusion: Understanding
transfer learning for medical imaging. In Advances in Neural Information Processing
Systems, 2019.
Pranav Rajpurkar, Jeremy Irvin, Aarti Bagul, Daisy Ding, Tony Duan, Hershel Mehta,
Brandon Yang, Kaylie Zhu, Dillon Laird, Robyn L Ball, et al. MURA: Large dataset
for abnormality detection in musculoskeletal radiographs. In 1st Conference on Medical
Imaging with Deep Learning (MIDL), 2018a.
Pranav Rajpurkar, Jeremy Irvin, Robyn L Ball, Kaylie Zhu, Brandon Yang, Hershel Mehta,
Tony Duan, Daisy Ding, Aarti Bagul, Curtis P Langlotz, et al. Deep learning for chest
radiograph diagnosis: A retrospective comparison of the CheXNeXt algorithm to practicing
radiologists. PLoS Medicine, 15(11):e1002686, 2018b.
Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma,
Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, et al. ImageNet large
scale visual recognition challenge. International Journal of Computer Vision, 115(3):
211–252, 2015.
Mert Bulent Sariyildiz, Julien Perez, and Diane Larlus. Learning visual representations
with caption annotations. In Proceedings of the 16th European Conference on Computer
Vision (ECCV), 2020.
George Shih, Carol C Wu, Safwan S Halabi, Marc D Kohli, Luciano M Prevedello, Tessa S
Cook, Arjun Sharma, Judith K Amorosa, Veronica Arteaga, Maya Galperin-Aizenberg,
et al. Augmenting the National Institutes of Health chest radiograph dataset with expert
annotations of possible pneumonia. Radiology: Artificial Intelligence, 1(1):e180041, 2019.
Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman. Deep inside convolutional
networks: Visualising image classification models and saliency maps. In ICLR Workshop,
2014.
Hari Sowrirajan, Jingbo Yang, Andrew Y Ng, and Pranav Rajpurkar. MoCo pretraining
improves representation and transferability of chest X-ray models. In Medical Imaging
with Deep Learning, pages 728–744. PMLR, 2021.
Weijie Su, Xizhou Zhu, Yue Cao, Bin Li, Lewei Lu, Furu Wei, and Jifeng Dai. VL-BERT:
Pre-training of generic visual-linguistic representations. In International Conference on
Learning Representations (ICLR), 2020.