2024
Identifying biases in a multicenter MRI database for Parkinson’s disease classification: Is the disease classifier a secret site classifier?
Authors:
Souza, R., Winder, A., Stanley, E. A. M., Vigneshwaran, V., Camacho, M., Camicioli, R., Monchi, O., Wilms, M., & Forkert, N. D.
Journal:
IEEE Journal of Biomedical and Health Informatics
Abstract
Sharing multicenter imaging datasets can be advantageous to increase data diversity and size but may lead to spurious correlations between site-related biological and non-biological image features and target labels, which machine learning (ML) models may exploit as shortcuts. To date, studies analyzing whether and how deep learning models use such effects as shortcuts are scarce. Thus, the aim of this work was to investigate whether site-related effects are encoded in the feature space of an established deep learning model designed for Parkinson’s disease (PD) classification based on T1-weighted MRI datasets. To this end, all layers of the PD classifier were frozen, except for the last layer of the network, which was replaced by a linear layer that was exclusively re-trained to predict three potential bias types (biological sex, scanner type, and originating site). Our findings based on a large database consisting of 1880 MRI scans collected across 41 centers show that the feature space of the established PD model (74% accuracy) can be used to classify sex (75% accuracy), scanner type (79% accuracy), and site location (71% accuracy) with high accuracies despite this information never being explicitly provided to the PD model during original training. Overall, the results of this study suggest that trained image-based classifiers may use unwanted shortcuts that are not meaningful for the actual clinical task at hand. This finding may explain why many image-based deep learning models do not perform well when applied to data from centers not contributing to the training set.
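The probing procedure described in the abstract (freeze the trained network, replace the last layer with a linear layer, and re-train only that layer to predict a bias attribute) can be illustrated with a small sketch. This is not the authors' implementation. The "backbone" below is a fixed random projection that merely stands in for the frozen PD classifier, the "scans" are synthetic vectors with an assumed site-dependent offset, and every function name is hypothetical. Pure Python is used so the sketch runs without a deep learning framework.

```python
import math
import random

def make_frozen_backbone(in_dim, feat_dim, seed=0):
    """Hypothetical stand-in for a trained classifier with its last layer removed."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0, 1) for _ in range(in_dim)] for _ in range(feat_dim)]
    def backbone(x):
        # Fixed linear map + ReLU; these weights are never updated (frozen).
        return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]
    return backbone, weights

def make_synthetic_scans(n, in_dim, n_sites, seed=1):
    """Toy inputs whose first dimension carries a site-dependent offset."""
    rng = random.Random(seed)
    X, sites = [], []
    for _ in range(n):
        site = rng.randrange(n_sites)
        x = [rng.gauss(0, 1) for _ in range(in_dim)]
        x[0] += 4.0 * site  # assumed site effect, e.g. a scanner intensity shift
        X.append(x)
        sites.append(site)
    return X, sites

def standardize(train, other):
    """Center and scale features using training-set statistics only."""
    dim = len(train[0])
    mean = [sum(f[j] for f in train) / len(train) for j in range(dim)]
    std = [math.sqrt(sum((f[j] - mean[j]) ** 2 for f in train) / len(train)) or 1.0
           for j in range(dim)]
    def scale(F):
        return [[(f[j] - mean[j]) / std[j] for j in range(dim)] for f in F]
    return scale(train), scale(other)

def scores(W, b, f):
    """Linear layer output: one score per class."""
    return [sum(w * v for w, v in zip(W[c], f)) + b[c] for c in range(len(W))]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

def train_linear_probe(features, labels, n_classes, epochs=200, lr=0.1):
    """Fit only a linear softmax layer by full-batch gradient descent."""
    dim, n = len(features[0]), len(features)
    W = [[0.0] * dim for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        gW = [[0.0] * dim for _ in range(n_classes)]
        gb = [0.0] * n_classes
        for f, y in zip(features, labels):
            p = softmax(scores(W, b, f))
            for c in range(n_classes):
                err = p[c] - (1.0 if c == y else 0.0)  # cross-entropy gradient
                gb[c] += err
                for j in range(dim):
                    gW[c][j] += err * f[j]
        for c in range(n_classes):
            b[c] -= lr * gb[c] / n
            for j in range(dim):
                W[c][j] -= lr * gW[c][j] / n
    return W, b

def probe_accuracy(backbone, X, labels, n_classes, n_train):
    """Train the probe on the first n_train samples, score it on the rest."""
    feats = [backbone(x) for x in X]
    train, test = standardize(feats[:n_train], feats[n_train:])
    W, b = train_linear_probe(train, labels[:n_train], n_classes)
    preds = [max(range(n_classes), key=lambda c: scores(W, b, f)[c]) for f in test]
    return sum(p == y for p, y in zip(preds, labels[n_train:])) / len(preds)

if __name__ == "__main__":
    backbone, _ = make_frozen_backbone(in_dim=5, feat_dim=8)
    X, sites = make_synthetic_scans(n=300, in_dim=5, n_sites=2)
    print("site accuracy from frozen features:",
          probe_accuracy(backbone, X, sites, n_classes=2, n_train=200))
```

If the held-out probe accuracy is clearly above chance, the frozen features carry information about the probed attribute even though the backbone was never trained to predict it. In the study, that attribute was sex, scanner type, or site, and the backbone was the actual PD classifier rather than a toy projection.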