Application of machine learning (ML) algorithms to structural magnetic resonance imaging (sMRI) data has yielded behaviorally meaningful estimates of the biological age of the brain (brain-age). The choice of ML approach for estimating brain-age in youth is important because age-related brain changes in this age group are dynamic. However, the comparative performance of the available ML algorithms has not been systematically appraised. To address this gap, the present study evaluated the accuracy (mean absolute error [MAE]) and computational efficiency of 21 ML algorithms using sMRI data from 2105 typically developing individuals aged 5–22 years from five cohorts. The trained models were then tested in two independent holdout datasets, one comprising 4078 individuals aged 9–10 years and the other comprising 594 individuals aged 5–21 years. The algorithms encompassed parametric and nonparametric, Bayesian, linear and nonlinear, tree-based, and kernel-based models. Sensitivity analyses were performed for parcellation scheme, number of neuroimaging input features, number of cross-validation folds, number of extreme outliers, and sample size. Tree-based models and algorithms with a nonlinear kernel performed comparably well, with the latter being especially computationally efficient. Extreme Gradient Boosting (MAE of 1.49 years), Random Forest Regression (MAE of 1.58 years), and Support Vector Regression (SVR) with Radial Basis Function (RBF) Kernel (MAE of 1.64 years) emerged as the three most accurate models. Linear algorithms, with the exception of Elastic Net Regression, performed poorly. The findings of the present study could be used as a guide for optimizing methodology when quantifying brain-age in youth.
- brain age
- machine learning
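
The accuracy metric named in the abstract, MAE under k-fold cross-validation, can be illustrated with a minimal sketch. This is not the study's pipeline: the data are synthetic, the single input feature (`features`, loosely standing in for one sMRI-derived measure) is hypothetical, and the two toy models (a mean-age baseline and a one-feature least-squares fit) are stand-ins chosen only so the example runs with the Python standard library. All function names are illustrative assumptions.

```python
import random
from statistics import mean


def mean_absolute_error(true_ages, predicted_ages):
    """Average absolute gap, in years, between chronological and predicted age."""
    if len(true_ages) != len(predicted_ages) or not true_ages:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(true_ages, predicted_ages)) / len(true_ages)


def k_fold_splits(n_samples, n_folds, seed=0):
    """Yield (train_indices, test_indices) pairs for shuffled k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validated_mae(fit, features, ages, n_folds=5):
    """Mean of per-fold MAEs for a model given as fit(train_x, train_y) -> predict(x)."""
    fold_maes = []
    for train, test in k_fold_splits(len(ages), n_folds):
        predict = fit([features[i] for i in train], [ages[i] for i in train])
        predictions = [predict(features[i]) for i in test]
        fold_maes.append(mean_absolute_error([ages[i] for i in test], predictions))
    return mean(fold_maes)


def fit_mean_baseline(train_x, train_y):
    """Toy model: always predict the mean training age, ignoring the feature."""
    average_age = mean(train_y)
    return lambda x: average_age


def fit_simple_linear(train_x, train_y):
    """Toy model: ordinary least squares of age on a single feature."""
    mx, my = mean(train_x), mean(train_y)
    sxx = sum((x - mx) ** 2 for x in train_x)
    slope = sum((x - mx) * (y - my) for x, y in zip(train_x, train_y)) / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


# Synthetic data (an assumption for illustration, not drawn from the study):
# ages spread over 5-22 years and one feature that declines linearly with age.
rng = random.Random(42)
ages = [rng.uniform(5.0, 22.0) for _ in range(200)]
features = [3.2 - 0.03 * age + rng.gauss(0.0, 0.05) for age in ages]

for name, fit in [("mean baseline", fit_mean_baseline), ("simple linear", fit_simple_linear)]:
    print(f"{name}: cross-validated MAE = {cross_validated_mae(fit, features, ages):.2f} years")
```

A lower cross-validated MAE indicates predictions closer to chronological age. Comparing many algorithms, as the study does, amounts to passing a different `fit` function through the same evaluation loop while holding the folds fixed.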