Yale University

11/01/2024 | Press release

A step toward clinically useful brain-behavior machine learning models

Relating brain activity to behavior is an ongoing aim of neuroimaging research: it would help scientists understand how the brain begets behavior and could open new opportunities for personalized treatment of mental health and neurological conditions. In some cases, scientists use brain images and behavioral data to train machine learning models to predict an individual's symptoms or illness based on brain function. But these models are only useful if they can generalize across settings and populations.

In a new study, Yale researchers show that predictive models can work well on datasets quite different from the ones they were trained on. In fact, they argue that testing models in this way, on diverse data, will be essential for developing clinically useful predictive models.

"It is common for predictive models to perform well when tested on data similar to what they were trained on," said Brendan Adkinson, lead author of the study published recently in the journal Developmental Cognitive Neuroscience. "But when you test them in a dataset with different characteristics, they often fail, which makes them virtually useless for most real-world applications."

The issue lies in differences across datasets, which include variations in the age, sex, race and ethnicity, geography, and clinical symptom presentation among the individuals included in the datasets. But rather than viewing these differences as a hurdle to model development, researchers should see them as a key component, Adkinson said.

"Predictive models will only be clinically valuable if they can predict effectively on top of these dataset-specific idiosyncrasies," said Adkinson, who is an M.D.-Ph.D. candidate in the lab of senior author Dustin Scheinost, associate professor of radiology and biomedical imaging at Yale School of Medicine.

To test how well models can function across diverse datasets, the researchers trained models to predict two traits, language abilities and executive function, from three large datasets that were substantially different from each other. They trained three models, one on each dataset, and then tested each model on the other two datasets.
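For readers who want to see the shape of that train-on-one, test-on-the-others design, here is a minimal Python sketch. It is an illustration only, not the study's actual pipeline: the three datasets are synthetic, the ridge regression model and the Pearson correlation score are assumptions chosen for simplicity, and every name in it (`make_dataset`, `fit_ridge`, `cross_dataset_scores`, the labels "A", "B" and "C") is hypothetical.

```python
import numpy as np

def make_dataset(rng, n_subjects, n_features, weights, shift, noise):
    """Synthetic stand-in for one dataset: brain features X and a behavioral score y."""
    X = rng.normal(loc=shift, scale=1.0, size=(n_subjects, n_features))
    y = X @ weights + rng.normal(scale=noise, size=n_subjects)
    return X, y

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression on centered data; returns (coefficients, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    coef = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return coef, y_mean - x_mean @ coef

def cross_dataset_scores(datasets, alpha=1.0):
    """Train on each dataset, test on every other one; return {(train, test): Pearson r}."""
    scores = {}
    for train_name, (X_train, y_train) in datasets.items():
        coef, intercept = fit_ridge(X_train, y_train, alpha)
        for test_name, (X_test, y_test) in datasets.items():
            if test_name == train_name:
                continue  # score only on datasets the model never saw
            pred = X_test @ coef + intercept
            scores[(train_name, test_name)] = float(np.corrcoef(pred, y_test)[0, 1])
    return scores

# Assumption: three synthetic datasets that share one underlying brain-behavior
# relationship but differ in sample size, feature distribution and noise level.
rng = np.random.default_rng(0)
true_weights = rng.normal(size=20)
datasets = {
    "A": make_dataset(rng, 300, 20, true_weights, shift=0.0, noise=2.0),
    "B": make_dataset(rng, 200, 20, true_weights, shift=0.5, noise=3.0),
    "C": make_dataset(rng, 250, 20, true_weights, shift=-0.5, noise=4.0),
}
scores = cross_dataset_scores(datasets)
for (train_name, test_name), r in sorted(scores.items()):
    print(f"train {train_name} -> test {test_name}: r = {r:.2f}")
```

Three models scored on the two datasets each did not see gives six train-test pairs in total. In this toy setup the models transfer because the synthetic datasets were built to share the same underlying relationship; whether real neuroimaging datasets behave that way is the question the study examines.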

"We found that even though these datasets were markedly different from each other, the models still performed well by neuroimaging standards during testing," said Adkinson. "That tells us that generalizable models are achievable and testing on diverse dataset features can help."

Going forward, Adkinson is interested in exploring the idea of generalizability as it relates to a specific population.

The large-scale data collection efforts used for generating neuroimaging predictive models are based in metropolitan areas where researchers have access to more people. But building models exclusively on data collected from people living in urban and suburban areas runs the risk of creating models that don't generalize to people living in rural regions, the researchers say.

"If we get to a point where predictive models are robust enough to use in clinical assessment and treatment, but they don't generalize to specific populations, like rural residents, then those populations won't be served as well as others," said Adkinson, who comes from a rural area himself. "So we're looking at how to generalize models to rural populations."