TY - JOUR
T1 - Data-driven consideration of genetic disorders for global genomic newborn screening programs
AU - ICoNS Gene List Contributors
AU - International Consortium on Newborn Sequencing (ICoNS)
AU - Minten, Thomas
AU - Bick, Sarah
AU - Adelson, Sophia
AU - Gehlenborg, Nils
AU - Amendola, Laura M.
AU - Boemer, François
AU - Coffey, Alison J.
AU - Encina, Nicolas
AU - Ferlini, Alessandra
AU - Kirschner, Janbernd
AU - Russell, Bianca E.
AU - Servais, Laurent
AU - Sund, Kristen L.
AU - Taft, Ryan J.
AU - Tsipouras, Petros
AU - Zouk, Hana
AU - Zygmunt, Aldona
AU - Ververi, Athina
AU - Siu, Carol
AU - Ponzi, Emanuela
AU - Bertini, Enrico
AU - Xinwen, Huang
AU - King, Jovanka
AU - Kassahn, Karin
AU - Koutsogianni, Maria
AU - Valente, Maria Luisa
AU - Pelo, Matthew J.
AU - Gentile, Mattia
AU - Orsini, Paola
AU - Ficarella, Romina
AU - Sansen, Stefaan
AU - Rui, Xiao
AU - Zhengyan, Zhao
AU - Bick, David
AU - Goldenberg, Aaron
AU - Satija, Aditi
AU - Lundquist, Alberte
AU - Wiedemann, Alexandra
AU - Tuff-Lacey, Alice
AU - Al-Maraghi, Aljazi
AU - Pichini, Amanda
AU - Akil, Ammira Alshabeeb
AU - Brower, Amy
AU - Gaviglio, Amy
AU - Ponte, Amy
AU - Oza, Andrea
AU - Posch, Andreas
AU - Webb, Bryn
AU - Wasserstein, Melissa
N1 - Publisher Copyright: © 2025 American College of Medical Genetics and Genomics
PY - 2025/7
Y1 - 2025/7
N2 - Purpose: Over 30 international studies are exploring newborn sequencing (NBSeq) to expand the range of genetic disorders included in newborn screening. Substantial variability in gene selection across programs exists, highlighting the need for a systematic approach to prioritize genes. Methods: We assembled a data set comprising 25 characteristics about each of the 4390 genes included in 27 NBSeq programs. We used regression analysis to identify several predictors of inclusion and developed a machine learning model to rank genes for public health consideration. Results: Among 27 NBSeq programs, the number of genes analyzed ranged from 134 to 4299, with only 74 (1.7%) genes included by over 80% of programs. The most significant associations with gene inclusion across programs were presence on the US Recommended Uniform Screening Panel (inclusion increase of 74.7%, CI: 71.0%-78.4%), robust evidence on the natural history (29.5%, CI: 24.6%-34.4%), and treatment efficacy (17.0%, CI: 12.3%-21.7%) of the associated genetic disease. A boosted trees machine learning model using 13 predictors achieved high accuracy in predicting gene inclusion across programs (area under the curve = 0.915, R² = 84%). Conclusion: The machine learning model developed here provides a ranked list of genes that can adapt to emerging evidence and regional needs, enabling more consistent and informed gene selection in NBSeq initiatives.
KW - Gene selection
KW - Gene-disorder associations
KW - Genomic sequencing
KW - Machine learning
KW - Newborn screening
UR - https://www.scopus.com/pages/publications/105006735005
U2 - 10.1016/j.gim.2025.101443
DO - 10.1016/j.gim.2025.101443
M3 - Article
C2 - 40357684
AN - SCOPUS:105006735005
SN - 1098-3600
VL - 27
JO - Genetics in Medicine
JF - Genetics in Medicine
IS - 7
M1 - 101443
ER -