Predicting Exome Sequencing Outcomes from Phenotypic Data in an African Patient Cohort with Developmental Delay

Date
2024
Publisher
University of the Witwatersrand, Johannesburg
Abstract
Global developmental delay (GDD) affects 1–3% of the worldwide population, with up to 40% of cases due to underlying genetic factors. Exome sequencing (ES) is now recommended as the first-line genetic test for developmental delay (DD); however, despite the advances of ES, the diagnostic yield remains limited to 30–40%. Machine learning predictive tools (MLPT) for phenotyping could play a role in pre-selecting patients for ES and/or improving the phenotype-guided analysis of ES data, ultimately increasing the ES diagnostic yield. A data subset of 94 participants from the Deciphering Developmental Disorders (DDD)-Africa study was used to assess the performance of two MLPT, PredWES and Face2Gene (F2G). The study investigated whether these MLPT can successfully predict the probability of a positive ES result and/or a genetic diagnosis based on the patient phenotype in an African cohort of participants with DD. This was done by investigating whether there is a correlation between PredWES scores and ES outcomes, and between F2G D-Scores and ES outcomes. In addition, the F2G geneticist view was investigated, and the list of 30 suggested syndromes with their corresponding gestalt scores, feature scores, and combined scores was compared to the ES outcome of participants. For PredWES, using Human Phenotype Ontology (HPO) data to predict exome positivity, the diagnostic yield for the top 10% of PredWES scores was 44%, lower than the diagnostic yield of 46% seen without any PredWES prioritisation. The F2G D-Score was shown to be a poor predictor of ES outcome; however, the D-Score classification did correctly predict the need for further genetic testing for 77–80% of the ES positive group. F2G identified the correct genetic syndrome in 46% of the ES positive participants, with a top ten accuracy of 35%. F2G provided syndrome suggestions with high scores for 39% of the ES negative participants.
Overall, PredWES was shown to be ineffective in pre-selecting participants suitable for ES. The F2G performance was promising but still poorer than in studies conducted on mainly European cohorts. The study provided valuable insight into the use of PredWES and F2G in an African setting and highlighted the need for more diverse MLPT algorithm training and continued DD research in an African context.
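The two kinds of metric reported above (diagnostic yield within a top-scoring fraction of participants, and top-k accuracy of ranked syndrome suggestions) can be illustrated with a minimal Python sketch. This is not the study's analysis code. The function names, the rounding rule for the size of the top fraction, and the reading of "top ten accuracy" as "confirmed syndrome appears among the first ten suggestions" are all assumptions made for illustration, and the data in the usage lines are invented toy values, not study results.

```python
from typing import Sequence


def diagnostic_yield(outcomes: Sequence[bool]) -> float:
    """Fraction of participants with a positive ES result."""
    if not outcomes:
        raise ValueError("at least one participant is required")
    return sum(outcomes) / len(outcomes)


def top_fraction_yield(
    scores: Sequence[float], outcomes: Sequence[bool], fraction: float = 0.10
) -> float:
    """Diagnostic yield among the highest-scoring `fraction` of participants.

    Assumption: the group size is the rounded product, with a minimum of one.
    """
    if len(scores) != len(outcomes):
        raise ValueError("scores and outcomes must have the same length")
    n_top = max(1, round(len(scores) * fraction))
    # Rank participants from highest to lowest score.
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    return diagnostic_yield([outcome for _, outcome in ranked[:n_top]])


def top_k_accuracy(
    ranked_suggestions: Sequence[Sequence[str]], confirmed: Sequence[str], k: int = 10
) -> float:
    """Fraction of participants whose confirmed syndrome is in the first k suggestions."""
    if len(ranked_suggestions) != len(confirmed) or not confirmed:
        raise ValueError("need one non-empty, equal-length entry per participant")
    hits = sum(
        1
        for suggestions, truth in zip(ranked_suggestions, confirmed)
        if truth in suggestions[:k]
    )
    return hits / len(confirmed)


# Toy, invented data (hypothetical scores and outcomes, not from the study).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_outcomes = [False, True, True, False, True, False, False, True, False, False]

overall = diagnostic_yield(toy_outcomes)                       # 4 of 10 positive
top_20 = top_fraction_yield(toy_scores, toy_outcomes, 0.20)    # 1 of top 2 positive
top_2 = top_k_accuracy([["A", "B", "C"], ["D", "E", "F"]], ["B", "F"], k=2)
```

Comparing `top_fraction_yield` against `diagnostic_yield` for the whole cohort mirrors the comparison in the abstract: a prioritisation score is only useful for pre-selection if the yield in the top-scoring group exceeds the unprioritised yield.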
Description
A research report (in the format of a “submissible” paper) submitted in partial fulfillment of the requirements for the degree of Master of Science in Medicine (Genomic Medicine) to the Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, 2024.
Keywords
Developmental Delay, Exome Sequencing, Face2Gene, Machine learning predictive tools, PredWES, UCTD
Citation
Bresler, Anréé. (2024). Predicting Exome Sequencing Outcomes from Phenotypic Data in an African Patient Cohort with Developmental Delay [Master’s dissertation, University of the Witwatersrand, Johannesburg]. WireDSpace.