Bioinformatics-driven development of a queryable cardiometabolic database and its application in a biological setting

Date
2017
Authors
Hendry, Liesl Mary
Abstract
As sequencing and genotyping technologies advance, larger and more complex sets of biological data are being produced. Databases can be used to store and manage such data efficiently. Typically, publicly available datasets are accessed through web browsers that offer a user-friendly interface to a database, making complex queries simple to execute. However, research project-specific data are not commonly stored in this way. In this research, a database (designed in MySQL) and an accompanying interface (developed using PHP, HTML and CSS) were designed for the storage and querying of the quality-controlled data from the current project, which used Metabochip-genotyped Birth to Twenty (Bt20) cohort participants and their female caregivers. Users can easily access the data to generate summary statistics on the phenotype data and to download phenotype, single nucleotide polymorphism (SNP) annotation and association analysis data that match user-supplied criteria. Some of the data from the database were used to investigate the genetics of blood pressure (BP) in black South African individuals. Hypertension is a major risk factor for cardiovascular diseases (CVDs). BP variation is known to have a genetic component, but genetic studies in indigenous Africans have been limited. Association analysis, carried out in a merged sample of caregivers and participants, pointed to novel regions of interest in the NOS1AP (diastolic BP (DBP) and systolic BP (SBP)), MYRF (SBP) and POC1B (SBP) genes and in two intergenic regions (DACH1|LOC440145 (DBP and SBP) and INTS10|LPL (SBP)). Two SNPs in the MYRF gene met the calculated “array-wide” significance threshold (p < 6.7 × 10⁻⁷ for the merged dataset) for multiple testing. Genotype imputation is a useful addition to association studies because it increases the SNP panel available for association testing. An investigation into the efficiency of imputation in this dataset using a mixed-population reference panel was carried out. Imputation was achieved with high confidence in all genes, but a more detailed view of the region was only seen in NOS1AP (DBP and SBP in both the merged and female caregiver datasets) and POC1B (Bt20 participant dataset only). Overall, the research contributed a useful tool for the efficient management of project-specific biological data. The association analysis and genotype imputation, the latter a promising tool for future studies in this or other African datasets, also provided some insight into the genetics of blood pressure in black South Africans; further functional and replication studies in larger samples are required to confirm and explain the findings.
Description
A thesis submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg in fulfilment of the requirements for the degree of Doctor of Philosophy. June 2017, Johannesburg
Citation
Hendry, Liesl Mary (2017) Bioinformatics-driven development of a queryable cardiometabolic database and its application in a biological setting, University of the Witwatersrand, Johannesburg, <http://hdl.handle.net/10539/23508>