Development of an analysis pipeline for HLA genotyping using illumina short reads

dc.contributor.author: Bird, James
dc.date.accessioned: 2019-09-10T13:03:05Z
dc.date.available: 2019-09-10T13:03:05Z
dc.date.issued: 2019
dc.description: A dissertation submitted to the Faculty of Science, University of the Witwatersrand, in fulfillment of the requirements for the degree of Master of Science. Johannesburg, March 2019 [en_ZA]
dc.description.abstract: Human leukocyte antigen (HLA) genes are highly polymorphic loci located on chromosome six. This is the most polymorphic region of the human genome, which makes genotyping its alleles difficult. Furthermore, the required genotyping resolution depends on the application. For instance, organ transplantation requires two-digit resolution for kidney and a minimum of four-digit resolution for bone marrow, while disease-related population studies often require six-digit resolution. As specialized HLA genotyping tools that use NGS data have been developed, the aim of this study was to compare four such tools, namely BWAkit, xHLA, Kourami and HISAT-Genotype, and to evaluate whether population-specific HLA variability affects their accuracy. The accuracy of the tools was compared against Sanger-sequenced HLA data, in which exons 2 and 3 were sequenced for HLA class I. Because exons 2 and 3 were available as a reference from the Sanger sequencing, the accuracy of an allele call was determined by its similarity to this reference data. At two- and four-digit resolution, xHLA was the most accurate, which was due to the inclusion of a nucleotide-to-protein alignment step in its algorithm. Kourami was the most accurate at six-digit resolution, due to its use of alternate loci in the alignment step. To identify possible error trends, the allele sequences produced by the tools were analyzed. The majority of errors occurred at heterozygous positions, which were falsely called homozygous. With the exception of HISAT-Genotype, each tool was most accurate at HLA-B and least accurate at HLA-C. Evaluation of population-specific HLA variability showed that the four super-populations tested (African, Asian, European and South American) did not differ significantly in HLA variability.
The different loci, however, did differ significantly from each other. In conclusion, future improvements include varying the parameters when genotyping different loci. Currently, however, a consensus approach using xHLA and Kourami should be utilized. [en_ZA]
dc.description.librarian: MT 2019 [en_ZA]
dc.format.extent: Online resource (189 leaves)
dc.identifier.citation: Bird, James Andrew (2019) Development of an analysis pipeline for HLA genotyping using illumina short reads, University of the Witwatersrand, Johannesburg, <http://hdl.handle.net/10539/28076>
dc.identifier.uri: https://hdl.handle.net/10539/28076
dc.language.iso: en [en_ZA]
dc.subject.lcsh: HLA histocompatibility antigens
dc.subject.lcsh: Immunotherapy
dc.title: Development of an analysis pipeline for HLA genotyping using illumina short reads [en_ZA]
dc.type: Thesis [en_ZA]
Files
Original bundle
Name: James_Bird_Dissertation.pdf
Size: 4.57 MB
Format: Adobe Portable Document Format