Development of an analysis pipeline for HLA genotyping using illumina short reads

Bird, James

dc.contributor.author	Bird, James
dc.date.accessioned	2019-09-10T13:03:05Z
dc.date.available	2019-09-10T13:03:05Z
dc.date.issued	2019
dc.description	A dissertation submitted to the Faculty of Science, University of the Witwatersrand, in fulfillment of the requirements for the degree of Master of Science. Johannesburg, March 2019	en_ZA
dc.description.abstract	Human leukocyte antigen (HLA) genes are highly polymorphic loci located on chromosome six. This region is the most polymorphic in the human genome, which makes genotyping its alleles difficult. Furthermore, the required genotyping resolution depends on the application. For instance, kidney transplants require two-digit resolution and bone marrow transplants require at least four-digit resolution, while population-based disease studies often require six-digit resolution. Because specialized HLA genotyping tools that use next-generation sequencing (NGS) data have been developed, the aim of this study was to compare four of them, namely BWAkit, xHLA, Kourami and HISAT-Genotype, and to evaluate whether population-specific HLA variability would affect their accuracy. The accuracy of the tools was compared against Sanger-sequenced HLA data, in which exons 2 and 3 were sequenced for HLA class I. Because exons 2 and 3 were available as a reference from the Sanger sequencing, an allele call was judged accurate based on its similarity to the reference data. At two- and four-digit resolution, xHLA was the most accurate, which was attributed to the inclusion of a nucleotide-to-protein alignment step in its algorithm. Kourami was the most accurate at six-digit resolution, owing to its use of alternate loci in the alignment step. To identify possible error trends, the allele sequences produced by the tools were analyzed. The majority of errors occurred at heterozygous positions, which were falsely identified as homozygous. With the exception of HISAT-Genotype, each tool was most accurate at HLA-B and least accurate at HLA-C. The evaluation of population-specific HLA variability found that the four super-populations tested (African, Asian, European and South American) did not differ significantly in HLA variability. The different loci, however, differed significantly from each other. In conclusion, future improvements include varying the parameters when genotyping different loci. Currently, however, a consensus approach using xHLA and Kourami should be used.	en_ZA
dc.description.librarian	MT 2019	en_ZA
dc.format.extent	Online resource (189 leaves)
dc.identifier.citation	Bird, James Andrew (2019) Development of an analysis pipeline for HLA genotyping using illumina short reads, University of the Witwatersrand, Johannesburg, <http://hdl.handle.net/10539/28076>
dc.identifier.uri	https://hdl.handle.net/10539/28076
dc.language.iso	en	en_ZA
dc.subject.lcsh	HLA histocompatibility antigens
dc.subject.lcsh	Immunotherapy
dc.title	Development of an analysis pipeline for HLA genotyping using illumina short reads	en_ZA
dc.type	Thesis	en_ZA
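
Illustrative note on the resolution levels named in the abstract: in current HLA nomenclature, two-, four- and six-digit resolution correspond to the first one, two and three colon-separated fields of an allele name (for example A*02, A*02:01 and A*02:01:01). The sketch below is a minimal Python example of scoring allele calls against reference calls at each level. It is not the dissertation's method, which judged calls by sequence similarity to Sanger-sequenced exons 2 and 3; the function names `truncate_allele` and `concordance`, the sample ID and the allele pairs are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def truncate_allele(allele: str, digits: int) -> str:
    """Truncate an allele name such as 'A*02:01:01' to 2-, 4- or 6-digit resolution."""
    if digits not in (2, 4, 6):
        raise ValueError("digits must be 2, 4 or 6")
    locus, _, fields = allele.partition("*")
    # Each colon-separated field contributes two digits of resolution.
    kept = fields.split(":")[: digits // 2]
    return f"{locus}*{':'.join(kept)}"


def concordance(calls: dict, truth: dict, digits: int) -> float:
    """Fraction of reference alleles recovered by the calls at the given resolution.

    Alleles are compared per sample as multisets, so the order of the two
    alleles does not matter and a homozygous call cannot match twice
    against a heterozygous reference.
    """
    matched = total = 0
    for sample, expected_pair in truth.items():
        expected = Counter(truncate_allele(a, digits) for a in expected_pair)
        called = Counter(truncate_allele(a, digits) for a in calls.get(sample, ()))
        matched += sum((expected & called).values())
        total += sum(expected.values())
    return matched / total if total else 0.0


# Hypothetical data: one sample, one locus, made up for illustration.
truth = {"sample1": ("A*02:01:01", "A*24:02:01")}
calls = {"sample1": ("A*02:01:02", "A*24:02:01")}

for digits in (2, 4, 6):
    print(digits, concordance(calls, truth, digits))
```

With this made-up input, the two calls agree with the reference at two- and four-digit resolution, and only one of the two alleles agrees at six-digit resolution. A name-based comparison like this is stricter than a sequence-similarity comparison restricted to exons 2 and 3, so the two approaches can disagree.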

Files

Original bundle

Name: James_Bird_Dissertation.pdf
Size: 4.57 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission

Collections

ETD Collection