Mariya V. Sivay, Sarah E. Hudelson, Jing Wang, Yaw Agyei, Erica L. Hamilton, Amanda Selin, Ann Dennis, Kathleen Kahn, F. Xavier Gomez-Olive, Catherine MacPhail, James P. Hughes, Audrey Pettifor, Susan H. Eshleman, Mary Kathryn Grabowski
2023-09-12
2023-09-12
2018-07-05
http://hdl.handle.net/10539/35879

Background
South Africa has one of the highest rates of HIV-1 (HIV) infection world-wide, with the highest rates among young women. We analyzed the molecular epidemiology and evolutionary history of HIV in young women attending high school in rural South Africa.

Methods
Samples were obtained from the HPTN 068 randomized controlled trial, which evaluated the effect of cash transfers for school attendance on HIV incidence in women aged 13–20 years (Mpumalanga province, 2011–2015). Plasma samples from HIV-infected participants were analyzed using the ViroSeq HIV-1 Genotyping assay. Phylogenetic analysis was performed using 200 pol gene study sequences and 2,294 subtype C reference sequences from South Africa. Transmission clusters were identified using Cluster Picker and HIV-TRACE, and were characterized using demographic and other epidemiological data. Phylodynamic analyses were performed using the BEAST software.

Results
The study enrolled 2,533 young women who were followed through their expected high school graduation date (main study); some participants had a post-study assessment (follow-up study). Two-hundred-twelve of 2,533 enrolled young women had HIV infection. HIV pol sequences were obtained for 94% (n = 201/212) of the HIV-infected participants. All but one of the sequences were HIV-1 subtype C; the non-C subtype sequence was excluded from further analysis. Median pairwise genetic distance between the subtype C sequences was 6.4% (IQR: 5.6–7.2). Overall, 26% of study sequences fell into 21 phylogenetic clusters with 2–6 women per cluster. Thirteen (62%) clusters included women who were HIV-infected at enrollment.
Clustering was not associated with study arm, demographic or other epidemiological factors. The estimated date of origin of HIV subtype C in the study population was 1958 (95% highest posterior density [HPD]: 1931–1980), and the median estimated substitution rate among study pol sequences was 1.98 × 10⁻³ (95% HPD: 1.15 × 10⁻³–2.81 × 10⁻³) per site per year.

Conclusions
Phylogenetic analysis suggests that multiple HIV subtype C sublineages circulate among school age girls in South Africa. There were no substantive differences in the molecular epidemiology of HIV between control and intervention arms in the HPTN 068 trial.

en
HIV-1 diversity among young women in rural South Africa: HPTN 068
Article
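
The abstract reports a median pairwise genetic distance and clusters identified with Cluster Picker and HIV-TRACE, but gives no thresholds or settings. The sketch below only illustrates the general idea of distance-based clustering: compute pairwise distances between aligned sequences, link pairs that fall under a cutoff, and report connected groups of two or more. It is not the study's pipeline. The simple p-distance, the function names, the toy sequences, and the 0.06 cutoff are all assumptions made for illustration; HIV-TRACE itself uses the TN93 distance, and Cluster Picker works on a phylogenetic tree with branch support.

```python
from __future__ import annotations

from itertools import combinations
from statistics import median


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites, skipping gaps and ambiguous bases."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in valid and y in valid:
            compared += 1
            differing += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def distance_clusters(seqs: dict[str, str], threshold: float) -> list[set[str]]:
    """Single-linkage groups: sequences joined by any pair with distance <= threshold."""
    parent = {name: name for name in seqs}

    def find(n: str) -> str:
        # Union-find root lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    groups: dict[str, set[str]] = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    # Only groups with at least two members count as clusters.
    return [g for g in groups.values() if len(g) >= 2]


# Hypothetical toy alignment (not study data).
toy = {
    "s1": "ACGTACGTACGTACGTACGT",
    "s2": "ACGTACGTACGTACGTACGA",  # one difference from s1
    "s3": "ACGTTCGTACGTACGTACGA",  # one difference from s2
    "s4": "TTTTACGTAAAAACGTCCCC",  # distant from the others
}

all_distances = [p_distance(toy[a], toy[b]) for a, b in combinations(toy, 2)]
median_distance = median(all_distances)
clusters = distance_clusters(toy, threshold=0.06)  # illustrative cutoff only
```

With these toy inputs, `s1`, `s2`, and `s3` form one cluster through chained links even though `s1` and `s3` differ at two sites, while `s4` stays unclustered. That chaining is a property of single linkage and is one reason cutoff choice matters in practice.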