Expression and Methylation of Peroxidasin in Breast Cancer Cell Lines

by Jemma Lilian Falkov (1611672)

Dissertation submitted in fulfilment of the requirements for the degree Master of Science in Molecular and Cell Biology in the Faculty of Science, University of the Witwatersrand, Johannesburg, South Africa

Supervisor: Demetra Mavri-Damelin

August 2023

Declaration

I, Jemma Lilian Falkov (1611672), am a student registered for the degree of Master of Science in the academic year 2021. I hereby declare the following:
• I am aware that plagiarism (the use of someone else’s work without their permission and/or without acknowledging the original source) is wrong.
• I confirm that the research dissertation submitted for assessment for the above degree is my own unaided work except where explicitly indicated otherwise and acknowledged.
• I have followed the required conventions in referencing the thoughts and ideas of others.
• I understand that the University of the Witwatersrand may take disciplinary action against me if there is a belief that this is not my own unaided work or that I have failed to acknowledge the source of the ideas or words in my writing.

Signature ___________ ____ 4 August 2023

Abstract

Peroxidasin (PXDN) is a haem-containing extracellular matrix peroxidase protein which forms hypohalous acids in the presence of hydrogen peroxide (H2O2). The predominant role of PXDN is that of a collagen IV crosslinker within the basement membrane. Increased collagen IV deposition has been linked to tissue invasion and metastasis in breast cancer and PXDN has also been shown to assist in the process of epithelial-mesenchymal transition (EMT) in cancer. Various cancer types display dysregulated levels of PXDN expression including breast cancer and this dysregulation has been associated with poor prognosis.
This study aimed to investigate whether DNA methylation of the PXDN promoter may be a mechanism through which changes in PXDN expression observed in breast cancer are regulated. Non-invasive MCF-7 and invasive MDA-MB-231 cells were used as models for luminal A and triple negative breast cancer (TNBC) respectively. The HEK-293 cell line was used as a non-cancerous control cell line. DNA methylation levels of the PXDN promoter and PXDN protein expression were investigated in these cell lines through methylation sensitive PCR (MS PCR) and immunofluorescence microscopy respectively. Relative levels of PXDN expression were determined by corrected total cell fluorescence (CTCF) analysis of the immunofluorescence images, which revealed the highest PXDN levels in the invasive MDA-MB-231 cell line, double those of the MCF-7 cell line. All cell lines were treated with 10 nM β-Oestradiol, which caused an increase in PXDN expression within the MCF-7 and HEK-293 cell lines and a decrease in expression within the MDA-MB-231 cell line to half its untreated value. PXDN was found to be localised in the ECM in all three cell lines. To elucidate the role of DNA methylation, MS PCR was performed on all three cell lines, with four primer pairs spanning a region of 1305 base pairs (bp) within the PXDN promoter. A region that was differentially methylated between the MDA-MB-231 and HEK-293 cell lines was found between 524 bp and 53 bp upstream of the transcription start site (TSS). This region was unmethylated within the MDA-MB-231 cell line and methylated within the HEK-293 cell line, which correlates with expression differences between these two cell lines and suggests this region could be of regulatory significance. The four primer pairs designed to amplify the PXDN promoter were unable to amplify this region within the MCF-7 cell line.
A heterochromatic DNA conformation, or a point mutation that increases CpG content and creates a thermodynamically ultra-fastened (TUF) region, could explain this phenomenon; however, further research is required to elucidate the mechanism responsible. In conclusion, PXDN shows higher expression in TNBC cells than in luminal A subtype cells. The oestrogen receptor is involved in regulating PXDN expression; however, different mechanisms seem to be at play between the two cell lines. The contribution of CpG methylation to this change in PXDN expression remains unknown, as does the nature of the interaction between the oestrogen receptors and the gene. Further research is required to clarify the mechanisms involved.

Acknowledgments

I would first like to thank my supervisor, Professor Demetra Mavri-Damelin; this project was her idea and design. Apart from her wealth of knowledge and wisdom, some of which she has imparted to me, her patience and kindness have gently allowed me to discover and refine my own abilities as a young scientist.

I owe considerable gratitude to Dr Tebogo Marutha for all his assistance when I was starting out in the laboratory. His quiet diligence and ever-present willingness to help created an inclusive culture where selflessness and humility were the defining attributes of his mentorship. I would also like to thank Thokozile, Mistral and Jamie for being my functional genetics laboratory family.

I am grateful to the Microscopy and Microanalysis Unit for all their assistance. In particular I would like to thank Dr Deran Reddy for his guidance with regard to the image capturing and analysis component of my research.

Thank you to the NRF for funding this research. I am grateful for having received the opportunity to further my studies within the School of Molecular and Cell Biology.

I would like to acknowledge my family, for all their love and support.
Thank you to my parents for their selfless investment in my education which has given me every opportunity. Lastly thank you to Chris for all the love, motivation and coffee.

Table of Contents

Declaration
Abstract
Acknowledgments
Table of Contents
List of Figures
List of Tables
List of Abbreviations
1 Introduction
1.1 Breast cancer
1.2 The ECM: A Master Regulator of Cancer Progression
1.3 PXDN: a Regulator of Cellular Adhesion
1.4 PXDN, Oxidative Stress and Signalling Pathways Associated with Cancer
1.5 Dysregulated PXDN Expression in Cancer
1.6 DNA Methylation and Cancer
1.7 PXDN Promoter Methylation: A Potential Prognostic Marker?
1.8 Laboratory Methods for Analysing DNA Methylation
1.9 Aims and Objectives
2 Methods and Materials
2.1 Immunofluorescence Microscopy
2.1.1 Cell Culture
2.1.2 Fixing and Permeabilization of cells
2.1.3 Immunostaining: Primary Antibody
2.1.4 Immunostaining: Secondary Antibody and DAPI
2.1.5 Mounting and Visualisation
2.1.6 Image Analysis and Expression Quantification
2.1.7 Statistical Analysis
2.2 Methylation Sensitive PCR
2.2.1 Analysing the CG% of the PXDN Promoter and Restriction Enzyme Recognition Sites
2.2.2 DNA Extraction
2.2.3 Restriction Digest
2.2.4 PCR
2.2.5 Agarose Gel Electrophoresis
2.2.6 Interpreting the MS PCR Results
3 Results
3.1 Immunofluorescence Microscopy
3.2 Methylation Sensitive PCR
4 Discussion
5 Conclusion and Future Prospects
6 References
7 Appendix
7.1 Recipes
7.1.1 0.5 M EDTA
7.1.2 50 x TAE Buffer
7.1.3 10 x PBS buffer
7.2 DNA Extraction Results
7.2.1 Phenol Chloroform-extracted gDNA

List of Figures

Fig. 1 Global breast cancer mortality rates.
Fig. 2 The remodelling of the ECM takes place during cancer progression.
Fig. 3 The NC1 collagen IV hexamer composed of collagen monomers and dimers.
Fig. 4 Normalised PXDN expression data from the Human Protein Atlas.
Fig. 5 PXDN expression in 17 cancer types determined by RNA-Seq data from TCGA project.
Fig. 6 High PXDN expression is associated with unfavourable prognosis in endometrial, cervical and stomach cancer.
Fig. 7 The GC% of the PXDN promoter.
Fig. 8 The locations of CCGG sites with respect to primer pairs and the TSS of the PXDN promoter.
Fig. 9 Methylation sensitive PCR agarose gel electrophoresis layout of amplified methylated and unmethylated templates.
Fig. 10 PXDN expression in MDA-MB-231 cells.
Fig. 11 PXDN expression in MCF-7 cells.
Fig. 12 PXDN expression in HEK-293 cells.
Fig. 13 PXDN expression levels in MCF-7, MDA-MB-231 and HEK-293 cell lines.
Fig. 14 Agarose electrophoresis gel and spectrophotometry readings for MCF-7 phenol chloroform-extracted gDNA.
Fig. 15 Methylation sensitive PCR in the MDA-MB-231 cell line.
Fig. 16 Methylation sensitive PCR in the HEK-293 cell line.
Fig. 17 Troubleshooting difficulties amplifying the PXDN promoter in DNA extracted from MCF-7 cells.
Fig. 18 Further troubleshooting PXDN promoter amplification difficulties in the MCF-7 cell line.

List of Tables

Table 1: The four subtypes of breast cancer based on hormone and growth factor receptors.
Table 2: Components and volumes of MspI/HpaII restriction digest reactions
Table 3: PCR reaction components and volumes for amplification of regions within the PXDN promoter
Table 4: Primers designed for the amplification of regions within the PXDN promoter
Table 5: The amplicons within the PXDN promoter and their parameters
Table 6: PCR reaction conditions
Table 7: Primer pair designed to amplify the TP53 promoter

List of Abbreviations

AI - Aromatase inhibitor
bp - Base pairs
BSA - Bovine serum albumin
BMI - Body mass index
BRCA1 - Breast cancer gene 1
BRCA2 - Breast cancer gene 2
CTCF - Corrected total cell fluorescence
CpG - 5'-Cytosine-phosphate-Guanine-3'
DAPI - 4',6-Diamidino-2-phenylindole dihydrochloride
ddH2O - Double distilled water
DMEM - Dulbecco's Modified Eagle Medium
DNA - Deoxyribonucleic acid
DNMT - DNA methyltransferase
ECM - Extracellular matrix
EDTA - Ethylenediaminetetraacetic acid
EMT - Epithelial-mesenchymal transition
ER - Oestrogen receptor
FISH - Fluorescence in situ hybridisation
FITC - Fluorescein isothiocyanate
H2O2 - Hydrogen peroxide
HER2 - Human epidermal growth factor receptor 2
HOBr - Hypobromous acid
HOCl - Hypochlorous acid
HRP - Horseradish peroxidase
MWM - Molecular weight marker
MAPK - Mitogen-activated protein kinase
NADPH - Nicotinamide adenine dinucleotide phosphate
Nox - NADPH-oxidases
Nrf2 - Nuclear factor erythroid 2-related factor 2
nTPM - Normalised transcript per million
PBS - Phosphate buffered saline
PCR - Polymerase chain reaction
PI3K - Phosphoinositide 3-kinase
PR - Progesterone receptor
PXDN - Peroxidasin
Redox - Reduction-oxidation
RNA - Ribonucleic acid
ROS - Reactive oxygen species
TAE - Tris acetate EDTA
TGF-β - Transforming growth factor-β
TSG - Tumour suppressor gene
TUF - Thermodynamically ultra-fastened
UV - Ultraviolet

1 Introduction

In 2018, 18.1 million new cancer cases and 9.6 million cancer-related deaths were reported worldwide across all age groups (Bray et al., 2018). Risk stratification and prognostic markers play a significant role in the improvement of patient survival, allowing for a better match between patient and treatment plan (Selleck et al., 2017). Cancer is a highly heterogeneous disease: various pathways are involved in its development, and its molecular fingerprint differs vastly between individuals and cancer types.
This variation makes the development of efficacious treatment plans with minimal toxic effects a challenging task (Semenza, 2007). The past 10 years have seen significant growth in the field of personalised medicine, specifically within the realm of cancer treatment and patient risk stratification. Promoter DNA methylation-based biomarkers have become more prominent candidates within the field of prognostic marker research and identification (Koch et al., 2018).

Peroxidasin (PXDN) is an extracellular matrix (ECM) peroxidase protein that forms hypohalous acids for various functions (Cheng et al., 2008). This protein has been linked to cancer through various pathways and processes including the phosphoinositide 3-kinase (PI3K/AKT) pathway (Zheng and Liang, 2018); its interaction with reactive oxygen species (ROS) (Dougan et al., 2019); and, perhaps most noteworthy, tissue invasion and metastasis through the process of epithelial-mesenchymal transition (EMT) (Tauber et al., 2010). PXDN displays dysregulated levels of expression in various cancer types including ovarian, prostate, bladder, oesophageal and breast cancer (Cai et al., 2018; Di et al., 2019; Dougan et al., 2019; Sigurdardottir et al., 2021; Zheng and Liang, 2018). Furthermore, dysregulated PXDN expression has also been associated with unfavourable prognoses (Zhou et al., 2022). The mechanisms that underlie changes in PXDN expression remain unknown.

This study aimed to assess whether DNA methylation of the PXDN promoter is a mechanism responsible for the changes in PXDN expression observed in breast cancer, by evaluating whether this form of epigenetic regulation modifies the gene and what effect this has on protein expression. Changes to DNA methylation levels of the PXDN promoter within breast cancer cells could potentially act as a prognostic marker for the purpose of risk stratification in breast cancer patients.

1.1 Breast cancer

Breast cancer is the most prevalent form of the disease amongst women, comprising 24.5% of female cases and responsible for 15.5% of female cancer-related deaths (Sung et al., 2021). Of all cancers diagnosed worldwide across age groups, female breast cancer incidence (11.7%) is highest, followed by lung (11.4%), prostate (7.3%) and non-melanoma skin cancer (6.2%) (Sung et al., 2021). Female breast cancer mortality accounts for 6.6% of all cancer-related deaths, fifth after lung (18%), colorectal (9.4%), liver (8.3%) and stomach (7.7%) cancer (Sung et al., 2021).

Breast cancer mortality rates are highest in developing countries (Figure 1). Shulman et al. (2010) suggest that the reason for this trend is that screening technologies such as mammography and ultrasonography require expensive equipment which is not as readily available in low-income countries. Therefore, breast cancer is generally only identified at advanced stages when prognosis is poor. Non-genetic factors which impact the risk for breast cancer development include body mass index (BMI), breast density and age (Engmann et al., 2017). Genetic risk factors include mutations in genes which repair DNA damage, such as breast cancer gene 1 and 2 (BRCA1 and BRCA2), checkpoint kinase 2 (CHEK2) and the ataxia-telangiectasia mutated (ATM) gene, as well as in genes which regulate cell growth, such as tumour protein p53 (TP53) and phosphatase and tensin homolog (PTEN) (Chavarri-Guerra et al., 2017).

Breast cancer is highly heterogeneous and can present a multitude of molecular fingerprints. Histopathologic classification of breast cancer has resulted in the creation of four distinct categories (Koboldt et al., 2012): luminal A, luminal B, basal-like and human epidermal growth factor receptor 2 (HER2) enriched (Table 1).
Generally, immunohistochemical assays and fluorescence in situ hybridisation (FISH) are used to determine breast cancer subtype (Harris, 2018).

Fig. 1 Global breast cancer mortality rates. Map of estimated age-standardised mortality rates (ASR, world) per 100 000 in 2018, breast, all ages; legend bins: < 10.6, 10.6–13.3, 13.3–15.9, 15.9–18.5, ≥ 18.5. The highest breast cancer mortality rates are in less economically developed countries, particularly Central and Eastern Africa, Eastern Europe and Southern Asia (Bray et al., 2018). Data source: GLOBOCAN 2018. Graph production: IARC (http://gco.iarc.fr/today). World Health Organization © International Agency for Research on Cancer 2018.

Table 1: The four subtypes of breast cancer based on hormone and growth factor receptors.

Subtype | Cellular source of cancer | % of breast cancers that are this type | ERα status | PR status | HER2 status | Severity (grade, survival probability**, prognosis) | Treatment types
Luminal A | Invasive Ductal Carcinoma | 30 – 40% | Positive | Positive | Negative | Low, 0.9, Favourable | Endocrine (AI); Advanced Cases: CDK4/6 or PI3K/AKT/mTOR inhibitors
Luminal B | Invasive Ductal Carcinoma | 20 – 30% | Positive | Positive | Mostly negative* | Higher than Luminal A but still low, 0.5, less favourable | Endocrine (AI) and CDK4/6 or PI3K/AKT/mTOR inhibitors
Basal-like | Invasive Ductal Carcinoma | 15 – 20% | Negative | Negative | Negative | High, 0.1, Poorest | PARP inhibitor, chemotherapy
HER2+ | Invasive Ductal Carcinoma | 12 – 20% | Negative | Negative | Positive (overexpressed) | High, 0.3, Poor | HER2 inhibitor and HER2 dimerization inhibitor

* 16.4–20.8% of tumours display HER2 overexpression
** (Sørlie et al., 2001)

Luminal A and B subtypes show elevated expression of hormone-regulated pathways, particularly proliferation pathways regulated by the oestrogen receptor α (ERα), which shows elevated levels of expression in both subtypes (Prat et al., 2015). This means that these subtypes are generally responsive to endocrine therapy such as the suppression of oestrogen production with aromatase inhibitors (AIs), which prevents cell proliferation triggered by the various ER pathways. Luminal A tumours have the best prognosis and can be solely treated with endocrine therapy if they are smaller than 1 cm in diameter.

Luminal B tumours can be distinguished from luminal A by lower expression of the progesterone receptor (PR), a higher percentage of TP53 mutations (Koboldt et al., 2012), and higher expression of the proliferation marker Ki67 (Fragomeni et al., 2018). Higher Ki67 expression results in luminal B patients being more likely to develop resistance to AIs. To combat this effect, AI treatment is often combined with a PI3K/AKT/mammalian target of rapamycin (mTOR) pathway inhibitor such as everolimus (Pritchard et al., 2013). About 20% of luminal B tumours have been shown to present with HER2 overexpression (Prat et al., 2015).

HER2 is a regulator of transcription and enhances protein synthesis and cell growth. HER2-enriched tumours show overexpression of HER2 and the highest number of mutations out of all the breast cancer subtypes (Prat et al., 2015). HER2-enriched tumours tend to have mutations in the tumour suppressor gene TP53 and the oncogene PIK3CA (Koboldt et al., 2012). The HER2 receptor has both an intracellular tyrosine kinase domain and an extracellular domain. Drugs such as lapatinib bind to the intracellular tyrosine kinase domain, thereby inhibiting HER2 signalling pathways.
Trastuzumab blocks HER2 functionality by binding to the extracellular domain of the receptor. Both intracellular and extracellular HER2 inhibitors have been shown to be effective in preventing disease progression (Cameron et al., 2008). Drug resistance, especially in more advanced cases of HER2+ breast cancer, has been addressed by combining HER2 inhibitors with the HER2 dimerization inhibitor pertuzumab (Swain et al., 2015). Good outcomes are generally observed when trastuzumab and pertuzumab are used in conjunction for the treatment of HER2+ breast cancer (Fragomeni et al., 2018).

Triple negative breast cancer (TNBC) is another term used to describe the basal-like molecular subtype, which has high expression of proliferative genes associated with the skin basal layer. TNBC tumours generally exhibit low expression of ERα, PR and HER2, and 80% of these tumours are TP53-mutated (Prat et al., 2015). This subtype is particularly heterogeneous and has therefore been further categorised based on gene expression profiles: basal-like 1 and 2 (BL1 and BL2), mesenchymal (M), immunomodulatory (IM), luminal androgen receptor (LAR) and mesenchymal stem like (MSL) (Lehmann et al., 2011). TNBC has a higher likelihood of recurring, and patients generally have shorter overall survival compared to other types of breast cancer (Blows et al., 2010). The variability that exists within the TNBC subtype makes it difficult to treat; unlike the other subtypes, it is unresponsive to endocrine therapy and trastuzumab (Bettaieb et al., 2017). Although chemotherapeutic approaches can be used in the other subtypes of breast cancer, the improvement over an endocrine and trastuzumab approach is minimal, which allows chemotherapy to be avoided (Bettaieb et al., 2017). In TNBC, chemotherapy is the most effective form of treatment; however, targeted therapies for TNBC do exist.
One example is polyadenosine diphosphate-ribose polymerase (PARP) inhibitors such as olaparib (Bettaieb et al., 2017). Olaparib or iniparib are generally used in conjunction with chemotherapeutic drugs which induce DNA cross-linking such as carboplatin (O’Shaughnessy et al., 2014). PARP is responsible for repairing DNA damage; inhibiting its action therefore prevents the DNA damage caused by chemotherapeutic drugs in cancer cells from being reversed. The chemotherapeutic approach to TNBC involves the combination of docetaxel, which stabilises microtubules against depolymerisation and therefore prevents mitosis from occurring; doxorubicin, which is an intercalating agent; and cyclophosphamide, which acts as a DNA cross-linker (Isakoff, 2010).

The heterogeneous nature of breast cancer, and particularly TNBC, means that there is extensive variation above and beyond the four histopathological divisions that have been discussed. This variation is observable through genomic and epigenomic signatures and is worth noting when it comes to treatment decisions and selecting the correct patients for clinical trials. For example, Lehmann et al. (2016) showed that the further subtyping of TNBC assisted in the prediction of neoadjuvant chemotherapy efficacy in patients. As discussed in Section 1.6, DNA methylation is dysregulated in cancer. Drugs such as DNA methyltransferase inhibitors (DNMTi), which prevent the hypermethylation of tumour suppressor genes (TSGs), have been effective in improving patient survival in TNBC. Stirzaker et al. (2015) found methylation clusters with prognostic value in TNBC. One of these methylation clusters was gene body methylation within the Wilms tumour 1 (WT1) gene, which correlated with elevated gene expression and poor survival, whereas hypermethylation of the promoter of the same gene was associated with low levels of expression and improved patient survival.
Therefore, DNA methylation-based biomarkers are worth investigating within the realm of breast cancer research as they hold the potential to improve risk stratification and the analysis of the efficacy of treatment plans.

1.2 The ECM: A Master Regulator of Cancer Progression

The ECM is the non-cellular component of tissue and performs major roles in cellular support as well as in various regulatory processes including growth, proliferation, migration and differentiation (Frantz et al., 2010; Walker et al., 2018). These processes are regulated through biochemical and mechanical signals: mechanotransduction is the initiation of signalling cascades through tensional, compression and shearing forces (Lampi and Reinhart-King, 2018). ECMs are composed of various components including collagens, elastin, proteoglycans (PGs), fibronectin, laminins, glycoproteins and hyaluronan, which are arranged in various configurations depending on tissue function (Theocharis et al., 2019). The ECM, rather than acting as a fixed structure, is a dynamic environment which constantly changes in response to a multitude of signals, mechanical or chemical (Eble and Niland, 2019; Walker et al., 2018). There are two broad categories of ECM: interstitial matrices are found between and around cells whereas pericellular matrices are in close contact with cells (Theocharis et al., 2016). As mentioned above, the ECM is composed of various fibrous proteins and PGs, which define its structural and chemical properties and differ in ratio and organisation depending on the tissue type. Of these components, collagen is the most abundant superfamily, making up roughly 30% of the protein within mammalian tissue types and performing a host of structural, organisational and mechanical functions (Eble and Niland, 2019; Theocharis et al., 2016).
There are 28 different types of collagen (Gordon and Hahn, 2010), the defining feature of which is a distinct triple-helical domain formed by the combination of three α chains. The combination of these α chains results in the formation of either homo- or heterotrimers that make up the helical structure (Ricard-Blum, 2011). The sequence which defines the right-handed helical structure formed by these α chains is the repeating Gly-X-Y sequence, in which a glycine residue is followed by positions X and Y, predominantly occupied by proline and hydroxyproline respectively (Ricard-Blum, 2011; Theocharis et al., 2016). There are seven categories making up the collagen protein superfamily; two of these are the fibrillar and network-forming categories. Fibrillar collagens, such as collagen I, are found in connective tissues such as skin, bone, cartilage and cornea (Muiznieks and Keeley, 2013). Collagen I is the most abundant fibrillar collagen and is involved in processes such as wound repair and organ development (Walker et al., 2018). Network-forming collagens form the basal lamina of basement membranes, with collagen IV as the most common example (Ricard-Blum, 2011).

Basement membranes are a type of pericellular ECM that form a thin barrier between parenchyma and stroma (Eble and Niland, 2019; Theocharis et al., 2016). The main component of basement membranes is a stabilising collagen IV network (Theocharis et al., 2016). Collagen IV molecules form hexamers by binding to one another’s NC1 domain at the C-terminus; this network is further strengthened by other proteins such as perlecan and laminin (Ricard-Blum, 2011). The basement membrane is responsible for anchoring cells and enforcing cellular adhesion, and impairment of this structure is associated with metastasis in cancer (Chang et al., 2017; Eble and Niland, 2019). Metastasis is responsible for most cases of cancer mortality.
The movement of metastatic cancer cells from the primary tumour site through the barriers of basement membranes and connective tissue to other parts of the body requires motility of malignant cells (Theocharis et al., 2016). This movement also requires alteration of the structure of the ECM itself in a tumour-permissive way (Kalluri, 2016). Cancer cells must attain a motile phenotype in order to engage in the process of metastasis; this entails the growth of actin-rich protrusions termed invadosomes (Chang et al., 2017). Invadosomes infiltrate the ECM and degrade it through the recruitment of lytic enzymes such as matrix metalloproteinases (MMPs), which allows infiltration of the basement membrane and results in metastasis (Chang et al., 2017). MMP-2 and -9 are specifically active during tumour invasion through the degradation of collagen IV and therefore of the basement membrane (Deryugina and Quigley, 2006).

Paradoxically, although collagen acts as a barrier to cancer invasion (as seen through the necessity of MMPs to break down the basement membrane during metastasis), increased collagen deposition and cross-linking is actually associated with an increase in metastasis (Lampi and Reinhart-King, 2018; Levental et al., 2009; Walker et al., 2018). Levental et al. (2009) induced collagen crosslinking, which caused ECM stiffening and promoted malignancy in breast cancer (Figure 2). As the most abundant ECM protein, collagen is responsible for changes in ECM stiffness: Lo et al. (2000) showed that an increase in ECM stiffness stimulates cell growth, survival, and migration. This increase in the deposition of collagen (as well as other fibrous ECM proteins) interferes with cellular adhesion and polarity and increases growth factor signalling (Walker et al., 2018).
Collagen cross-linking is performed by enzymes such as lysyl oxidase (LOX), which are produced by cancer-associated fibroblasts (CAFs) and increase collagen deposition, thereby resulting in an increase in ECM density and interstitial pressure (Karagiannis et al., 2012). LOX expression is induced by TGF-β signalling or by hypoxia, both of which are associated with the progression of cancer (Postovit et al., 2008).

It has been established that the ECM and its dysregulation are key components in the processes of tumorigenesis, invasion and metastasis. The stiffening of the ECM is caused by a multitude of aberrant cell signalling and structural processes. One of these is the increase in collagen deposition and crosslinking which 1) triggers signal transduction cascades resulting in cell proliferation and survival; and 2) allows for the breaching of the basement membrane and assists in cell motility. Changes in cell morphology through processes such as EMT and the characteristics of the basement membrane are both firmly linked to the composition and nature of the ECM. This review will now discuss how PXDN, a collagen cross-linker, could be involved in promoting tumour metastasis, just as LOX, another collagen cross-linker, has been shown to promote metastasis by stimulating the above processes.

Fig. 2 The remodelling of the ECM takes place during cancer progression. 1) Cancer cells (in this case epithelial neoplastic cells) replicate and put the basement membrane under strain. 2) The basement membrane bulges under this strain. Enzymes such as LOX cause the linearisation (i.e. cross-linking) of collagen, which is produced by cancer-associated fibroblasts. 3) The integrity of the basement membrane is compromised (by MMPs produced by invading cells) and cancer cells migrate, aided in their migration along the cross-linked collagen. (Walker, Mojares and Del Río Hernández, 2018). MMPs: matrix metalloproteinases; LOX: lysyl oxidase.

1.3 PXDN: a Regulator of Cellular Adhesion

PXDN is a haem-containing peroxidase which has been shown to form hypohalous acids in the presence of hydrogen peroxide (H2O2) produced by nicotinamide adenine dinucleotide phosphate (NADPH) oxidases (NOX enzymes) in the electron transport chain (Cheng et al., 2008). PXDN is distinct from other peroxidase proteins in that it contains ECM domains; these include a leucine rich repeat domain, multiple immunoglobulin I-set domains and a von Willebrand factor type C domain (El-Gebali et al., 2019). Leucine rich repeat domains have both polar and non-polar components and have been shown to facilitate interactions between proteins and lipids as well as protein-protein interactions (Karaulanov et al., 2006). Immunoglobulin domains are associated with the major histocompatibility complex recognised by T-cells for the adaptive immune system, indicating protein-ligand and protein-protein interactions (Chan et al., 2000). Limited research has been performed on the von Willebrand factor type C protein domains; however, Zhang et al. (2007) found them to be associated with cellular processes such as migration, adhesion, and signalling.

Bhave et al. (2012) showed that PXDN catalyses sulfilimine bond formation between lysine and methionine amino acids at the C-termini of two collagen IV protomers, resulting in the formation of the collagen IV hexamer (Figure 3). Hypohalous acids such as hypobromous acid (HOBr) and hypochlorous acid (HOCl) are produced by PXDN as reaction intermediates during H2O2-mediated oxidation. These intermediates react with collagen IV molecules, ultimately resulting in sulfilimine bond formation which stabilises basement membranes (Bhave et al., 2012). The NC1 collagen IV hexamer is composed of two collagen IV monomers and two collagen IV dimers. The protomers of these dimers are linked to one another by a sulfilimine bond (Figure 3).
Mutations in the Drosophila PXDN orthologue and the C. elegans orthologue PXN-2 show similar effects on the basement membrane to mutations in collagen IV (Bhave et al., 2012; Gotenstein et al., 2010). This indicates that the crosslinks formed in collagen IV by PXDN are essential to stabilising and ensuring the integrity of the basement membrane (Bhave et al., 2012). As discussed earlier, although basement membrane integrity is reinforced through the cross-linking of collagen IV, during tumorigenesis increased deposition and crosslinking is counterintuitively associated with tumour progression (Walker et al., 2018).

Fig. 3 The NC1 collagen IV hexamer composed of collagen monomers and dimers. The collagen IV NC1 hexamer found in basement membranes dissociates into sulfilimine cross-linked dimers and uncross-linked monomers when exposed to SDS. (Bhave et al., 2012).

The role of PXDN in basement membrane stabilisation raises the question of whether its dysregulation could contribute to the increased cell motility observed in cancer. EMT is a characteristic of cancer cells which allows for increased cell motility and therefore invasion and metastasis (Hay, 1995). This is because mesenchymal cells can migrate through the ECM, which epithelial cells are unable to do (Hay, 1995). Higher ROS levels have been shown to be present in cancer cells (Cruz-Bermúdez et al., 2019) and this increase in ROS has been linked to EMT. For example, high ROS levels activate the TGF-β signalling pathway (González-Ramos et al., 2012), which stimulates the activation of the mitogen-activated protein kinase (MAPK) pathway in epithelial cells, resulting in EMT (Rhyu et al., 2005). PXDN’s formation of the sulfilimine bond is facilitated by the ROS H2O2, and therefore the increase in ROS levels associated with EMT could also lead to increased PXDN crosslinking activity.
Further, PXDN has been shown to be regulated by the master redox regulator nuclear factor erythroid 2-related factor 2 (Nrf2) (Hanmer and Mavri-Damelin, 2018). Changes in normal expression levels of PXDN have also been directly associated with EMT (Tauber et al., 2010). Cano et al. (2000) showed that the transcription factor Snai1 initiates EMT by 1) suppressing the expression of E-cadherin, which is essential for the maintenance of cellular adhesion, and 2) increasing the expression of mesenchymal markers such as vimentin, fibronectin, and N-cadherin. Sitole and Mavri-Damelin (2018) showed that the Snai1 transcription factor regulates PXDN, which suggests its involvement in EMT. PXDN is involved in the EMT process in various contexts during development, including formation of the neural tube and muscle-epidermal attachment (Gotenstein et al., 2010; Sitole and Mavri-Damelin, 2018; Tindall et al., 2005). PXDN is therefore implicated in EMT, and its increased expression in cancer may be correlated with increased invasive potential and metastasis.

1.4 PXDN, Oxidative Stress and Signalling Pathways Associated with Cancer

High ROS levels cause damage to nucleic acids and cellular structures, which triggers apoptosis (Redza-Dutordoir and Averill-Bates, 2016). Dougan et al. (2019) showed that PXDN regulated oxidative stress and ROS levels and linked these phenomena to a decrease in apoptosis in prostate cancer, proposing PXDN as a biomarker for the disease. They found increased phospho-p53 and Bcl-2-associated X protein (Bax) expression as well as increased fluorescence in the apoptosis-detecting TUNEL assay following PXDN knockdown. Therefore, PXDN regulates apoptosis through the metabolism of ROS, preventing the activation of apoptosis that an excess of ROS would usually induce. Other than the regulation of oxidative stress and the stimulation of EMT, PXDN has been linked to cancer through certain signalling pathways such as the PI3K/AKT pathway.
The PI3K/AKT pathway regulates various downstream effectors including the kinase mammalian target of rapamycin (mTOR), which promotes cell growth (LoRusso, 2016). mTOR and the forkhead box class O (FOXO) transcription factors are both regulated by PI3K/AKT signalling and influence the expression of genes associated with cell proliferation and survival respectively (LoRusso, 2016). Zheng and Liang (2018) found that PXDN knockdown resulted in a significant decrease in the expression of proteins related to the PI3K/AKT pathway in HEY human ovarian carcinoma cells. Examples of these proteins are phosphorylated AKT and PI3K, which showed decreased levels when siRNA was used to knock down PXDN. Therefore, through the PI3K/AKT pathway, PXDN knockdown resulted in a decrease in cell proliferation, migration and invasion in ovarian cancer cells (Zheng and Liang, 2018). Levental et al. (2009) investigated the impact of increased collagen cross-linking on the process of breast tumorigenesis. They found that inducing collagen cross-linking increased the number of focal adhesions, increased PI3K signalling, caused ECM stiffening and initiated the invasion of an epithelium, supporting the link between the PI3K pathway and collagen cross-linking.

1.5 Dysregulated PXDN Expression in Cancer

The PXDN gene is found on the reverse strand of chromosome 2 at position 2p25.3 and is approximately 100 kbp in length (Hunt et al., 2018) with a promoter of about 10100 bp (Kent et al., 2002). As previously discussed, Snai1 and Nrf2 are transcription factors that regulate PXDN and have been shown to regulate EMT and redox processes respectively (Hanmer and Mavri-Damelin, 2018; Sitole and Mavri-Damelin, 2018). According to data compiled by the Human Protein Atlas, PXDN shows particularly high expression in female tissues; of these, breast tissue shows the highest expression of PXDN (Figure 4). Sigurdardottir et al.
(2021) conducted immunohistochemistry and reverse transcriptase polymerase chain reaction (RT-PCR) experiments and found that PXDN is expressed in epithelial cells, fibroblasts, and endothelial cells within the mammary gland. Therefore, PXDN shows particularly high expression in breast tissue and dysregulation could contribute to cancer progression.

PXDN shows altered expression levels in various types of cancer, as shown in Figure 5, which was compiled from RNA-seq data from The Cancer Genome Atlas (TCGA) project and reports expression as fragments per kilobase of exon per million reads (FPKM) (Uhlen et al., 2017). The highest levels of PXDN expression in Figure 5 are within testis, breast and ovarian cancer, which supports the hypothesis that it could be involved in disease progression. Indeed, various studies have found elevated levels of PXDN expression in cancers including ovarian, prostate, bladder, breast and oesophageal cancer (Cai et al., 2018; Di et al., 2019; Dougan et al., 2019; Sigurdardottir et al., 2021; Zheng and Liang, 2018). A pan-cancer analysis of PXDN was conducted by Zhou et al. (2022); gene expression and clinical data from 33 tumour types were downloaded from TCGA and compared to their normal counterparts. PXDN expression was closely correlated to overall survival and acted as a risk factor in several cancer types including bladder urothelial carcinoma (BLCA), squamous cell carcinoma and stomach adenocarcinoma (STAD).

Fig. 4 Normalised PXDN expression data from the Human Protein Atlas. PXDN shows highest expression in female tissues and of these breast tissue expresses the highest level of PXDN (Sigurdardottir et al., 2021).

Figure 6 shows PXDN as an unfavourable prognostic marker in endometrial, cervical and stomach cancer. Generally, a gene is classified as an unfavourable prognostic gene if it has an FPKM value higher than one and its high expression is correlated with low patient survival.
These Kaplan-Meier plots show high PXDN expression correlated with low levels of patient survival (Uhlen et al., 2017). Zhou et al. (2022) also found a significant correlation between later cancer stages and PXDN expression in various tumours including BLCA, colon adenocarcinoma (COAD), testicular germ cell tumours (TGCT), uterine corpus endometrial carcinoma (UCEC) and uveal melanoma (UVM). Supporting these findings, Dougan et al. (2019) and Zheng and Liang (2018) found that increased PXDN expression correlated with a more advanced stage of cancer in prostate and ovarian cancer respectively.

Fig. 5 PXDN expression in 17 cancer types determined by RNA-Seq data from TCGA project. Highest expression levels are observed in testis cancer (median: 18.7 FPKM), breast cancer (median: 16.4 FPKM) and ovarian cancer (median: 12.6 FPKM). Expression levels are determined by analysing RNA-Seq data from TCGA project; this graph was compiled by the Human Protein Atlas (proteinatlas.org).

1.6 DNA Methylation and Cancer

Epigenetics describes the inheritance of phenotypes caused by chemical modifications to DNA and histones which impact gene expression without altering DNA sequence. One of the ways in which epigenetic regulation takes place is through histone modifications such as methylation, acetylation and phosphorylation (Klug et al., 2015). Histone methylation occurs on the lysine and arginine amino acid residues in histone proteins. The function and type of histone methylation depends on the location and context; for example, H3K9me3 modifications are seen in active genes but have also been associated with repressive effects within constitutive heterochromatin (Michalak et al., 2019). RNA molecules such as microRNAs (miRNAs) and long noncoding RNAs (lncRNAs) are involved in gene expression and silencing, and these RNA molecules play key roles in gene regulation within the epigenome (Klug et al., 2015).
One of the most well-studied forms of epigenetic modification is DNA methylation. Cytosine bases which precede guanine nucleotides (CpG sites) are targeted for methylation by DNA methyltransferases (DNMTs), which add a methyl group to the fifth carbon of the cytosine base, forming 5-methylcytosine (5mC) (Moore et al., 2013). Ten-eleven translocation (TET) methylcytosine dioxygenases execute demethylation by oxidising 5mC to 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC) (Ito et al., 2011). DNA methylation plays a crucial role in gene regulation and is involved in various processes including X-chromosome inactivation (Mohandas et al., 1981), imprinting (Reik et al., 1987) and the maintenance of tissue-specific expression patterns.

Fig. 6 High PXDN expression is associated with unfavourable prognosis in endometrial, cervical and stomach cancer. These Kaplan-Meier survival plots were determined using RNA-seq PXDN expression data and patient survival rates. An unfavourable prognostic marker is defined by an FPKM value greater than 1 correlated with decreased patient survival. This data was compiled by the Human Protein Atlas (proteinatlas.org).

Within gene promoters, CpG methylation inhibits transcription by sterically hindering transcription factor binding and by recruiting methyl-CpG-binding proteins which bind to methyl groups. These methyl-CpG-binding proteins interact with histone deacetylase to bring about a less open chromatin configuration and therefore further suppress expression (Zhu et al., 2016). However, where DNA methylation occurs outside gene promoters, it has become clear that the mechanisms behind this epigenetic alteration are more complex than was once believed and its effect can vary depending on location and context. DNA methylation originated in bacteria and is present in many eukaryotes (Zemach et al., 2010).
Despite being an integral component of the genomes of many organisms, the 5mC modification is not found in some eukaryotes such as the model organism Drosophila melanogaster (Greenberg and Bourc’his, 2019). 5mC is spontaneously deaminated to thymine, which causes G·T mismatches; these mutations are frequently seen in tumours, which could explain why the 5mC epigenetic regulatory mechanism has been lost from certain evolutionary lineages (Holliday and Grigg, 1993). Indeed, mammals, which do exhibit DNA methylation, have a significantly lower CpG content than expected. This may be the reason why most CpG sites within the human genome are methylated; however, between 20% and 30% are not methylated and form clusters, generally around 1 kb in size, referred to as CpG islands (CGIs) (Gardiner-Garden and Frommer, 1987). The majority of CGIs can be found within gene regulatory elements (Li and Zhang, 2014) and are located in about two thirds of the promoters within the human genome, where they are often unmethylated (Bird et al., 1985; Gardiner-Garden and Frommer, 1987). However, where CpG sites occur within gene bodies, they are mostly methylated and assist with transcriptional elongation and even influence splicing (Baylin and Jones, 2016; Jones, 2012; Kulis et al., 2012; Zafon et al., 2019). Therefore, DNA methylation is not always associated with transcriptional repression; de Almeida et al. (2019) investigated a cohort of CpG sites and found that 206 CpG sites showed differences in methylation which correlated to changes in expression within 169 genes between normal and breast tumour tissue. Furthermore, they found a negative correlation between CpG methylation levels and gene expression for CpG sites within gene promoters, whilst they identified a positive correlation between hypomethylated CpG sites and gene expression within gene bodies.
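The CpG depletion and CpG island concepts discussed above can be made concrete with a short calculation. The following Python sketch computes the observed/expected CpG ratio of a sequence and applies a simple island test. It is an illustration only and was not part of the analysis performed in this study: the function names are hypothetical, and the thresholds (length of at least 200 bp, GC content above 50% and an observed/expected ratio above 0.6) are the commonly quoted Gardiner-Garden and Frommer (1987) criteria, stated here as an assumption because they are not listed in this review.

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (number of CpG * N) / (number of C * number of G)."""
    seq = seq.upper()
    n = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")  # CpG dinucleotides cannot overlap one another
    if c_count == 0 or g_count == 0:
        return 0.0
    return cpg_count * n / (c_count * g_count)


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_oe=0.6):
    """Apply the assumed length, GC content and observed/expected thresholds."""
    seq = seq.upper()
    if len(seq) < min_len:
        return False
    gc_fraction = (seq.count("G") + seq.count("C")) / len(seq)
    return gc_fraction > min_gc and cpg_obs_exp(seq) > min_oe


if __name__ == "__main__":
    cpg_rich = "CG" * 100   # toy 200 bp sequence made only of CpG dinucleotides
    cpg_poor = "AT" * 100   # toy 200 bp sequence with no C or G at all
    print(cpg_obs_exp(cpg_rich), looks_like_cpg_island(cpg_rich))
    print(cpg_obs_exp(cpg_poor), looks_like_cpg_island(cpg_poor))
```

A ratio well below 1 across bulk genomic sequence reflects the CpG depletion described above, whereas promoter-associated CGIs are the regions in which the ratio stays high.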
There are certain patterns of cytosine methylation which are strongly associated with the process of tumorigenesis, and the methylation of tumorigenic cells can be distinguished from that of non-tumorigenic cells in three different ways (Baylin and Jones, 2016). The first of these is what has been referred to as global hypomethylation; the majority of this hypomethylation associated with cancer occurs within transposable elements (TEs), which make up ~50% of the human genome. TEs generally have a high CpG content and are most commonly methylated; demethylation of TEs is associated with genomic instability and cancer (Bae et al., 2012; Kim et al., 2012; Wolff et al., 2010). The hypomethylation of the promoters of oncogenes constitutes a much smaller component of the global hypomethylation observed within cancer cells. For example, the hypomethylation of SMYD3 (a gene encoding histone lysine methyltransferase) and the Erb-A1 oncogenes has been linked to colorectal cancer and lymphatic leukaemia respectively (Li et al., 2018; Lipsanen et al., 1988), and the hypomethylation of the thiosulfate sulfurtransferase-like domain containing 1 (TSTD1) gene was associated with poor overall survival and poor treatment response in breast cancer (Ansar et al., 2022). Another observation regarding DNA hypomethylation which is relevant to cancer is that CpGs within gene bodies seem to be hypomethylated in tumorigenic cells (Jones, 2012; Kulis et al., 2012). For example, Kulis et al. (2012) found that hypomethylation of gene bodies and enhancers was a characteristic in the differentiation of B-cells in chronic lymphocytic leukaemia, which means that DNA methylation plays a role beyond that of regulating transcription factor binding sites within gene promoters.

Secondly, the hypermethylation of the CGIs within the promoters of TSGs has long been known to be associated with cancer progression. As discussed earlier, roughly 60% of gene promoters contain CGIs which are largely unmethylated.
The cytosine methylation of the promoters of genes with tumour suppressor functions allows carcinogenesis to take place uninhibited (Herman et al., 1995; Melki et al., 1999). The hypermethylation of the TSG BRCA1 is known to be involved in the progression of breast cancer (Esteller et al., 2000). Paydar et al. (2019) found that BRCA1 promoter hypermethylation is strongly associated with poor patient prognosis. Other studies have confirmed this (Gupta et al., 2014; Iwamoto et al., 2011), and the drug resveratrol was successfully used to prevent the hypermethylation of the BRCA1 promoter in MCF-7 cells (luminal A subtype breast cancer cells) (Papoutsis et al., 2012). Other TSGs which display promoter hypermethylation and a decrease in expression in breast cancer are ATM (Brennan et al., 2012; Flanagan et al., 2009), TMS1 (target for methylation-mediated silencing) (Parsons and Vertino, 2006), and the gene coding for E-cadherin (CDH1) (Shargh et al., 2014). Moelans et al. (2011) found that the promoters of paired box 6 (PAX6), BRCA2, paired box 5 (PAX5), WT1, Cadherin-13 (CDH13) and MutS homolog 6 (MSH6) were hypermethylated in both ductal carcinoma in situ (DCIS) and invasive ductal cancer (IDC) samples which correlate to early breast cancer. Therefore, the hypermethylation of CGIs is highly relevant to breast cancer research, particularly where this phenomenon occurs within genes that carry out tumour suppressor functions.

Thirdly, 5mCs are highly mutagenic. Within gene bodies, CpG sites are highly methylated and particularly concentrated within exons (Baylin and Jones, 2016; Jones, 2012). Studies have shown that within gene bodies, DNA methylation actually has a positive correlation with transcription and is highly conserved (Varley et al., 2013; Yang et al., 2014). The deamination of 5mC to thymine is prevalent in causing loss-of-function mutations within TSGs. For example, Rideout et al.
(1990) showed that this phenomenon occurs within the tumour suppressor gene TP53, and Greenblatt et al. (1994) confirmed that more than 50% of the mutations within the same gene in colorectal cancer occurred at 5mC sites. In addition to 5mC deamination, 5mC has been shown to form carcinogenic adducts when exposed to benzo(a)pyrene (BaP) (Bukowska and Sicińska, 2021; Greenblatt et al., 1994). BaP is one of the carcinogenic compounds found within cigarette smoke, and the adducts formed between the metabolites of BaP and 5mC have been linked to increased lung cancer mutations in smokers (Baylin and Jones, 2016; Bukowska and Sicińska, 2021; Greenblatt et al., 1994).

The integral role that these changes in DNA methylation patterns play in the process of tumorigenesis has naturally led to the investigation of drugs which have the ability to alter DNA methylation levels. DNMT inhibitors (DNMTi) such as azacytidine and decitabine cause DNA hypomethylation and have been approved for use during the treatment of acute myeloid leukaemia (AML) and myelodysplastic syndrome (MDS) (Wouters and Delwel, 2016). These same drugs have been investigated for the treatment of TNBC (Singh et al., 2021; Wong, 2021; Yu et al., 2018). DNMTi drugs have been most efficacious when used in conjunction with other forms of treatment; for example, Mirza et al. (2010) found that the proliferation of the TNBC cell line MDA-MB-231 was inhibited using a DNMTi and paclitaxel, Adriamycin or 5-fluorouracil (common anticancer drugs used for the treatment of breast cancer). Another form of epigenetic treatment which is under investigation is a group of drugs known as histone deacetylase inhibitors (HDACi). Histone deacetylation is a histone modification which is often found in hypermethylated gene promoters, whilst histone acetylation promotes a more open chromatin configuration and therefore increased gene expression.
Therefore, this group of drugs is worth noting when investigating the effect of TSG repression in cancer development. However, Connolly et al. (2021) studied the use of the HDACi entinostat in endocrine-resistant breast cancer and found that it did not improve patient survival in hormone receptor-positive, HER2-negative advanced breast cancer.

It is clear that changes in DNA methylation patterns are integral to the process of tumorigenesis. However, depending on the position of the CpG site, alterations in methylation level can have varying impacts on expression. Apart from opening up a realm of potential targets for cancer treatment, the study of DNA methylation has also raised the concept of methylation-based biomarkers. N-Myc downstream-regulated gene 4 (NDRG4), bone morphogenetic protein (BMP) and Septin-9 (SEPT9) are all methylation-based biomarkers implemented clinically for the early detection of colorectal cancer (Koch et al., 2018). Short stature homeobox 2 (SHOX2) and Glutathione S-Transferase Pi 1 (GSTP1) are methylation-based biomarkers currently used for the early detection of lung cancer and for the diagnosis and prognosis prediction of prostate cancer respectively (Koch et al., 2018). Similarly, methylation patterns have been used to screen patients for breast cancer and calculate the risk of breast cancer development (Tang et al., 2016). Methylation-based markers for breast cancer have been investigated for the purpose of potentially improving patient prognosis by early disease detection, particularly in detecting recurrence of breast cancer for patients in remission (Tang et al., 2016). Case-control studies to investigate gene-specific hyper- or hypomethylation of promoter regions in breast cancer cases have been performed for BRCA1 and ATM. Therefore, particularly in the screening process, analysis of promoter methylation levels could help with risk stratification (Tang et al., 2016). de Almeida et al.
(2019) performed a bioinformatic analysis on TCGA data and investigated 780 breast tumour samples and 83 normal tissue samples. They investigated CpG sites which were differentially methylated in normal and breast tumour tissue and found sites within zinc finger protein 154 (ZNF154) and homeobox D9 (HOXD9) which were hypermethylated within breast tumour tissue and correlated to poor prognosis. Stirzaker et al. (2015) analysed the prognostic value of CpG clusters in TNBC and identified 36 TNBC-specific hypermethylated regions occurring within gene bodies and promoters. They found that many of these hypermethylated sites specific to TNBC occurred within genes which code for transcription factors and zinc fingers, including ZNF154 and ZNF671. Both Stirzaker et al. (2015) and de Almeida et al. (2019) found that the majority of the genes that were hypermethylated within breast cancer tumours were transcription factors and homeobox genes, and de Almeida et al. showed that hypomethylated regions coded for transmembrane proteins and immunoglobulins.

Apart from the three above-mentioned examples of histone and DNA methylation-regulated ECM genes which have been connected to the progression of various forms of cancer, Koch et al. (2018) identified 1800 unique DNA methylation-based cancer biomarkers. It is clear that DNA methylation in particular has been considered a promising field of research within the field of prognostic and diagnostic cancer biomarkers. The question this study aims to address is whether the collagen cross-linker PXDN holds the potential to act as a clinical biomarker for breast cancer.

1.7 PXDN Promoter Methylation: A Potential Prognostic Marker?

As discussed previously, PXDN expression is higher in various types of cancer and is an unfavourable prognostic marker. The collagen cross-linking role of PXDN is associated with matrix stiffening and EMT, which supports the observed association between increased expression and poor prognosis.
The mechanism behind increased PXDN expression levels remains unknown; however, it is possible that increased expression is caused by hypomethylation of the CpG island within the PXDN promoter or hypermethylation of CpG sites within the PXDN gene body. If this is the case, it is possible that PXDN methylation levels could act as a biomarker. Whether or not demethylation of the PXDN promoter can account for increased expression levels of PXDN observed in breast cancer remains to be seen. This study aims to answer these questions and proposes the methylation status of the PXDN promoter as a potential prognostic biomarker for breast cancer, if DNA methylation is indeed one of the causes of the changes to PXDN expression levels observed in breast cancer in particular. As discussed in the previous section, breast cancer patient prognosis improves significantly in cases of early detection; therefore the continual discovery of new biomarkers which improve the screening process and risk determination is crucial (Tang et al., 2016). If PXDN does prove to be a successful prognostic marker for breast cancer patients, it stands to act as a significant tool within the realm of breast cancer treatment programs. Koch et al. (2018) make a noteworthy point in their paper on DNA methylation biomarkers: the location of CpG sites within the promoter of genes is of great importance because some sites are more significant in the silencing of expression than others. This is an important point to keep in mind whilst considering this research question.

1.8 Laboratory Methods for Analysing DNA Methylation

There are various methods which are used for the investigation of CpG methylation patterns in DNA. Factors such as cost, accuracy, sensitivity and time are important points of consideration when choosing a method for DNA methylation analysis.
Another key consideration is the nature of the analysis to be performed; genome-wide versus site-specific DNA methylation patterns require distinct approaches.

The gold standard of DNA methylation analysis is the method developed by Frommer et al. (1992), who showed that sodium bisulfite treatment provides a way to differentiate between C and 5mC at CpG sites. Taq DNA polymerase is unable to make this distinction and cannot incorporate methyl groups into the extending strand. Therefore the amplicons of standard PCR reactions lack the epigenetic markers of the original sequence. Sodium bisulfite treatment of DNA results in the deamination of unmethylated cytosine nucleotides. This allows for a distinction to be made between methylated and unmethylated cytosine bases, as the former will remain cytosine whereas the latter will have been converted to uracil by the deamination process. The bisulfite-modified DNA is then amplified through PCR and sequenced. Comparison of this sequence to the original pre-treated template sequence reveals the positions of methylated cytosines.

Since the development of the bisulfite treatment approach for DNA methylation analysis, various PCR-based methods have been conceived which utilise this same principle. Two distinct approaches have emerged. The first group of methods involves primers that amplify the template bisulfite-modified DNA regardless of methylation status. Examples of this approach are the bisulfite sequencing method used by Frommer et al. (1992), as discussed previously, and high-resolution melting (Worm et al., 2001). High-resolution melting (HRM) can determine the methylation level of amplicons by analysing the melting curve of a PCR reaction with a bisulfite-converted DNA template (Worm et al., 2001). After bisulfite treatment and subsequent PCR, a methylated template will have a higher GC content than an unmethylated one.
Therefore, these differences in base composition allow for a distinction to be made between unmethylated and methylated templates due to the stronger nature of the guanine/cytosine bond when compared to the adenine/thymine bond. This difference allows the methylated CpG content to be determined by the amount of energy required to denature the double stranded DNA (dsDNA). During the initial PCR reaction, a dsDNA-intercalating fluorescent dye is incorporated and fluoresces when bound to dsDNA. The degree of fluorescence can be measured, which allows the amplification process to be followed using a real-time PCR machine. PCR bias results in the preferential amplification of the unmethylated template due to differences in base composition and needs to be taken into consideration when performing this method (Warnecke et al., 1997). However, this hurdle can be addressed through primer design; Wojdacz and Hansen (2006) show that including a limited number of CpG sites within the primer compensates for this bias by increasing amplification of the methylated sequence.

The second group of methods involves the use of different sets of primers for methylated and unmethylated converted sequences. This primer design approach is used in one of the more basic forms of DNA methylation analysis: methylation-specific PCR (Herman et al., 1996). In methylation-specific PCR, DNA is amplified by two sets of primers following bisulfite treatment, with one primer set specific for methylated DNA and one for unmethylated DNA. The primer set that produces a band, as visualised by agarose gel electrophoresis, indicates the methylation state of the investigated region.

Primer design can be a challenging component of PCR-based bisulfite methods for DNA methylation analysis. In methods where a single primer set needs to bind to both methylated and unmethylated bisulfite-modified DNA, high CpG content is problematic.
This is because CpG dinucleotides will cause different sequences to be present in unmethylated and methylated DNA after bisulfite treatment. Therefore, it is recommended that no CpG sites are included in the primers where they can be avoided (Clark et al., 1994). This is highly problematic for regions of the genome such as gene promoter regions, which are typically GC rich, especially near transcription start sites. For example, the PXDN promoter contains a CpG island which overlaps the transcription start site, and as such designing CpG-free primers for this region is not possible (see Figures 7 and 8). Therefore, methods such as methylation-specific PCR, which have different sets of primers for methylated and unmethylated converted sequences, can be helpful because they make it possible to analyse these GC-rich regions of the genome. However, it is important to note that a major problem with this approach is incomplete bisulfite conversion. Another difficulty with this method is recognising sequences which are partially methylated.

Next generation sequencing-based methods exist for the analysis of DNA methylation. One example is bisulfite pyrosequencing, which involves the sequencing of bisulfite-modified DNA by measuring the luminescence caused by a luciferase reaction (York et al., 2012). The luciferase reaction is triggered by pyrophosphate release when a deoxynucleotide is incorporated during a DNA synthesis reaction by DNA polymerase. Nanopore sequencing is a sequencing method which does not require prior bisulfite treatment. In this method, ssDNA is passed through a membrane-bound protein and changes in charge indicate which base is present. This technology is able to differentiate between cytosine and 5mC without any treatment of DNA prior to sequencing (Simpson et al., 2017). These sequencing methods are highly accurate and efficient; however, they are very expensive.
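The bisulfite conversion principle described earlier in this section can be illustrated in silico. The following is a minimal sketch only: the template sequence and the function names are hypothetical, only the top strand is modelled, and conversion is assumed to be complete, which real bisulfite treatment does not guarantee. It shows why a fully methylated template retains a higher GC content after conversion and PCR than an unmethylated one, which is the property exploited by HRM.

```python
def cpg_cytosines(seq):
    """Return the 0-based positions of cytosines that sit in a CpG dinucleotide."""
    seq = seq.upper()
    return {i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"}


def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Top-strand sequence as read after bisulfite treatment and PCR.

    Unmethylated C is deaminated to U and amplified as T;
    a C at a methylated position is protected and stays C.
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


# Hypothetical 13 bp template, not taken from the PXDN promoter.
template = "ACGTCCGGATCGA"
methylated = bisulfite_convert(template, cpg_cytosines(template))
unmethylated = bisulfite_convert(template)
print(methylated, round(gc_fraction(methylated), 2))
print(unmethylated, round(gc_fraction(unmethylated), 2))
```

Comparing either converted sequence with the original template recovers the protected positions, which is the comparison made in bisulfite sequencing.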
The PCR bisulfite-based methods for DNA methylation analysis are cost effective and relatively simple, however there are a few drawbacks to the sodium bisulfite treatment approach. Bisulfite treatment is harsh and results in DNA fragmentation, which means that a large amount of DNA is required for the analysis since much of it is damaged during the treatment (Miura et al., 2012). Ultimate conversion of unmethylated cytosines to thymine means that the complexity of the region of interest is significantly reduced because only three bases are present instead of four. This phenomenon, combined with the fragmentation of DNA caused by bisulfite treatment means that it is often not possible to amplify long fragments (Kurdyukov and Bullock, 2016). Incomplete bisulfite conversion is another problem which is encountered with bisulfite-based methods since false positive results of CpG methylation can occur. If a sequencing method is being used for the analysis then this can be mitigated by identifying cytosine nucleotides which are not within CpG sites, therefore possibly indicating incomplete bisulfite conversion. However, this approach is not possible in methods such as methylation-specific PCR or HRM that do not give results at nucleotide or CpG resolution (Hernández et al., 2013). The first method developed for the study of DNA methylation was based on the use of restriction endonucleases with selective digestion patterns (Cedar et al., 1979). Methylation sensitive restriction enzymes such as HpaII do not cleave their recognition site (5’-CCGG-3’) in the presence of 5mC, whereas MspI is not sensitive to methylation, has the same recognition site as HpaII and will cleave irrespective of cytosine methylation. Methylation-sensitive PCR (MS PCR), not to be confused with the sodium bisulfite treatment-based method methylation specific PCR, is simple and cost-effective. 
This method avoids the complications which accompany bisulfite conversion and can be performed in a regular thermocycler. Other early research on DNA methylation of specific regions of the genome using the MS PCR method was performed by Singer et al. (1979), who digested mouse liver DNA with HpaII and MspI; and Singer-Sam et al. (1990), who performed PCR to quantify methylation levels of mouse spleen DNA after digestion with HpaII to amplify the 5’ end of the X-linked phosphoglycerate kinase gene. Both these studies made use of agarose gel electrophoresis to assess PCR amplification, which indicated methylation levels of the regions they were investigating. These methods of studying methylation are only able to investigate methylation levels at the specific enzyme recognition sites, and false positives are a problem due to incomplete digestion by restriction enzymes (Yegnasubramanian et al., 2006).

The high GC content of the PXDN promoter limits the choice of methods for DNA methylation analysis. This is because it is virtually impossible to design primers that do not contain CpG sites, especially around the TSS, which is a region of interest (see Figures 7 and 8). It is possible that the HRM method could be used for this purpose because primers containing CpG sites can be used. Methylation-specific PCR is also an option for such a GC-rich region because different primers can be used for the analysis of methylated and unmethylated templates. However, both of these methods require bisulfite conversion, which adds a significant layer of complexity to the analysis and troubleshooting process, as issues such as PCR bias, incomplete conversion and false positives need to be taken into account. It is clear that, in terms of efficiency and accuracy, next-generation sequencing-based methods such as nanopore technology are powerful tools when it comes to methylation analysis; however, their cost makes them relatively inaccessible.
The simplicity and cost-effective nature of the MS PCR method, as well as the ability of this method to analyse GC-rich regions, make it a good choice for initial investigations into the DNA methylation levels of the PXDN promoter.

1.9 Aims and Objectives

This study aimed to investigate whether methylation within the PXDN promoter could contribute to differences in expression levels of PXDN observed in breast cancer cell lines. All objectives were performed on the luminal A breast cancer cell line MCF-7, the TNBC cell line MDA-MB-231 and the human embryonic kidney cell line HEK-293. This last cell line was included because previous studies in our laboratory had shown PXDN expression within this cell line, and at the time of the commencement of this study there was no data available for PXDN expression levels within the breast cancer cell lines. Therefore the HEK-293 cell line was used as a positive control for PXDN expression. The objectives were to:

1. Examine PXDN protein expression levels in invasive MDA-MB-231 and non-invasive MCF-7 breast cancer cell lines, using immunofluorescence microscopy.
2. Analyse and compare PXDN promoter CpG methylation levels of the invasive and non-invasive breast cancer cell lines through MS PCR.

2 Methods and Materials

2.1 Immunofluorescence Microscopy

Fluorescently tagged antibodies allow for the study of protein localisation, expression and activity through the tagging of specific targets (Giepmans et al., 2006). When light of the correct wavelength is provided to the tagged antibody-bound proteins, the atoms within the fluorescent tag are excited and (if the protein is present and antibody-bound) will fluoresce to indicate the presence of the protein. Therefore the protein can be quantified and located in this way.
For each cell line there were two control samples: one treated only with the secondary antibody to indicate non-specific binding of the secondary antibody, and the other with no antibodies at all to indicate background fluorescence of the cells. The experimental samples were treated with both the primary and the secondary antibodies.

2.1.1 Cell Culture

All cell culture reagents are Gibco (ThermoFisher Scientific, Waltham, Massachusetts, USA), unless otherwise specified. MCF-7 (ATCC-CRL-3435) ductal breast cancer cells were kindly provided by Dr Kutlwano Xulu and Prof Tanya Augustine, who had recently acquired them from the American Type Culture Collection (ATCC). The MDA-MB-231 (ATCC-HTB-26) adenocarcinoma breast cancer cells were kindly provided by Prof. Mandeep Kaur, University of the Witwatersrand. The human embryonic kidney HEK-293 (ATCC-CRL-1573) cells were a gift from Dr Clement Penny, University of the Witwatersrand. All cells were cultured in a medium composed of 1:1 Dulbecco’s Modified Eagle Medium (DMEM) to Ham’s F12 supplemented with 10% foetal bovine serum (FBS), 1% penicillin-streptomycin and 10 mM N-2-hydroxyethylpiperazine-N-2-ethane sulfonic acid (HEPES) buffer and incubated at 37℃ with 5% CO2. Cells were used at a passage number between 7 and 50 in line with guidelines from the European Collection of Authenticated Cell Cultures (ECACC) (https://www.culturecollections.org.uk/). Cells were split when a confluency of 70% to 90% was reached. Media was discarded and cells were rinsed three times with 1 x phosphate buffered saline (PBS), then detached by adding 900 µL of trypsin-EDTA and 300 µL 1 x PBS and incubating for nine minutes (MCF-7 cells); 300 µL trypsin-EDTA and 700 µL 1 x PBS and incubating for three minutes (HEK-293 cells); or 1.5 mL TrypLE™ Express, which is more gentle on cells, and incubating for three minutes (MDA-MB-231).
After incubation, trypsin-EDTA/TrypLE™ Express was deactivated with an equal volume of cell culture media and the appropriate volume (depending on confluency) of cell suspension was added back into the culture dish with 8 mL fresh media. The approximate seeding density used was 1 × 10⁴ cells/cm².

2.1.2 Fixing and Permeabilization of cells

Depending on the cell line, 80 000 to 120 000 cells were seeded onto four separate autoclaved glass coverslips in a six well plate and incubated at 37°C for 48 hours. For cell treatment, cells were treated with 10 nM β-Oestradiol and incubated for 24 hours. All wash steps were performed with 2 mL of 1X PBS. Culture medium was removed and cells were washed once, followed by fixation for 10 minutes at room temperature in 2 mL of 3% formaldehyde (Merck, Munich, Germany) diluted with 1X PBS, followed by three washes. Cells were then permeabilised with 0.1% Triton X-100 (Merck, Munich, Germany) for seven minutes at room temperature and then washed three times.

2.1.3 Immunostaining: Primary Antibody

Cells were blocked with 2 mL of 1% bovine serum albumin (BSA) (Glentham Life Sciences, Corsham, United Kingdom) blocking solution for one hour to prevent non-specific binding of the primary antibody. The primary monoclonal antibody, mouse anti-PXDN (Santa Cruz Biotechnology, Dallas, Texas, USA), was added to two of the four coverslips at a 1:50 dilution in 1% BSA. The datasheet for this antibody suggested a primary antibody dilution of 1:200; however, the fluorescence at this dilution was very faint. Therefore, the concentration of the primary antibody was increased by using a 1:50 dilution. Only 1% BSA blocking solution, without any antibody, was added to the other coverslips. Cells were then incubated for one hour at 37℃.

2.1.4 Immunostaining: Secondary Antibody and DAPI

The primary antibody solution was removed and cells washed five times (from this moment on all steps were performed in the dark).
A 1:200 dilution of the goat anti-mouse Kappa-FITC secondary antibody (SouthernBiotech, Birmingham, Alabama, USA) was added in 1% BSA to the relevant coverslips. A secondary antibody dilution of 1:500 was tried initially, as suggested by the antibody product sheet; however, increasing the concentration to a 1:200 dilution gave better images and no non-specific binding was seen in the controls. Only 1% BSA was added to the no antibody control coverslip. Cells were incubated in the dark for one hour at 37 ℃ followed by five washes. Cell nuclei were counterstained for five minutes with 0.1 µg/ml 4′,6-diamidino-2-phenylindole (DAPI) (Merck, Munich, Germany), which binds to adenine- and thymine-rich DNA and therefore stains the nucleus of cells, which aids during visualisation. After staining with DAPI, cells were washed three times.

2.1.5 Mounting and Visualisation

Coverslips were mounted face down onto microscope slides with Fluoromount solution (Merck, Munich, Germany) and allowed to dry in the dark at room temperature for two hours. After this, cells can be stored in the dark at 4 ℃. Visualisation of cells was performed with the Olympus BX63 OFM microscope (Olympus, Tokyo, Japan). The FITC was excited at 490 nm and DAPI at 350 nm. The images were all taken using the 60X oil-immersion objective lens.

2.1.6 Image Analysis and Expression Quantification

One image from each treatment group was chosen per cell line. The Fiji ImageJ software version 9.2.0 was used to take fluorescence intensity measurements from three randomly chosen cells within each image. For each cell, three corresponding background readings were taken. The corrected total cell fluorescence (CTCF) values were calculated using the following equation:

CTCF = integrated density (cell) - (area of selected cell x mean fluorescence of background readings)

2.1.7 Statistical Analysis

Statistical analysis of the CTCF values was performed using the GraphPad Prism software version 9.5.1.
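As an illustration only, the CTCF equation in section 2.1.6 and the test statistic behind a Welch-corrected t-test can be sketched in a few lines of Python. The analysis in this study was performed in GraphPad Prism, not with this code. The function names and all numerical values below are hypothetical, and the sketch stops at the t statistic and the Welch-Satterthwaite degrees of freedom without computing a p-value.

```python
from math import sqrt
from statistics import mean, variance


def ctcf(integrated_density, cell_area, background_means):
    """Corrected total cell fluorescence for one cell.

    CTCF = integrated density - (cell area x mean of the background readings)
    """
    return integrated_density - cell_area * mean(background_means)


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Uses sample variances and does not assume equal variance between groups.
    """
    va = variance(sample_a) / len(sample_a)
    vb = variance(sample_b) / len(sample_b)
    t = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (
        va ** 2 / (len(sample_a) - 1) + vb ** 2 / (len(sample_b) - 1)
    )
    return t, df


# Hypothetical measurements for three cells per group:
# (integrated density, cell area, three background mean intensities)
group_a = [ctcf(52000, 1000, [10, 12, 14]),
           ctcf(61000, 1100, [11, 12, 13]),
           ctcf(57000, 1050, [9, 12, 15])]
group_b = [ctcf(30000, 1000, [10, 12, 14]),
           ctcf(33000, 1020, [11, 12, 13]),
           ctcf(28000, 980, [9, 12, 15])]
t_stat, dof = welch_t(group_a, group_b)
print(group_a, group_b, t_stat, dof)
```

With only three cells per group the degrees of freedom are small, so the choice of the Welch correction rather than a pooled variance has a noticeable effect on the result.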
An unpaired, parametric t-test with Welch’s correction was performed to identify significance between mean CTCF values. Mean CTCF values were plotted on a graph and error bars calculated using standard deviation from the mean.

2.2 Methylation Sensitive PCR

This protocol involves DNA extraction followed by restriction with either a methylation insensitive (MspI) or methylation sensitive (HpaII) restriction enzyme. PCR was used to amplify across sites to determine if they had been restricted or not, indicated by absence or presence of a band in an agarose electrophoresis gel. This allowed for inference on the methylation status of that region to be made. The promoter of PXDN falls within a CGI, a region of high GC% with a high concentration of CpG sites, and therefore the MS PCR protocol was considered the best-suited method to achieve this aim. Figure 7 and Table 6 show the GC% values for a region of 1305 bp within the PXDN promoter which was examined for DNA methylation from 1037 bp upstream to 268 bp downstream of the transcription start site (TSS). Four sets of primers were used to investigate the methylation levels of the PXDN promoter region, covering a combined total of 13 MspI/HpaII recognition sites (Figures 7 and 8). All the products for section 2.2 are from New England Biolabs, Ipswich, United Kingdom unless otherwise stated.

2.2.1 Analysing the GC% of the PXDN Promoter and Restriction Enzyme Recognition Sites

Figure 7 shows the GC% within the promoter of PXDN, which is highest between about 200 bp upstream and 100 bp downstream of the TSS. These GC% values informed the primer design process for the MS PCR protocol. Figure 8 shows the positioning of the CCGG MspI/HpaII recognition sites which could potentially be methylated and their position relative to the binding sites for the primer pairs designed. The control amplicon was designed to not contain any CCGG recognition sites.

Fig. 7 The GC% of the PXDN promoter.
The GC content of this region is highest between 208 bp upstream and 98 bp downstream of the TSS. GC% is lowest further upstream of the TSS, from about 750 bp upstream of the TSS. Four amplicons cover a region of 1305 bp. Amplicon 2 begins 95 bp upstream from the TSS and is 370 bp long, containing five MspI/HpaII recognition sites. Amplicon 3 is 471 bp in length, begins 524 bp upstream of the TSS and contains five MspI/HpaII restriction sites. Amplicon 4 begins 935 bp upstream of the TSS, is 476 bp long and contains four MspI/HpaII restriction sites. The control amplicon contains no MspI/HpaII restriction sites; it begins 1037 bp upstream of the TSS and is 243 bp long.

Fig. 8 The locations of CCGG sites with respect to primer pairs and the TSS of the PXDN promoter. MspI/HpaII CCGG recognition sites can be seen in red. The P2A primer pair overlaps with exon 1 and is therefore downstream of the transcription start site. All the other primers are upstream of the 5’ UTR (untranslated region). The control amplicon does not contain any CCGG sites.

2.2.2 DNA Extraction

2.2.2.1 Phenol Chloroform genomic DNA Extraction Method

The phenol chloroform DNA extraction method was used because of the high concentration and purity of high molecular weight genomic DNA the technique consistently produces (Ghaheri et al., 2016; Liu et al., 2022; Torii et al., 2021). All centrifugation steps were performed in a MiniSpin centrifuge (Eppendorf, Hamburg, Germany). Cells were grown to between 80% and 90% confluence, then media was removed and cells were washed three times with 1 x PBS. Using a cell scraper, cells were detached in 1 mL of 1 x PBS and the suspension centrifuged at 12 000 x g. Supernatant was discarded and cells were resuspended in 50 µL of 1 x PBS. RNA was degraded by adding 1 µL of 10 mg/mL RNase A (20 µg/mL final concentration) and then 450 µL of cell lysis buffer (10 mM Tris-HCl pH 8; 100 mM EDTA pH 8; 2% SDS) was added, followed by a one-hour incubation at 37 ℃.
To remove protein, 2.5 µL of 18 mg/mL stock proteinase K was added (100 µg/mL final concentration). This was followed by a further two hours of incubation at 50 ℃ with vortexing every 20 minutes. Phenol:chloroform:isoamyl alcohol (Merck, Munich, Germany) was added in a 25:24:1 ratio so that the total volume in each tube was 1 mL. Samples were centrifuged at 16 000 x g for 25 minutes at room temperature and the upper aqueous phase was removed and placed in a new tube. This step was repeated. Chloroform:isoamyl alcohol (24:1) was added to the second aqueous phase followed by vortexing for 25 minutes at 4 ℃. The upper aqueous phase was removed and placed in a new tube. This step was repeated and then 0.1 volumes of 3 M sodium acetate and 2 volumes of 100% ethanol were added to the fourth aqueous phase. The tube was stored overnight at -20 ℃. DNA was collected by centrifugation at 16 000 x g for 15 minutes at 4 ℃. The pellet was washed with 750 µL of 70% ethanol three times by centrifugation at 16 000 x g for five minutes each at 4 ℃. The pellet was air-dried at room temperature and then resuspended in ddH2O preheated to 60 ℃ to assist with the dissolving of the pellet. DNA concentration was assessed with the NanoDrop spectrophotometer (ThermoFisher Scientific, Waltham, Massachusetts, USA) and quality observed using a 1% agarose electrophoresis gel for 30 minutes at 100 V (method described under 2.2.5). See appendix for an example of DNA extraction results.

2.2.2.2 Genomic DNA Extraction Kit Method

When troubleshooting problems with the MS PCR protocol, it was suspected that inhibitors carried over from the phenol chloroform DNA extraction method may have been present within gDNA samples. Therefore, the GeneJet genomic DNA purification kit (ThermoFisher Scientific, Waltham, Massachusetts, USA) was used to compare PCR results from samples extracted using both methods.
All the reagents in section 2.2.2.2 are from ThermoFisher Scientific, Waltham, Massachusetts, USA unless otherwise specified. Media was removed and cells washed three times with 1 x PBS, then using a cell scraper cells were detached in 1 mL of 1 x PBS and the suspension centrifuged at 250 x g for five minutes. Supernatant was discarded and cells were resuspended in 200 µL of 1 x PBS. A volume of 200 µL of Lysis solution and 20 µL of proteinase K solution were added to cell suspension which was then gently vortexed and incubated at 56 ℃ for 10 minutes. To digest RNA, 20 µL of RNase A solution was added to the tube followed by vortexing and a further 10 minutes incubation period at room temperature. An amount of 400 µL of 50% ethanol was added, the solution was gently mixed by pipetting and then transferred to a GeneJet Genomic DNA Purification Column within a collection tube. The column was centrifuged for one minute at 6000 x g then placed in a new collection tube (the flow-through was discarded). This was followed by the addition of 500 µL of wash buffer I to the column, which was centrifuged at 8000 x g for one minute before discarding the flow through. For the final wash step, 500 µL of wash buffer II was added, followed by a three minute centrifugation at 16 000 x g. The flowthrough was discarded and the column placed in a 1.5 mL nuclease-free microcentrifuge tube. Lastly, 150 µL of ddH2O was added to the column, incubated at room temperature for two minutes and then centrifuged at 8000 x g for one minute to elute the DNA. DNA concentration was assessed with the NanoDrop spectrophotometer (ThermoFisher Scientific, Waltham, Massachusetts, United States) and quality observed using a 1% agarose electrophoresis gel for 30 minutes at 100 V (method described under 2.2.5). See appendix for example of DNA extraction results. 2.2.3 Restriction Digest HpaII and MspI are restriction enzymes which are methylation sensitive and insensitive respectively. 
The restriction digest is performed before PCR because the methyl groups bound to the template will not be replicated along with the template during PCR; therefore the differential digestion by the two different enzymes will only be informative with regard to DNA methylation before amplification. The reaction components are listed below in Table 2. For each cell line, three digest reactions were set up:

1) gDNA restricted with the MspI methylation insensitive restriction enzyme,
2) gDNA restricted by the HpaII methylation sensitive restriction enzyme, and
3) a no enzyme control where gDNA was incubated at the same concentration with the same buffer, but with an extra 1 µL ddH2O added instead of a restriction enzyme.

The digest was incubated at 37 ℃ for one hour and the HpaII enzyme was inactivated by incubation at 80 ℃ for 20 minutes. The MspI restriction enzyme does not require an inactivation step. Agarose gel electrophoresis on a 1% gel for 30 minutes at 100 V was used to check the quality of digest (method described under 2.2.5).

Table 2: Components and volumes of MspI/HpaII restriction digest reactions

Component                               Volume (µL)
gDNA (100 ng/µL)                        7
rCutSmart Buffer                        5
Restriction Enzyme (MspI or HpaII)*     1
ddH2O                                   Up to 50
*20 000 units/mL

2.2.4 PCR

After the restriction digest, PCR was used to amplify the DNA, which revealed the presence or absence of CpG methylation within regions of the PXDN promoter due to differential digestion by the two restriction enzymes depending on the methylation status of the amplified region. Kapa Taq ReadyMix PCR Kit (Merck, Munich, Germany) was the DNA polymerase mastermix used (0.5 U Taq DNA Polymerase, 0.2 mM of each dNTP and 1.5 mM MgCl2 at 1X within a 25 µL reaction). The components of the PCR reaction are detailed in Table 3. For each primer pair and cell line, four PCR reactions were set up: 1) MspI-digested gDNA, 2) HpaII-digested gDNA, 3) no enzyme control gDNA, 4) no template control (NTC).
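The way the four PCR reactions above are read together can be summarised as a small decision table. The sketch below is a simplified illustration, not part of the protocol: the function names and result strings are hypothetical, a band is treated as simply present or absent, and partial methylation or incomplete HpaII digestion (a known source of false positives) is not modelled. A second helper reproduces the dilution arithmetic linking the digest in Table 2 (7 µL of 100 ng/µL gDNA in 50 µL) to the 14 ng/µL template used for PCR.

```python
def interpret_ms_pcr(no_enzyme, mspi, hpaii, ntc, has_ccgg_sites=True):
    """Interpret band presence (True) or absence (False) for one primer pair.

    no_enzyme, mspi, hpaii: bands from undigested, MspI- and HpaII-digested gDNA.
    ntc: band in the no template control.
    has_ccgg_sites: False for a control amplicon without CCGG sites.
    """
    if ntc:
        return "invalid: band in no template control"
    if not no_enzyme:
        return "invalid: no amplification from undigested gDNA"
    if not has_ccgg_sites:
        # Neither enzyme can cut this amplicon, so both digests should amplify.
        if mspi and hpaii:
            return "control OK"
        return "invalid: control amplicon lost after digestion"
    if mspi:
        # MspI cuts CCGG regardless of methylation, so no band is expected.
        return "invalid: incomplete MspI digestion"
    # HpaII is blocked by 5mC: a band means the sites were protected.
    return "methylated" if hpaii else "unmethylated"


def digest_concentration(gdna_volume_ul, gdna_ng_per_ul, total_volume_ul):
    """gDNA concentration (ng/µL) in the digest after dilution."""
    return gdna_volume_ul * gdna_ng_per_ul / total_volume_ul


print(interpret_ms_pcr(no_enzyme=True, mspi=False, hpaii=True, ntc=False))
print(interpret_ms_pcr(no_enzyme=True, mspi=False, hpaii=False, ntc=False))
print(digest_concentration(7, 100, 50))
```

Checking the MspI and no template lanes before reading the HpaII lane means a failed digest or contamination is reported as an invalid run and is not mistaken for a methylation call.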
Table 3: PCR reaction components and volumes for amplification of regions within the PXDN promoter

Component                          Volume (µL)
Restricted gDNA (14 ng/µL)         7.1
2X KAPA Taq ReadyMix with dye      12.5
Forward Primer (10 µM)             1
Reverse Primer (10 µM)             1
ddH2O                              Up to 25

The four primer sets used to amplify part of the CpG island found in the promoter of PXDN are shown below (Tables 4 and 5) and were purchased from Inqaba Biotec, Pretoria, Gauteng. P2A and the control primers were designed in NCBI Primer Blast; P3 and P4 were designed previously in our laboratory (Hanmer and Mavri-Damelin, 2018). The sequences of the primer pairs used in this study, as well as the details of their amplicons such as GC% and the number of MspI/HpaII recognition sites, are detailed in Tables 4 and 5 respectively. The PCR reaction conditions are detailed in Table 6. The results of the PCR reaction were analysed by agarose gel electrophoresis on a 1.7% gel at 100 V for 45 to 50 minutes.

Table 4: Primers designed for the amplification of regions within the PXDN promoter

Primer Forward Primer (5’ to 3’) Reverse Primer (5’ to 3’)
P2A CCTCGGGGATTCAGAGGGG GCACTCACAGGATGGAGGTC
P3 CAGACTCCCTTGCTGTGCGCTTTG AGCTGTGCACATGCGCGAGGCT
P4 TCTGAATCTGGCACCGTCACCGTC ACCCTG AGCTGTGCACATGCGCGAGGCT
Control TCCCATTCCAGGCTGCTTTC ATACGCACAAAGGTGGCGTT

Table 5: The amplicons within the PXDN promoter and their primer parameters

Amplicon   Position (with reference to TSS)   Length (bp)   Number of MspI/HpaII recognition sites   GC%     Annealing temperature of primers (℃)
P2A        -95 bp to 268 bp                   370           5                                        76.21   58
P3         -524 bp to -53 bp                  471           5                                        62.63   58
P4         -935 bp to -459 bp                 476           4                                        59.03   55
Control    -1037 bp to -794 bp                243           0                                        53.91   58

Table 6: PCR reaction conditions

It became necessary to amplify a region of the genome separate to that of PXDN (see section 3.2). Therefore, the following primer pair shown in Table 7 was used to amplify a region of the TP53 promoter on chromosome 17.
The amplicon is 805 bp and the primers have an annealing temperature of 59.3 ℃. Table 7: Primer pair designed to amplify the TP53 promoter 2.2.5 Agarose Gel Electrophoresis Agarose gels were prepared for the assessment of gDNA quality after extraction, and the analysis of all MS PCR results. The appropriate mass of agarose powder was added to 50 ml 1 X Tris acetate EDTA (TAE) Buffer. Agarose powder (Cleaver Scientific, Rugby, United Kingdom) and buffer solution was heated using a microwave until the powder completely dissolved. A volume of 5 µL of DNA intercalating agent ethidium bromide (10 mg/ml) (Merck, Munich, Germany) was added after the agarose had cooled sufficiently and then the solution was poured into a casting tray and left to set for roughly 30 minutes. The KapaTaq DNA poly