Browsing by Author "Michelow, Pamela"

Synthetic cytology image generation to augment teaching and quality assurance in pathology
(University of the Witwatersrand, Johannesburg, 2023-05) McAlpine, Ewen David; Michelow, Pamela; Celik, Turgay
INTRODUCTION: Urine cytology offers rapid and relatively inexpensive screening for the detection of high-grade urothelial neoplasia in patients with haematuria. In our setting of a public sector laboratory in South Africa, however, there is a paucity of such specimens with which to train cytotechnologists and cytopathologists. Advances in Generative Adversarial Networks present a potential solution to this problem by allowing for the generation of synthetic urine cytology images to supplement teaching and training. We illustrate an end-to-end machine learning workflow, from dataset creation to testing synthetic images with pathology personnel, to assess this technology in a real-world setting. METHODS: Two hundred and fourteen urine cytology slides were digitised and processed to construct a morphologically balanced dataset containing examples of benign, atypical and malignant urine cytology images. This dataset was used to train a StyleGAN3 model to generate synthetic urine cytology images. These synthetically generated images were then tested in a group of pathology personnel, both pathologists and trainees, to assess whether real and synthetic urine cytology images were perceived differently. Diagnostic error rate and subjective image quality were assessed. RESULTS: StyleGAN3 was able to generate a wide morphological diversity of realistic-looking benign, atypical and malignant urine cytology images. When testing how these synthetic images were perceived by pathology personnel, there was no significant difference between real and synthetic images in diagnostic error rate, subjective image quality or whether images were judged suitable for inclusion in a cytology teaching set. DISCUSSION: This work presents a proof-of-concept illustration of the feasibility of using synthetic cytology images to supplement pathology teaching when real examples may be difficult to obtain.
Furthermore, this work presents important insights into the dynamics of pathology dataset creation, and discusses the use of synthetic data in health education together with the ethical and legal issues raised by the use of synthetic patient data. CONCLUSION: Our work demonstrates that realistic, morphologically diverse urine cytology images can be generated using existing GAN technology and that human observers find such synthetic data visually acceptable. Additionally, our data indicate that there is no significant difference between real and synthetic images in terms of subjective image quality or diagnostic classification as determined by pathology personnel.
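The abstract reports no significant difference in diagnostic error rate between real and synthetic images but does not name the statistical test used. As a minimal sketch only, one way such a comparison could be made is a two-sided Fisher's exact test on a 2x2 table of errors and correct diagnoses by image type. The function name and all counts below are hypothetical and are not taken from the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Assumed layout: rows are image type (real, synthetic),
    columns are (diagnostic errors, correct diagnoses).
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x errors in row 1, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )
    return min(1.0, total)


# Hypothetical counts, for illustration only (not study data).
real_errors, real_correct = 12, 88
synthetic_errors, synthetic_correct = 15, 85

p_value = fisher_exact_two_sided(
    real_errors, real_correct, synthetic_errors, synthetic_correct
)
print(f"two-sided p-value: {p_value:.3f}")
```

A p-value above the chosen significance threshold would be consistent with the kind of "no significant difference" statement made in the abstract, although the authors may have used a different test.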
The dynamics of pathology dataset creation using urine cytology as an example
McAlpine, Ewen; Michelow, Pamela; Celik, Turgay
Introduction: Dataset creation is one of the first tasks required for training AI algorithms but is underestimated in pathology. High-quality data are essential for training algorithms, and data should be labelled accurately and include sufficient morphological diversity. The dynamics and challenges of labelling a urine cytology dataset using The Paris System (TPS) criteria are presented. Methods: A total of 2,454 images were labelled by pathologist consensus via video conferencing over a 14-day period. During the labelling sessions, the dynamics of the labelling process were recorded. Quality assurance images were randomly selected from images labelled in previous sessions within this study and randomly distributed throughout new labelling sessions. To assess the effect of time on the labelling process, the labelled set of images was split into two groups according to the median relative label time, and the time taken to label images and the intersession agreement were then assessed. Results: Labelling sessions ranged from 24 m 11 s to 41 m 06 s in length, with a median of 33 m 47 s. The majority of the 2,454 images were labelled as benign urothelial cells, with atypical and malignant urothelial cells more sparsely represented. The time taken to label individual images ranged from 1 s to 42 s with a median of 2.9 s. Labelling times differed significantly among categories, with the median label time for the atypical urothelial category being 7.2 s, followed by the malignant urothelial category at 3.8 s and the benign urothelial category at 2.9 s. The overall intersession agreement for quality assurance images was substantial. The level of agreement differed among classes of urothelial cells: the benign and malignant urothelial cell classes showed almost perfect agreement, whereas the atypical urothelial cell class showed moderate agreement. As session time progressed, image labelling appeared to speed up, and there was no evidence that intersession agreement worsened.
Discussion/conclusion: Important aspects of pathology dataset creation are presented, illustrating the significant resources required for labelling a large dataset. We present evidence that the time taken to categorise urine cytology images varies by diagnosis/class. The known challenges relating to the reproducibility of the AUC (atypical) category in TPS, compared with the NHGUC (benign) or HGUC (malignant) categories, are also confirmed.
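The terms "moderate", "substantial" and "almost perfect" used above match the bands commonly attached to Cohen's kappa (the Landis and Koch scale), although the abstract does not state which agreement statistic was used. The sketch below assumes Cohen's kappa and shows how overall and per-class intersession agreement could be computed from two labelling passes over the same quality assurance images. The function names and the toy labels are hypothetical and are not study data.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Cohen's kappa for two equal-length sequences of labels."""
    n = len(first)
    if n == 0 or n != len(second):
        raise ValueError("label sequences must be non-empty and equal in length")
    observed = sum(x == y for x, y in zip(first, second)) / n
    counts_first, counts_second = Counter(first), Counter(second)
    # Agreement expected by chance, from the marginal label frequencies.
    expected = sum(counts_first[k] * counts_second[k] for k in counts_first) / (n * n)
    if expected == 1.0:
        return 1.0  # both passes used a single identical label throughout
    return (observed - expected) / (1 - expected)


def per_class_kappa(first, second, cls):
    """One-vs-rest kappa: agreement on whether each image belongs to cls."""
    return cohen_kappa([x == cls for x in first], [y == cls for y in second])


def landis_koch(kappa):
    """Map a kappa value to the Landis and Koch descriptive band."""
    bands = [
        (0.80, "almost perfect"),
        (0.60, "substantial"),
        (0.40, "moderate"),
        (0.20, "fair"),
    ]
    for lower, name in bands:
        if kappa > lower:
            return name
    return "slight" if kappa >= 0 else "poor"


# Hypothetical labels for the same images in two sessions (not study data).
session_a = ["benign", "benign", "atypical", "malignant", "benign", "atypical"]
session_b = ["benign", "benign", "benign", "malignant", "benign", "atypical"]

overall = cohen_kappa(session_a, session_b)
print("overall:", round(overall, 3), landis_koch(overall))
for cls in ("benign", "atypical", "malignant"):
    k = per_class_kappa(session_a, session_b, cls)
    print(cls, round(k, 3), landis_koch(k))
```

The one-vs-rest binarisation is one simple way to obtain a per-class figure; other per-category agreement measures exist and may have been the ones used in the study.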