Electronic Theses and Dissertations (Masters)
Permanent URI for this collection
Browse
Browsing Electronic Theses and Dissertations (Masters) by Keyword "Parallelising highly sequential algorithms"
Now showing 1 - 1 of 1
Results Per Page
Sort Options
Item Model Propagation for High-Parallelism in Data Compression(University of the Witwatersrand, Johannesburg, 2023-10) Lin, Shaw Chian; Cheng, LingRecent data compression research focuses on the parallelisation of existing algorithms (LZ77, BZIP2 etc.) by exploiting their inherent parallelism. Little work has been performed on parallelising highly sequential algorithms, whose slow compression speeds would benefit the most from parallelism. This dissertation presents a generalised parallelisation approach that can be potentially adopted by any compression algorithms, with model sequentiality in mind. The scheme presents a novel divide-and-conquer approach when dividing the data stream into smaller data blocks for parallelisation. The scheme, branching propagation, is implemented with prediction by partial matching (PPM), an algorithm of the statistical-modelling family known for their serial nature, which is shown to suffer from compression ratio increases when parallelised. A speedup of 5.2-7x has been achieved at 16 threads, with at most a 6.5% increase in size relative to serial performance, while the conventional approach showed up to a 7.5x speedup with an 8.0% increase. The branching propagation approach has been shown to offer better compression ratios over conventional approaches with increasing parallelism (a difference of 11% increase at 256 threads), albeit at slightly slower speeds. To quantify the speedup over ratio penalty, an alternate metric called speedup-to-ratio increase (SRI) is used. This shows that when serial dependency is maintained, branching propagation is superior in standard configurations, which offers substantial speed while minimising the compression ratio penalty relative to the speedup. However, at lower serial dependency, the conventional approach is generally preferable, with 9-16x speedup per 1% increase in compression ratio at maximal speed compared to branching propagation’s 6-13x speedup per 1%.