Parallel algorithms for querying spatial properties in the protein data bank
No Thumbnail Available
Date
2019
Authors
Selvan, Joshua
Journal Title
Journal ISSN
Volume Title
Publisher
Abstract
Searching large protein databases for proteins with certain structural properties is
expensive. This research explored the use of GPGPUs (General Purpose Graphical
Processing Units) in speeding up such structural queries. Brute force and kd-tree
spatial data structure algorithms were compared and benchmarked against non-GPU
parallel algorithms to assess the e ectiveness of using GPGPUs.
This was done with the aim of increasing the speed at which queries against large
protein databases can be completed to help mitigate the e ect of increasing data set
sizes of current protein databases [56].
A set of parallel variations of range search algorithms were developed and imple-
mented in the GPU programming language CUDA and their performances times in
completing batch range search jobs were compared against other parallel approach
types such as multi-threading and message passing to see if the GPU approaches
completed notably faster or slower than more traditional parallelised approaches.
The results showed GPGPUs can construct kd-trees far faster than other parallelised
implementations can achieve and that in most scenarios (excluding speci c cases
such as very low or zero result searches) the GPGPU approaches either matched or
performed far better than the other parallelised approaches.
While comparing di erent GPU algorithms, the complex GPU based kd-tree algo-
rithm performed similarly to a simple GPU brute force range search. This high-
lighted the bene ts of writing code which made the most of the GPU's parallel
architecture as opposed to modifying e cient (recursive) algorithms to adequately
t into those same GPU architectures. This implied that even though spatial data
structures are e ective ways of dealing with protein data, there are better returns
on e ort in writing code speci cally for the GPU's inherently parallel architecture
for initiatives which require algorithms to be developed from scratch.
Description
A research report submitted to the Faculty of Engineering and the Built Environ-
ment, University of the Witwatersrand, Johannesburg, in partial ful lment of the
requirements for the degree of Master of Science in Engineering.
Keywords
Citation
Selvan, Joshua (2019) Parallel algorithms for querying spatial properties in the protein data bank, University of the Witwatersrand, Johannesburg, <http://hdl.handle.net/10539/30383>