Parallel algorithms for querying spatial properties in the protein data bank

No Thumbnail Available

Date

2019

Authors

Selvan, Joshua

Journal Title

Journal ISSN

Volume Title

Publisher

Abstract

Searching large protein databases for proteins with certain structural properties is expensive. This research explored the use of GPGPUs (General Purpose Graphical Processing Units) in speeding up such structural queries. Brute force and kd-tree spatial data structure algorithms were compared and benchmarked against non-GPU parallel algorithms to assess the e ectiveness of using GPGPUs. This was done with the aim of increasing the speed at which queries against large protein databases can be completed to help mitigate the e ect of increasing data set sizes of current protein databases [56]. A set of parallel variations of range search algorithms were developed and imple- mented in the GPU programming language CUDA and their performances times in completing batch range search jobs were compared against other parallel approach types such as multi-threading and message passing to see if the GPU approaches completed notably faster or slower than more traditional parallelised approaches. The results showed GPGPUs can construct kd-trees far faster than other parallelised implementations can achieve and that in most scenarios (excluding speci c cases such as very low or zero result searches) the GPGPU approaches either matched or performed far better than the other parallelised approaches. While comparing di erent GPU algorithms, the complex GPU based kd-tree algo- rithm performed similarly to a simple GPU brute force range search. This high- lighted the bene ts of writing code which made the most of the GPU's parallel architecture as opposed to modifying e cient (recursive) algorithms to adequately t into those same GPU architectures. This implied that even though spatial data structures are e ective ways of dealing with protein data, there are better returns on e ort in writing code speci cally for the GPU's inherently parallel architecture for initiatives which require algorithms to be developed from scratch.

Description

A research report submitted to the Faculty of Engineering and the Built Environ- ment, University of the Witwatersrand, Johannesburg, in partial ful lment of the requirements for the degree of Master of Science in Engineering.

Keywords

Citation

Selvan, Joshua (2019) Parallel algorithms for querying spatial properties in the protein data bank, University of the Witwatersrand, Johannesburg, <http://hdl.handle.net/10539/30383>

Collections

Endorsement

Review

Supplemented By

Referenced By