Parallel algorithms for querying spatial properties in the protein data bank

dc.contributor.authorSelvan, Joshua
dc.date.accessioned2021-01-26T06:08:34Z
dc.date.available2021-01-26T06:08:34Z
dc.date.issued2019
dc.descriptionA research report submitted to the Faculty of Engineering and the Built Environ- ment, University of the Witwatersrand, Johannesburg, in partial ful lment of the requirements for the degree of Master of Science in Engineering.en_ZA
dc.description.abstractSearching large protein databases for proteins with certain structural properties is expensive. This research explored the use of GPGPUs (General Purpose Graphical Processing Units) in speeding up such structural queries. Brute force and kd-tree spatial data structure algorithms were compared and benchmarked against non-GPU parallel algorithms to assess the e ectiveness of using GPGPUs. This was done with the aim of increasing the speed at which queries against large protein databases can be completed to help mitigate the e ect of increasing data set sizes of current protein databases [56]. A set of parallel variations of range search algorithms were developed and imple- mented in the GPU programming language CUDA and their performances times in completing batch range search jobs were compared against other parallel approach types such as multi-threading and message passing to see if the GPU approaches completed notably faster or slower than more traditional parallelised approaches. The results showed GPGPUs can construct kd-trees far faster than other parallelised implementations can achieve and that in most scenarios (excluding speci c cases such as very low or zero result searches) the GPGPU approaches either matched or performed far better than the other parallelised approaches. While comparing di erent GPU algorithms, the complex GPU based kd-tree algo- rithm performed similarly to a simple GPU brute force range search. This high- lighted the bene ts of writing code which made the most of the GPU's parallel architecture as opposed to modifying e cient (recursive) algorithms to adequately t into those same GPU architectures. This implied that even though spatial data structures are e ective ways of dealing with protein data, there are better returns on e ort in writing code speci cally for the GPU's inherently parallel architecture for initiatives which require algorithms to be developed from scratch.en_ZA
dc.description.librarianPH2021en_ZA
dc.facultyFaculty of Engineering and the Built Environmenten_ZA
dc.format.extentOnline resource (136 leaves)
dc.identifier.citationSelvan, Joshua (2019) Parallel algorithms for querying spatial properties in the protein data bank, University of the Witwatersrand, Johannesburg, <http://hdl.handle.net/10539/30383>
dc.identifier.urihttps://hdl.handle.net/10539/30383
dc.language.isoenen_ZA
dc.subject.lcshParallel programming (Computer science)
dc.subject.lcshSpatial analysis (Statistics)
dc.subject.lcshAlgorithms
dc.titleParallel algorithms for querying spatial properties in the protein data banken_ZA
dc.typeThesisen_ZA

Files

Original bundle

Now showing 1 - 2 of 2
No Thumbnail Available
Name:
Abstract.pdf
Size:
49.4 KB
Format:
Adobe Portable Document Format
Description:
Abstract
No Thumbnail Available
Name:
Parallel algorithms for querying spatial properties in the protein data bank.pdf
Size:
1.09 MB
Format:
Adobe Portable Document Format
Description:
Final work

License bundle

Now showing 1 - 1 of 1
No Thumbnail Available
Name:
license.txt
Size:
1.71 KB
Format:
Item-specific license agreed upon to submission
Description:

Collections