The role of graph theory in big data analytics
No Thumbnail Available
Date
2019
Authors
Mashia, Kellen
Journal Title
Journal ISSN
Volume Title
Publisher
Abstract
Subgraph isomorphism detection is a computationally expensive problem which determines
if there exists a subgraph Gs = (Vs, Es) in a large graph (called the data graph) denoted by
G = (VG, EG); which matches some other given smaller graph Q = (VQ, EQ) (called the pattern
graph). According to J. R. Ullmann in [19], the subgraph isomorphism problem can be solved
through a brute-force enumeration procedure. In this dissertation we extensively study the
Ullmann algorithm for subgraph isomorphism detection along with some of the improvements
developed in [3]; and then we shall discuss the performance of the algorithm by conducting a
simulation analysis for distinct, randomly generated undirected graphs. The algorithm is theoretically
compared with the one proposed by L. P. Cordella in [6]. However, we only carried out
an efficiency analysis on the Ullmann algorithm due to lack of large sets of experimental data.
Much work has been done in this field, however, it still stands to be an open question in computer
science to improve the execution run-time of the algorithm. In future research, we intend
to revisit the graph and subgraph isomorphism detection problems as a possible PhD research
in Computer Science. In chapter one we introduced preliminary definitions which may be of
great help in understanding the main work in chapter two. In chapter two we study in detail
the Ullmann algorithm for the subgraph isomorphism problem which we then implement on
several examples to examine the execution run-time of the algorithm. The purpose of chapter
three is to introduce another useful algorithm for subgraph isomorphism detection, called the
VF2. In this chapter we study in detail the VF2 algorithm for subgraph isomorphism detection.
In chapter four we give a detailed simulation analysis by implementing the Ullmann algorithm
on one large data graph G = (VG, EG) of order 30 and size 30, along with 15 randomly generated
undirected graphs Q = (VQ, EQ). In this chapter we examine the algorithms efficiency by
applying it on all 15 random graphs to search for isomorphic subgraphs in the large data graph.
Chapter five is the concluding chapter and chapter six entails the appendices.
Description
A dissertation submitted in the fulfilment of the requirement for the Master of Science in the School of Mathematics, 2019