3. Electronic Theses and Dissertations (ETDs) - All submissions
Permanent URI for this community: https://wiredspace.wits.ac.za/handle/10539/45
2 results
Search Results
Item: SQL comprehension and synthesis (2020)
Obaido, George Rabeshi
Structured Query Language (SQL) remains the standard language used in Relational Database Management Systems (RDBMSs), and has found applications in healthcare (patient registries), business (inventories, trend analysis), the military, education, etc. Although SQL statements are English-like, writing SQL queries is often problematic for non-technical end-users in industry. Similarly, formulating and comprehending written queries can be confusing, especially for undergraduate students. One of the pivotal reasons given for these difficulties lies with the simple syntax of SQL, which is often misleading and hard to understand. An ideal solution is to present these two audiences, undergraduate students and non-technical end-users, with learning and practice tools. These tools are mostly electronic, and can be used to aid their understanding, as well as enable them to write correct SQL queries. This work proposes a new approach aimed at understanding and writing correct SQL queries using principles from Formal Language and Automata Theory. We present algorithms based on: regular expressions for the recognition of simple query constructs, context-free grammars for the recognition of nested queries, and a jumping finite automaton for the synthesis of SQL queries from natural language descriptions. As proof of concept, these algorithms were implemented in interactive software tools aimed at improving SQL comprehension. Evaluation of these tools showed that the majority of participants agreed that the tools were intuitive and aided their understanding of SQL queries. These tools should, therefore, find applications in aiding SQL comprehension at higher learning institutions and assist in the writing of correct queries in data-centered industries.

Item: Structural analysis of source code plagiarism using graphs (2017)
Obaido, George Rabeshi
Plagiarism is a serious problem in academia.
It is prevalent in the computing discipline, where students are expected to submit source code assignments as part of their assessment; hence, there is every likelihood of copying. Ideally, students can collaborate with each other on a programming task, but each student is expected to submit his/her own solution. Moreover, one might conclude that the interaction would help them learn programming. Unfortunately, that may not always be the case. In undergraduate courses, especially in the computer sciences, if a given class is large, it is infeasible for an instructor to manually check every assignment for probable plagiarism. Even if the class size were smaller, it would still be impractical to inspect every assignment, because some potentially plagiarised content could still be missed by humans. Therefore, automatically checking source code programs for likely plagiarism is essential. Many methods have been proposed to detect source code plagiarism in undergraduate assignments, but an ideal system should be able to differentiate actual cases of plagiarism from the coincidental similarities that usually occur in source code. Some of the existing source code plagiarism detection systems are either not scalable, or perform better when programs are modified with a number of insertions and deletions to obfuscate plagiarism. To address this issue, a graph-based model that considers structural similarities of programs is introduced to handle cases of plagiarism in programming assignments. This research study proposes an approach to measuring similarities in programming assignments, using an existing plagiarism detection system to find similarities in programs and a graph-based model to annotate the programs.
We describe experiments with data sets of undergraduate Java programs to inspect the programs for plagiarism and to evaluate the graph-based model, which shows good precision. An evaluation of the graph-based model reveals a high rate of plagiarism in the programs and resilience to many obfuscation techniques, while false detection (coincidental similarity) rarely occurred. If this detection method is adopted, it will help an instructor carry out the detection process conscientiously.