A metadata service for an infrastructure of large scale distributed scientific datasets
dc.contributor.author | Adeleke, Oluwalani Aeoluwa | |
dc.date.accessioned | 2014-06-12T13:03:12Z | |
dc.date.available | 2014-06-12T13:03:12Z | |
dc.date.issued | 2014-06-12 | |
dc.description.abstract | In this constantly growing, information-technology-driven era, data migration and replication pose a serious bottleneck in distributed database infrastructure environments. In large heterogeneous environments serving domains such as geospatial science and high energy physics, where large arrays of scientific data are involved, diverse challenges arise with respect to dataset identification, location services, and efficient retrieval of information. These challenges include locating data sources, identifying effective transfer routes, and replication, to mention a few. As distributed systems aimed at constant delivery of data to the point of query origination continue to expand in size and functionality, efficient replication and data retrieval systems have become increasingly important and relevant. One such system is an infrastructure for large scale distributed scientific data management. Several data management systems have been developed to help manage these fast-growing datasets and their metadata. However, little work has been done on allowing cross-communication and data sharing between these different dataset management systems in a distributed, heterogeneous environment. This dissertation addresses this problem, focusing particularly on metadata and the provenance service associated with it. We present the Virtual Unified Metadata architecture to establish communication between remote sites within a distributed heterogeneous environment using a client-server model. The system provides a framework that allows heterogeneous metadata services to communicate and share metadata and datasets through the implementation of a communication interface. It allows for metadata discovery and dataset identification by enabling remote queries between heterogeneous metadata repositories. 
The significant contributions of this system include: (1) the design and implementation of a client/server-based remote metadata query system for scientific datasets within distributed heterogeneous dataset repositories; (2) the implementation of a caching mechanism for optimizing system performance; and (3) an analysis of the quality of service with respect to correct dataset identification, estimation of migration and replication time frames, and cache performance. | en_ZA |
dc.identifier.uri | http://hdl.handle.net/10539/14780 | |
dc.language.iso | en | en_ZA |
dc.subject.lcsh | Metadata. | |
dc.subject.lcsh | Database management. | |
dc.subject.lcsh | Management information systems. | |
dc.title | A metadata service for an infrastructure of large scale distributed scientific datasets | en_ZA |
dc.type | Thesis | en_ZA |