A metadata service for an infrastructure of large scale distributed scientific datasets

dc.contributor.authorAdeleke, Oluwalani Aeoluwa
dc.date.accessioned2014-06-12T13:03:12Z
dc.date.available2014-06-12T13:03:12Z
dc.date.issued2014-06-12
dc.description.abstractIn this constantly growing, information-technology-driven era, data migration and replication pose a serious bottleneck in distributed database infrastructure environments. In large heterogeneous environments spanning domains such as geospatial science and high energy physics, where large arrays of scientific data are involved, diverse challenges arise with respect to dataset identification, location services, and efficient retrieval of information. These challenges include locating data sources, identifying effective transfer routes, and replication, to mention a few. As distributed systems aimed at constant delivery of data to the point of query origination continue to expand in size and functionality, efficient replication and data retrieval systems have become increasingly important and relevant. One such system is an infrastructure for large scale distributed scientific data management. Several data management systems have been developed to help manage these fast growing datasets and their metadata. However, little work has been done on allowing cross-communication and data sharing between these different dataset management systems in a distributed, heterogeneous environment. This dissertation addresses this problem, focusing particularly on the metadata and provenance service associated with it. We present the Virtual Unified Metadata architecture to establish communication between remote sites within a distributed heterogeneous environment using a client-server model. The system provides a framework that allows heterogeneous metadata services to communicate and share metadata and datasets through the implementation of a communication interface. It allows for metadata discovery and dataset identification by enabling remote queries between heterogeneous metadata repositories.
The significant contributions of this system include: the design and implementation of a client/server-based remote metadata query system for scientific datasets within distributed heterogeneous dataset repositories; the implementation of a caching mechanism for optimizing system performance; and an analysis of the quality of service with respect to correct dataset identification, estimation of migration and replication time frames, and cache performance.en_ZA
dc.identifier.urihttp://hdl.handle.net/10539/14780
dc.language.isoenen_ZA
dc.subject.lcshMetadata.
dc.subject.lcshDatabase management.
dc.subject.lcshManagement information systems.
dc.titleA metadata service for an infrastructure of large scale distributed scientific datasetsen_ZA
dc.typeThesisen_ZA

Files

Original bundle

Name:
thesis.pdf
Size:
1.6 MB
Format:
Adobe Portable Document Format
