Classification of web resident sensor resources using latent semantic indexing and ontologies
Date
2010-04-01T07:20:54Z
Authors
Majavu, Wabo
Abstract
Web resident sensor resource discovery plays a crucial role in the realisation
of the Sensor Web. The vision of the Sensor Web is to create a web of sensors that
can be manipulated and discovered in real time. A current research challenge in the
Sensor Web is the discovery of relevant web resident sensor resources. The proposed approach
to the discovery problem is a modified Latent Semantic
Indexing (LSI) that uses an ontology to classify web resident resources
found in geospatial web portals. This research introduces a new method aimed at
improving an information retrieval algorithm, influencing the vector decomposition
by including a formal representation of the knowledge of the domain of interest.
The aim is to bias the retrieval to better classify the resources of interest. The
proposed method uses the domain knowledge expressed in the ontology to improve
the knowledge extraction, using the concept definitions and relationships in the
ontology to create semantic links between documents. The clusters formed using
the modified algorithm are analysed, and performance is measured by evaluating the
inter-cluster distances and similarity measures within each cluster. The distances
are expressed as Euclidean distances of vectors in n-dimensional latent space. The
research focus is on investigating how the prior domain knowledge improves the
clustering when k-means is used as the partitioning algorithm. It is observed that
the modified extraction algorithm can isolate a group of documents that are used to
populate the knowledge base, resulting in improved storage of the documents
that occur in the geospatial portal. Results obtained by combining the ontology
with LSI show that clusters are better separated, and that homogeneous clusters of more
specific themes can be formed by hierarchical clustering.
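
The general idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the ontology enters the decomposition, so the sketch below makes an assumption: related concepts are encoded as a term-by-term link matrix `S` that is applied to the term-document matrix before the SVD. The vocabulary, term counts, concept pairs, link weight `w`, and the helper names `latent_doc_vectors` and `kmeans` are all hypothetical and chosen only for illustration; they are not taken from the thesis.

```python
import numpy as np

def latent_doc_vectors(term_doc, k):
    """Rank-k LSI: return one k-dimensional vector per document (as rows)."""
    _, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    return (np.diag(s[:k]) @ vt[:k]).T

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means with deterministic farthest-point seeding."""
    centers = [points[0]]
    while len(centers) < k:
        # Distance from every point to its nearest already-chosen centre.
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        # Euclidean distance of every point to every centre.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels

# Hypothetical toy data: rows are terms, columns are four documents.
terms = ["temperature", "thermometer", "rainfall", "gauge"]
A = np.array([
    [3, 0, 0, 0],
    [0, 2, 0, 0],
    [0, 0, 4, 0],
    [0, 0, 0, 2],
], dtype=float)

# Assumed ontology relationships between concepts, with an assumed weight.
related = [("temperature", "thermometer"), ("rainfall", "gauge")]
w = 0.5
S = np.eye(len(terms))
for a, b in related:
    i, j = terms.index(a), terms.index(b)
    S[i, j] = S[j, i] = w

plain = latent_doc_vectors(A, k=2)        # standard LSI
biased = latent_doc_vectors(S @ A, k=2)   # ontology-biased LSI (assumed form)

# Euclidean distance in latent space between documents 0 and 1.
d_plain = np.linalg.norm(plain[0] - plain[1])
d_biased = np.linalg.norm(biased[0] - biased[1])

labels = kmeans(biased, k=2)
print("distance without ontology:", d_plain)
print("distance with ontology:   ", d_biased)
print("cluster labels:", labels)
```

In this toy setting no two documents share a term, so standard LSI has no evidence that any of them are related. The link matrix introduces that evidence, which pulls documents about related concepts closer together in the latent space before k-means partitions them.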