CROCUS: Cluster-based ontology data cleansing
Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review
Standard
WaSABi-FEOSW 2014 : Joint Proceedings of WaSABi 2014 and FEOSW 2014. ed. / Angel García-Crespo; Juan Miguel Gómez Berbís; Mateusz Radzimski; José Luis Sánchez Cervantes; Sam Coppens; Karl Hammar; Magnus Knuth; Marco Neumann; Dominique Ritze; Miel Vander Sande. Vol. 1240 Sun Site Central Europe (RWTH Aachen University), 2014. p. 7-14 (CEUR Workshop Proceedings; Vol. 1240).
RIS
TY - CHAP
T1 - CROCUS
T2 - Joint 2nd International Workshop on Semantic Web Enterprise Adoption and Best Practice, WaSABi 2014 and 2nd International Workshop on Finance and Economics on the Semantic Web, FEOSW 2014 - Co-located with 11th European Semantic Web Conference, ESWC 2014
AU - Cherix, Didier
AU - Usbeck, Ricardo
AU - Both, Andreas
AU - Lehmann, Jens
N1 - Conference code: 11
PY - 2014
Y1 - 2014
N2 - Over the past years, a vast number of datasets have been published based on Semantic Web standards, which provides an opportunity for creating novel industrial applications. However, industrial requirements on data quality are high, while the time to market and the costs of data preparation have to be kept low. Unfortunately, many Linked Data sources are error-prone, which prevents their direct use in production systems. Hence, (semi-)automatic quality assurance processes are needed, as manual ontology repair procedures by domain experts are expensive and time-consuming. In this article, we present CROCUS, a pipeline for cluster-based ontology data cleansing. Our system provides a semi-automatic approach for instance-level error detection in ontologies that is agnostic of the underlying Linked Data knowledge base and works at very low cost. CROCUS was evaluated on two datasets. The experiments show that we are able to detect errors with high recall.
KW - Informatics
KW - Business informatics
UR - http://www.scopus.com/inward/record.url?scp=84924952819&partnerID=8YFLogxK
M3 - Article in conference proceedings
AN - SCOPUS:84924952819
VL - 1240
T3 - CEUR Workshop Proceedings
SP - 7
EP - 14
BT - WaSABi-FEOSW 2014
A2 - García-Crespo, Angel
A2 - Gómez Berbís, Juan Miguel
A2 - Radzimski, Mateusz
A2 - Sánchez Cervantes, José Luis
A2 - Coppens, Sam
A2 - Hammar, Karl
A2 - Knuth, Magnus
A2 - Neumann, Marco
A2 - Ritze, Dominique
A2 - Vander Sande, Miel
PB - Sun Site Central Europe (RWTH Aachen University)
Y2 - 26 May 2014
ER -