Lessons learned — The case of CROCUS: Cluster-based ontology data cleansing

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Standard

Lessons learned — The case of CROCUS: Cluster-based ontology data cleansing. / Cherix, Didier; Usbeck, Ricardo; Both, Andreas et al.
The Semantic Web: ESWC 2014 Satellite Events: ESWC 2014 Satellite Events, Anissaras, Crete, Greece, May 25-29, 2014. ed. / Anna Tordai; Eva Blomqvist; Harald Sack; Raphaël Troncy; Valentina Presutti; Ioannis Papadakis. Springer Nature Switzerland AG, 2014. p. 14-24 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 8798).

Harvard

Cherix, D, Usbeck, R, Both, A & Lehmann, J 2014, Lessons learned — The case of CROCUS: Cluster-based ontology data cleansing. in A Tordai, E Blomqvist, H Sack, R Troncy, V Presutti & I Papadakis (eds), The Semantic Web: ESWC 2014 Satellite Events: ESWC 2014 Satellite Events, Anissaras, Crete, Greece, May 25-29, 2014. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 8798, Springer Nature Switzerland AG, pp. 14-24, 11th European Semantic Web Symposium on Satellite Events, ESWC 2014, Ouro Preto, Brazil, 20.10.14. https://doi.org/10.1007/978-3-319-11955-7_2

APA

Cherix, D., Usbeck, R., Both, A., & Lehmann, J. (2014). Lessons learned — The case of CROCUS: Cluster-based ontology data cleansing. In A. Tordai, E. Blomqvist, H. Sack, R. Troncy, V. Presutti, & I. Papadakis (Eds.), The Semantic Web: ESWC 2014 Satellite Events: ESWC 2014 Satellite Events, Anissaras, Crete, Greece, May 25-29, 2014 (pp. 14-24). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 8798). Springer Nature Switzerland AG. https://doi.org/10.1007/978-3-319-11955-7_2

Vancouver

Cherix D, Usbeck R, Both A, Lehmann J. Lessons learned — The case of CROCUS: Cluster-based ontology data cleansing. In Tordai A, Blomqvist E, Sack H, Troncy R, Presutti V, Papadakis I, editors, The Semantic Web: ESWC 2014 Satellite Events: ESWC 2014 Satellite Events, Anissaras, Crete, Greece, May 25-29, 2014. Springer Nature Switzerland AG. 2014. p. 14-24. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-3-319-11955-7_2

Bibtex

@inbook{7d48758e5fea430388447af2836d1ab0,
title = "Lessons learned — The case of CROCUS: Cluster-based ontology data cleansing",
abstract = "Over the past years, a vast number of datasets have been published based on Semantic Web standards, which provides an opportunity for creating novel industrial applications. However, industrial requirements on data quality are high while the time to market as well as the required costs for data preparation have to be kept low. Unfortunately, many Linked Data sources are error-prone which prevents their direct use in productive systems. Hence, (semi-)automatic quality assurance processes are needed as manual ontology repair procedures by domain experts are expensive and time consuming. In this article, we present CROCUS – a pipeline for cluster-based ontology data cleansing. Our system provides a semi-automatic approach for instance-level error detection in ontologies which is agnostic of the underlying Linked Data knowledge base and works at very low costs. CROCUS has been evaluated on two datasets. The experiments show that we are able to detect errors with high recall. Furthermore, we provide an exhaustive related work as well as a number of lessons learned.",
keywords = "Informatics, Business informatics",
author = "Didier Cherix and Ricardo Usbeck and Andreas Both and Jens Lehmann",
note = "This work has been partly supported by the ESF and the Free State of Saxony and by grants from the European Union{\textquoteright}s 7th Framework Programme provided for the project GeoKnow (GA no. 318159). Sincere thanks to Christiane Lemke. Publisher Copyright: {\textcopyright} Springer International Publishing Switzerland 2014. 11th European Semantic Web Symposium on Satellite Events, ESWC 2014; Conference date: 20-10-2014 through 22-10-2014",
year = "2014",
doi = "10.1007/978-3-319-11955-7_2",
language = "English",
isbn = "978-3-319-11954-0",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Nature Switzerland AG",
pages = "14--24",
editor = "Anna Tordai and Eva Blomqvist and Harald Sack and Rapha{\"e}l Troncy and Valentina Presutti and Ioannis Papadakis",
booktitle = "The Semantic Web: ESWC 2014 Satellite Events",
address = "Switzerland",
url = "https://2014.eswc-conferences.org/index.html",

}

RIS

TY - CHAP

T1 - Lessons learned — The case of CROCUS: Cluster-based ontology data cleansing

T2 - 11th European Semantic Web Symposium on Satellite Events, ESWC 2014

AU - Cherix, Didier

AU - Usbeck, Ricardo

AU - Both, Andreas

AU - Lehmann, Jens

N1 - This work has been partly supported by the ESF and the Free State of Saxony and by grants from the European Union’s 7th Framework Programme provided for the project GeoKnow (GA no. 318159). Sincere thanks to Christiane Lemke. Publisher Copyright: © Springer International Publishing Switzerland 2014.

PY - 2014

Y1 - 2014

AB - Over the past years, a vast number of datasets have been published based on Semantic Web standards, which provides an opportunity for creating novel industrial applications. However, industrial requirements on data quality are high while the time to market as well as the required costs for data preparation have to be kept low. Unfortunately, many Linked Data sources are error-prone which prevents their direct use in productive systems. Hence, (semi-)automatic quality assurance processes are needed as manual ontology repair procedures by domain experts are expensive and time consuming. In this article, we present CROCUS – a pipeline for cluster-based ontology data cleansing. Our system provides a semi-automatic approach for instance-level error detection in ontologies which is agnostic of the underlying Linked Data knowledge base and works at very low costs. CROCUS has been evaluated on two datasets. The experiments show that we are able to detect errors with high recall. Furthermore, we provide an exhaustive related work as well as a number of lessons learned.

KW - Informatics

KW - Business informatics

UR - http://www.scopus.com/inward/record.url?scp=84908670101&partnerID=8YFLogxK

UR - https://www.mendeley.com/catalogue/b0284657-a560-3764-8ca5-37e35d6e8ddc/

U2 - 10.1007/978-3-319-11955-7_2

DO - 10.1007/978-3-319-11955-7_2

M3 - Article in conference proceedings

AN - SCOPUS:84908670101

SN - 978-3-319-11954-0

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 14

EP - 24

BT - The Semantic Web: ESWC 2014 Satellite Events

A2 - Tordai, Anna

A2 - Blomqvist, Eva

A2 - Sack, Harald

A2 - Troncy, Raphaël

A2 - Presutti, Valentina

A2 - Papadakis, Ioannis

PB - Springer Nature Switzerland AG

Y2 - 20 October 2014 through 22 October 2014

ER -