Lessons learned — The case of CROCUS: Cluster-based ontology data cleansing

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Over the past years, a vast number of datasets have been published based on Semantic Web standards, which provides an opportunity for creating novel industrial applications. However, industrial requirements on data quality are high, while the time to market and the costs of data preparation have to be kept low. Unfortunately, many Linked Data sources are error-prone, which prevents their direct use in productive systems. Hence, (semi-)automatic quality assurance processes are needed, as manual ontology repair by domain experts is expensive and time-consuming. In this article, we present CROCUS, a pipeline for cluster-based ontology data cleansing. Our system provides a semi-automatic approach for instance-level error detection in ontologies that is agnostic of the underlying Linked Data knowledge base and works at very low cost. CROCUS has been evaluated on two datasets. The experiments show that we are able to detect errors with high recall. Furthermore, we provide an exhaustive overview of related work as well as a number of lessons learned.

Original language: English
Title of host publication: The Semantic Web: ESWC 2014 Satellite Events: ESWC 2014 Satellite Events, Anissaras, Crete, Greece, May 25-29, 2014
Editors: Anna Tordai, Eva Blomqvist, Harald Sack, Raphaël Troncy, Valentina Presutti, Ioannis Papadakis
Number of pages: 11
Publisher: Springer Nature Switzerland AG
Publication date: 2014
Pages: 14-24
ISBN (print): 978-3-319-11954-0
ISBN (electronic): 978-3-319-11955-7
Publication status: Published - 2014
Externally published: Yes
Event: 11th European Semantic Web Symposium on Satellite Events, ESWC 2014 - Ouro Preto, Brazil
Duration: 20.10.2014 - 22.10.2014
https://2014.eswc-conferences.org/index.html

Bibliographical note

This work has been partly supported by the ESF and the Free State of Saxony and by grants from the European Union’s 7th Framework Programme provided for the project GeoKnow (GA no. 318159). Sincere thanks to Christiane Lemke

Publisher Copyright:
© Springer International Publishing Switzerland 2014.
