Web-scale extension of RDF knowledge bases from templated websites

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Authors

  • Lorenz Bühmann
  • Ricardo Usbeck
  • Axel Cyrille Ngonga Ngomo
  • Muhammad Saleem
  • Andreas Both
  • Valter Crescenzi
  • Paolo Merialdo
  • Disheng Qiu

Only a small fraction of the information on the Web is represented as Linked Data. This lack of coverage is partly due to the paradigms followed so far to extract Linked Data. While converting structured data to RDF is well supported by tools, most approaches to extract RDF from semi-structured data rely on extraction methods based on ad-hoc solutions. In this paper, we present a holistic and open-source framework for the extraction of RDF from templated websites. We discuss the architecture of the framework and the initial implementation of each of its components. In particular, we present a novel wrapper induction technique that does not require any human supervision to detect wrappers for websites. Our framework also includes a consistency layer with which the data extracted by the wrappers can be checked for logical consistency. We evaluate the initial version of REX on three different datasets. Our results clearly show the potential of using templated Web pages to extend the Linked Data Cloud. Moreover, our results indicate the weaknesses of our current implementations and how they can be extended.

Original language: English
Title of host publication: The Semantic Web - ISWC 2014 - 13th International Semantic Web Conference, Proceedings
Editors: Tania Tudorache, Craig Knoblock, Paul Groth, Carole Goble, Chris Welty, Abraham Bernstein, Peter Mika, Denny Vrandečić, Natasha Noy, Krzysztof Janowicz
Number of pages: 16
Publisher: Springer Nature Switzerland AG
Publication date: 2014
Pages: 66-81
ISBN (print): 978-3-319-11963-2
ISBN (electronic): 978-3-319-11964-9
DOIs
Publication status: Published - 2014
Externally published: Yes
Event: 13th International Semantic Web Conference, ISWC 2014 - Riva del Garda, Italy
Duration: 19.10.2014 - 23.10.2014
Conference number: 13
https://search.worldcat.org/de/title/semantic-web-iswc-2014-13th-international-semantic-web-conference-riva-del-garda-italy-october-19-23-2014-proceedings-part-i/oclc/941304230

Bibliographical note

Publisher Copyright:
© Springer International Publishing Switzerland 2014.
