Web-scale extension of RDF knowledge bases from templated websites

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Authors

  • Lorenz Bühmann
  • Ricardo Usbeck
  • Axel Cyrille Ngonga Ngomo
  • Muhammad Saleem
  • Andreas Both
  • Valter Crescenzi
  • Paolo Merialdo
  • Disheng Qiu

Only a small fraction of the information on the Web is represented as Linked Data. This lack of coverage is partly due to the paradigms followed so far to extract Linked Data. While converting structured data to RDF is well supported by tools, most approaches to extract RDF from semi-structured data rely on extraction methods based on ad-hoc solutions. In this paper, we present a holistic and open-source framework for the extraction of RDF from templated websites. We discuss the architecture of the framework and the initial implementation of each of its components. In particular, we present a novel wrapper induction technique that does not require any human supervision to detect wrappers for web sites. Our framework also includes a consistency layer with which the data extracted by the wrappers can be checked for logical consistency. We evaluate the initial version of REX on three different datasets. Our results clearly show the potential of using templated Web pages to extend the Linked Data Cloud. Moreover, our results indicate the weaknesses of our current implementations and how they can be extended.

Original language: English
Title of host publication: The Semantic Web - ISWC 2014 - 13th International Semantic Web Conference, Proceedings
Editors: Tania Tudorache, Craig Knoblock, Paul Groth, Carole Goble, Chris Welty, Abraham Bernstein, Peter Mika, Denny Vrandečić, Natasha Noy, Krzysztof Janowicz
Number of pages: 16
Publisher: Springer Nature Switzerland AG
Publication date: 2014
Pages: 66-81
ISBN (print): 978-3-319-11963-2
ISBN (electronic): 978-3-319-11964-9
DOIs
Publication status: Published - 2014
Externally published: Yes
Event: 13th International Semantic Web Conference, ISWC 2014 - Riva del Garda, Italy
Duration: 19.10.2014 - 23.10.2014
Conference number: 13
https://search.worldcat.org/de/title/semantic-web-iswc-2014-13th-international-semantic-web-conference-riva-del-garda-italy-october-19-23-2014-proceedings-part-i/oclc/941304230

Bibliographical note

Publisher Copyright:
© Springer International Publishing Switzerland 2014.
