Web-scale extension of RDF knowledge bases from templated websites

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Authors

  • Lorenz Bühmann
  • Ricardo Usbeck
  • Axel Cyrille Ngonga Ngomo
  • Muhammad Saleem
  • Andreas Both
  • Valter Crescenzi
  • Paolo Merialdo
  • Disheng Qiu

Only a small fraction of the information on the Web is represented as Linked Data. This lack of coverage is partly due to the paradigms followed so far to extract Linked Data. While converting structured data to RDF is well supported by tools, most approaches to extract RDF from semi-structured data rely on extraction methods based on ad-hoc solutions. In this paper, we present a holistic and open-source framework for the extraction of RDF from templated websites. We discuss the architecture of the framework and the initial implementation of each of its components. In particular, we present a novel wrapper induction technique that does not require any human supervision to detect wrappers for websites. Our framework also includes a consistency layer with which the data extracted by the wrappers can be checked for logical consistency. We evaluate the initial version of REX on three different datasets. Our results clearly show the potential of using templated Web pages to extend the Linked Data Cloud. Moreover, our results indicate the weaknesses of our current implementations and how they can be extended.

Original language: English
Title of host publication: The Semantic Web - ISWC 2014 - 13th International Semantic Web Conference, Proceedings
Editors: Tania Tudorache, Craig Knoblock, Paul Groth, Carole Goble, Chris Welty, Abraham Bernstein, Peter Mika, Denny Vrandečić, Natasha Noy, Krzysztof Janowicz
Number of pages: 16
Publisher: Springer Nature Switzerland AG
Publication date: 2014
Pages: 66-81
ISBN (print): 978-3-319-11963-2
ISBN (electronic): 978-3-319-11964-9
Publication status: Published - 2014
Externally published: Yes
Event: 13th International Semantic Web Conference, ISWC 2014 - Riva del Garda, Italy
Duration: 19.10.2014 - 23.10.2014
Conference number: 13
https://search.worldcat.org/de/title/semantic-web-iswc-2014-13th-international-semantic-web-conference-riva-del-garda-italy-october-19-23-2014-proceedings-part-i/oclc/941304230

Bibliographical note

Publisher Copyright:
© Springer International Publishing Switzerland 2014.
