Web-scale extension of RDF knowledge bases from templated websites

Research output: Contributions to collected editions/works, Article in conference proceedings, Research, peer-review

Authors

  • Lorenz Bühmann
  • Ricardo Usbeck
  • Axel Cyrille Ngonga Ngomo
  • Muhammad Saleem
  • Andreas Both
  • Valter Crescenzi
  • Paolo Merialdo
  • Disheng Qiu

Only a small fraction of the information on the Web is represented as Linked Data. This lack of coverage is partly due to the paradigms followed so far to extract Linked Data. While converting structured data to RDF is well supported by tools, most approaches to extract RDF from semi-structured data rely on extraction methods based on ad-hoc solutions. In this paper, we present a holistic and open-source framework for the extraction of RDF from templated websites. We discuss the architecture of the framework and the initial implementation of each of its components. In particular, we present a novel wrapper induction technique that does not require any human supervision to detect wrappers for websites. Our framework also includes a consistency layer with which the data extracted by the wrappers can be checked for logical consistency. We evaluate the initial version of REX on three different datasets. Our results clearly show the potential of using templated Web pages to extend the Linked Data Cloud. Moreover, our results indicate the weaknesses of our current implementations and how they can be extended.

Original language: English
Title of host publication: The Semantic Web - ISWC 2014 - 13th International Semantic Web Conference, Proceedings
Editors: Tania Tudorache, Craig Knoblock, Paul Groth, Carole Goble, Chris Welty, Abraham Bernstein, Peter Mika, Denny Vrandečić, Natasha Noy, Krzysztof Janowicz
Number of pages: 16
Publisher: Springer Nature Switzerland AG
Publication date: 2014
Pages: 66-81
ISBN (print): 978-3-319-11963-2
ISBN (electronic): 978-3-319-11964-9
Publication status: Published - 2014
Externally published: Yes
Event: 13th International Semantic Web Conference, ISWC 2014 - Riva del Garda, Italy
Duration: 19.10.2014 - 23.10.2014
Conference number: 13
https://search.worldcat.org/de/title/semantic-web-iswc-2014-13th-international-semantic-web-conference-riva-del-garda-italy-october-19-23-2014-proceedings-part-i/oclc/941304230

Bibliographical note

Publisher Copyright:
© Springer International Publishing Switzerland 2014.
