Web-scale extension of RDF knowledge bases from templated websites

Publication: Contributions to collected works › Articles in conference proceedings › Research › Peer-reviewed

Authors

  • Lorenz Bühmann
  • Ricardo Usbeck
  • Axel Cyrille Ngonga Ngomo
  • Muhammad Saleem
  • Andreas Both
  • Valter Crescenzi
  • Paolo Merialdo
  • Disheng Qiu

Only a small fraction of the information on the Web is represented as Linked Data. This lack of coverage is partly due to the paradigms followed so far to extract Linked Data. While converting structured data to RDF is well supported by tools, most approaches to extract RDF from semi-structured data rely on extraction methods based on ad-hoc solutions. In this paper, we present a holistic and open-source framework for the extraction of RDF from templated websites. We discuss the architecture of the framework and the initial implementation of each of its components. In particular, we present a novel wrapper induction technique that does not require any human supervision to detect wrappers for web sites. Our framework also includes a consistency layer with which the data extracted by the wrappers can be checked for logical consistency. We evaluate the initial version of REX on three different datasets. Our results clearly show the potential of using templated Web pages to extend the Linked Data Cloud. Moreover, our results indicate the weaknesses of our current implementations and how they can be extended.

Original language: English
Title: The Semantic Web - ISWC 2014 - 13th International Semantic Web Conference, Proceedings
Editors: Tania Tudorache, Craig Knoblock, Paul Groth, Carole Goble, Chris Welty, Abraham Bernstein, Peter Mika, Denny Vrandečić, Natasha Noy, Krzysztof Janowicz
Number of pages: 16
Publisher: Springer Nature Switzerland AG
Publication date: 2014
Pages: 66-81
ISBN (print): 978-3-319-11963-2
ISBN (electronic): 978-3-319-11964-9
Publication status: Published - 2014
Externally published: Yes
Event: 13th International Semantic Web Conference, ISWC 2014 - Riva del Garda, Italy
Duration: 19.10.2014 to 23.10.2014
Conference number: 13
https://search.worldcat.org/de/title/semantic-web-iswc-2014-13th-international-semantic-web-conference-riva-del-garda-italy-october-19-23-2014-proceedings-part-i/oclc/941304230

Bibliographical note

Publisher Copyright:
© Springer International Publishing Switzerland 2014.
