Web-scale extension of RDF knowledge bases from templated websites

Publication: Contributions to collected editions > Articles in conference proceedings > Research > peer-reviewed

Authors

  • Lorenz Bühmann
  • Ricardo Usbeck
  • Axel Cyrille Ngonga Ngomo
  • Muhammad Saleem
  • Andreas Both
  • Valter Crescenzi
  • Paolo Merialdo
  • Disheng Qiu

Only a small fraction of the information on the Web is represented as Linked Data. This lack of coverage is partly due to the paradigms followed so far to extract Linked Data. While converting structured data to RDF is well supported by tools, most approaches to extract RDF from semi-structured data rely on extraction methods based on ad-hoc solutions. In this paper, we present a holistic and open-source framework for the extraction of RDF from templated websites. We discuss the architecture of the framework and the initial implementation of each of its components. In particular, we present a novel wrapper induction technique that does not require any human supervision to detect wrappers for web sites. Our framework also includes a consistency layer with which the data extracted by the wrappers can be checked for logical consistency. We evaluate the initial version of REX on three different datasets. Our results clearly show the potential of using templated Web pages to extend the Linked Data Cloud. Moreover, our results indicate the weaknesses of our current implementations and how they can be extended.

Original language: English
Title: The Semantic Web - ISWC 2014 - 13th International Semantic Web Conference, Proceedings
Editors: Tania Tudorache, Craig Knoblock, Paul Groth, Carole Goble, Chris Welty, Abraham Bernstein, Peter Mika, Denny Vrandečić, Natasha Noy, Krzysztof Janowicz
Number of pages: 16
Publisher: Springer Nature Switzerland AG
Publication date: 2014
Pages: 66-81
ISBN (print): 978-3-319-11963-2
ISBN (electronic): 978-3-319-11964-9
Publication status: Published - 2014
Externally published: Yes
Event: 13th International Semantic Web Conference, ISWC 2014 - Riva del Garda, Italy
Duration: 19.10.2014 - 23.10.2014
Conference number: 13
https://search.worldcat.org/de/title/semantic-web-iswc-2014-13th-international-semantic-web-conference-riva-del-garda-italy-october-19-23-2014-proceedings-part-i/oclc/941304230

Bibliographic note

Publisher Copyright:
© Springer International Publishing Switzerland 2014.
