Holistic and scalable ranking of RDF data

Publication: Contributions to collected editions; articles in conference proceedings; research; peer-reviewed

Authors

The volume and number of data sources published using Semantic Web standards such as RDF grows continuously. The largest of these data sources now contain billions of facts and are updated periodically. A large number of applications driven by such data sources requires the ranking of entities and facts contained in such knowledge graphs. Hence, there is a need for time-efficient approaches that can compute ranks for entities and facts simultaneously. In this paper, we present the first holistic ranking approach for RDF data. Our approach, dubbed HARE, allows the simultaneous computation of ranks for RDF triples, resources, properties and literals. To this end, HARE relies on the representation of RDF graphs as bi-partite graphs. It then employs a time-efficient extension of the random walk paradigm to bi-partite graphs. We show that by virtue of this extension, the worst-case complexity of HARE is O(n^5) while that of PageRank is O(n^6). In addition, we evaluate the practical efficiency of our approach by comparing it with PageRank on 6 real and 6 synthetic datasets with sizes up to 10^8 triples. Our results show that HARE is up to 2 orders of magnitude faster than PageRank. We also present a brief evaluation of HARE's ranking accuracy by comparing it with that of PageRank applied directly to RDF graphs. Our evaluation on 19 classes of DBpedia demonstrates that there is no statistical difference between HARE and PageRank. We hence conclude that our approach goes beyond the state of the art by allowing the ranking of all RDF entities and of RDF triples without being worse w.r.t. the ranking quality it achieves on resources. HARE is open-source and is available at http://github.com/dice-group/hare.
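
The idea of ranking triples and terms together through a random walk on a bipartite graph can be illustrated with a small sketch. This is not the HARE implementation and does not reproduce the paper's transition matrices or its complexity bounds. It is a generic damped random walk that assumes a uniform choice at every step. The function name `bipartite_ranks`, the `damping` and `iterations` parameters, and the `ex:` example terms are all hypothetical.

```python
from collections import defaultdict


def bipartite_ranks(triples, damping=0.85, iterations=50):
    """Rank RDF terms and triples with a damped walk on a bipartite graph.

    Illustrative assumption: one side of the graph holds the triples, the
    other side holds the terms (subjects, predicates, objects) they mention.
    """
    # Map every triple to its distinct terms, and every term to its triples.
    terms_of = {t: sorted(set(t)) for t in triples}
    triples_of = defaultdict(list)
    for t, members in terms_of.items():
        for term in members:
            triples_of[term].append(t)

    terms = list(triples_of)
    term_rank = {term: 1.0 / len(terms) for term in terms}
    triple_rank = {}
    for _ in range(iterations):
        # Half step 1: every term spreads its rank evenly over its triples.
        triple_rank = {t: 0.0 for t in terms_of}
        for term, rank in term_rank.items():
            share = rank / len(triples_of[term])
            for t in triples_of[term]:
                triple_rank[t] += share
        # Half step 2: every triple spreads its rank evenly over its terms,
        # mixed with a uniform teleport so the walk cannot get stuck.
        new_rank = {term: (1.0 - damping) / len(terms) for term in terms}
        for t, rank in triple_rank.items():
            share = damping * rank / len(terms_of[t])
            for term in terms_of[t]:
                new_rank[term] += share
        term_rank = new_rank
    return term_rank, triple_rank


# Toy graph with made-up terms, for illustration only.
example_triples = [
    ("ex:Alice", "ex:knows", "ex:Bob"),
    ("ex:Bob", "ex:knows", "ex:Carol"),
    ("ex:Alice", "ex:name", '"Alice"'),
]
term_rank, triple_rank = bipartite_ranks(example_triples)
for term, score in sorted(term_rank.items(), key=lambda kv: -kv[1]):
    print(f"{term}\t{score:.4f}")
```

Both returned dictionaries sum to one, so a single pass over the same graph yields comparable scores for resources, properties, literals and triples. In this toy graph, terms that occur in more triples end up with higher scores.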

Original language: English
Title: Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017
Editors: Jian-Yun Nie, Zoran Obradovic, Toyotaro Suzumura, Rumi Ghosh, Raghunath Nambiar, Chonggang Wang, Hui Zang, Ricardo Baeza-Yates, Xiaohua Hu, Jeremy Kepner, Alfredo Cuzzocrea, Jian Tang, Masashi Toyoda
Number of pages: 10
Publisher: Institute of Electrical and Electronics Engineers Inc.
Publication date: 01.07.2017
Pages: 746-755
ISBN (print): 978-1-5386-2714-3, 978-1-5386-2716-7
ISBN (electronic): 978-1-5386-2715-0
DOIs
Publication status: Published - 01.07.2017
Externally published: Yes
Event: 5th IEEE International Conference on Big Data, Big Data 2017 - Boston, United States
Duration: 11.12.2017 - 14.12.2017
Conference number: 5
https://cci.drexel.edu/bigdata/bigdata2017/

Bibliographical note

Funding Information:
This work was supported by the H2020 project HOBBIT (GA no. 688227), the EuroStars projects DIESEL (E!9367) and QAMEL (E!9725) as well as the BMVI projects LIMBO (project no. 19F2029C) and OPAL (project no. 19F20284).

Publisher Copyright:
© 2017 IEEE.
