Holistic and scalable ranking of RDF data

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Authors

The volume and number of data sources published using Semantic Web standards such as RDF grow continuously. The largest of these data sources now contain billions of facts and are updated periodically. A large number of applications driven by such data sources require the ranking of entities and facts contained in such knowledge graphs. Hence, there is a need for time-efficient approaches that can compute ranks for entities and facts simultaneously. In this paper, we present the first holistic ranking approach for RDF data. Our approach, dubbed HARE, allows the simultaneous computation of ranks for RDF triples, resources, properties and literals. To this end, HARE relies on the representation of RDF graphs as bi-partite graphs. It then employs a time-efficient extension of the random walk paradigm to bi-partite graphs. We show that by virtue of this extension, the worst-case complexity of HARE is O(n^5) while that of PageRank is O(n^6). In addition, we evaluate the practical efficiency of our approach by comparing it with PageRank on 6 real and 6 synthetic datasets with sizes up to 10^8 triples. Our results show that HARE is up to 2 orders of magnitude faster than PageRank. We also present a brief evaluation of HARE's ranking accuracy by comparing it with that of PageRank applied directly to RDF graphs. Our evaluation on 19 classes of DBpedia demonstrates that there is no statistically significant difference between the rankings of HARE and PageRank. We hence conclude that our approach goes beyond the state of the art by allowing the ranking of all RDF entities and of RDF triples without being worse with respect to the ranking quality it achieves on resources. HARE is open-source and is available at http://github.com/dice-group/hare.
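The bi-partite random walk described in the abstract can be illustrated with a small sketch. This is not the HARE implementation and makes no claim about its exact formulation: it only assumes that triples form one side of the graph and their subjects, predicates and objects the other, and that a walker alternates between the two sides. The function name `bipartite_ranks`, the damping factor and the fixed iteration count are assumptions chosen for illustration.

```python
from collections import defaultdict


def bipartite_ranks(triples, damping=0.85, iterations=50):
    """Illustrative alternating random walk on a triple/entity bi-partite graph.

    Assumption: every triple is linked to its distinct subject, predicate
    and object; the walker moves entity -> triple -> entity. The damping
    (teleport) term is borrowed from PageRank and is not taken from the paper.
    """
    triples = list(dict.fromkeys(triples))  # drop duplicate triples, keep order
    if not triples:
        raise ValueError("at least one triple is required")

    # Distinct entities of each triple (a node may occur twice in one triple).
    members = [tuple(dict.fromkeys(t)) for t in triples]

    # entity -> indices of the triples it occurs in
    incident = defaultdict(list)
    for index, entities in enumerate(members):
        for entity in entities:
            incident[entity].append(index)

    n = len(incident)
    entity_rank = {entity: 1.0 / n for entity in incident}

    def spread_to_triples(ranks):
        # Each entity splits its rank evenly among its triples.
        result = [0.0] * len(triples)
        for entity, indices in incident.items():
            share = ranks[entity] / len(indices)
            for index in indices:
                result[index] += share
        return result

    for _ in range(iterations):
        triple_rank = spread_to_triples(entity_rank)
        # Each triple splits its rank evenly among its entities;
        # the remaining mass is teleported uniformly.
        new_rank = {entity: (1.0 - damping) / n for entity in incident}
        for index, entities in enumerate(members):
            share = damping * triple_rank[index] / len(entities)
            for entity in entities:
                new_rank[entity] += share
        entity_rank = new_rank

    triple_rank = spread_to_triples(entity_rank)
    return entity_rank, dict(zip(triples, triple_rank))


if __name__ == "__main__":
    example = [("a", "knows", "b"), ("b", "knows", "c"), ("a", "likes", "c")]
    entity_scores, triple_scores = bipartite_ranks(example)
    for name, score in sorted(entity_scores.items(), key=lambda kv: -kv[1]):
        print(f"{name}\t{score:.4f}")
```

A single pass of this kind yields scores for triples and for resources, properties and literals at the same time, which is the sense in which the abstract calls the ranking holistic. In this sketch both score sets sum to 1.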

Original language: English
Title of host publication: Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017
Editors: Jian-Yun Nie, Zoran Obradovic, Toyotaro Suzumura, Rumi Ghosh, Raghunath Nambiar, Chonggang Wang, Hui Zang, Ricardo Baeza-Yates, Xiaohua Hu, Jeremy Kepner, Alfredo Cuzzocrea, Jian Tang, Masashi Toyoda
Number of pages: 10
Publisher: Institute of Electrical and Electronics Engineers Inc.
Publication date: 01.07.2017
Pages: 746-755
ISBN (print): 978-1-5386-2714-3, 978-1-5386-2716-7
ISBN (electronic): 978-1-5386-2715-0
DOIs
Publication status: Published - 01.07.2017
Externally published: Yes
Event: 5th IEEE International Conference on Big Data, Big Data 2017 - Boston, United States
Duration: 11.12.2017 - 14.12.2017
Conference number: 5
https://cci.drexel.edu/bigdata/bigdata2017/

Bibliographical note

Publisher Copyright:
© 2017 IEEE.
