Holistic and scalable ranking of RDF data

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Authors

The volume and number of data sources published using Semantic Web standards such as RDF grow continuously. The largest of these data sources now contain billions of facts and are updated periodically. A large number of applications driven by such data sources require the ranking of entities and facts contained in such knowledge graphs. Hence, there is a need for time-efficient approaches that can compute ranks for entities and facts simultaneously. In this paper, we present the first holistic ranking approach for RDF data. Our approach, dubbed HARE, allows the simultaneous computation of ranks for RDF triples, resources, properties and literals. To this end, HARE relies on the representation of RDF graphs as bi-partite graphs. It then employs a time-efficient extension of the random walk paradigm to bi-partite graphs. We show that by virtue of this extension, the worst-case complexity of HARE is O(n^5) while that of PageRank is O(n^6). In addition, we evaluate the practical efficiency of our approach by comparing it with PageRank on 6 real and 6 synthetic datasets with sizes up to 10^8 triples. Our results show that HARE is up to 2 orders of magnitude faster than PageRank. We also present a brief evaluation of HARE's ranking accuracy by comparing it with that of PageRank applied directly to RDF graphs. Our evaluation on 19 classes of DBpedia demonstrates that there is no statistical difference between HARE and PageRank. We hence conclude that our approach goes beyond the state of the art by allowing the ranking of all RDF entities and of RDF triples without being worse w.r.t. the ranking quality it achieves on resources. HARE is open-source and is available at http://github.com/dice-group/hare.
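
To make the bi-partite random walk idea more concrete, here is a minimal sketch. It is not HARE's implementation (see the repository linked above for that) and makes no attempt to reproduce the paper's matrices or complexity bounds. It assumes one node per triple and one node per term (subject, property, object), with the walker alternating between the two sides. The function name `bipartite_rank`, the `damping` and `iterations` parameters, and the `ex:` identifiers are all illustrative assumptions.

```python
from collections import defaultdict


def bipartite_rank(triples, damping=0.85, iterations=50):
    """Illustrative alternating random walk over a triple/term bipartite graph.

    Returns (term_rank, triple_rank); each sums to 1.
    This is a sketch, not the algorithm from the HARE paper.
    """
    # One side of the graph: every distinct term used in any triple.
    terms = sorted({term for triple in triples for term in triple})

    # Edges: a term is linked to every triple that contains it.
    term_to_triples = defaultdict(list)
    for i, triple in enumerate(triples):
        for term in set(triple):
            term_to_triples[term].append(i)

    term_rank = {term: 1.0 / len(terms) for term in terms}
    triple_rank = [1.0 / len(triples)] * len(triples)

    for _ in range(iterations):
        # Step 1: from each term, move to one of its triples uniformly.
        new_triple = [0.0] * len(triples)
        for term, indices in term_to_triples.items():
            share = term_rank[term] / len(indices)
            for i in indices:
                new_triple[i] += share

        # Step 2: from each triple, move to one of its terms uniformly,
        # with a teleport probability of (1 - damping) as in PageRank.
        new_term = {term: (1.0 - damping) / len(terms) for term in terms}
        for i, triple in enumerate(triples):
            members = set(triple)
            share = damping * new_triple[i] / len(members)
            for term in members:
                new_term[term] += share

        term_rank, triple_rank = new_term, new_triple

    return term_rank, triple_rank


# Hypothetical example data, not taken from the paper's datasets.
example_triples = [
    ("ex:Alice", "ex:knows", "ex:Bob"),
    ("ex:Bob", "ex:knows", "ex:Carol"),
    ("ex:Carol", "ex:livesIn", "ex:Berlin"),
]
term_scores, triple_scores = bipartite_rank(example_triples)
```

In this toy setup a single pass yields scores for triples and for their terms at once, which is the "holistic" aspect the abstract describes; terms that occur in more triples, such as `ex:Bob`, score higher than terms that occur in only one.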

Original language: English
Title of host publication: Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017
Editors: Jian-Yun Nie, Zoran Obradovic, Toyotaro Suzumura, Rumi Ghosh, Raghunath Nambiar, Chonggang Wang, Hui Zang, Ricardo Baeza-Yates, Xiaohua Hu, Jeremy Kepner, Alfredo Cuzzocrea, Jian Tang, Masashi Toyoda
Number of pages: 10
Publisher: Institute of Electrical and Electronics Engineers Inc.
Publication date: 01.07.2017
Pages: 746-755
ISBN (print): 978-1-5386-2714-3, 978-1-5386-2716-7
ISBN (electronic): 978-1-5386-2715-0
DOIs
Publication status: Published - 01.07.2017
Externally published: Yes
Event: 5th IEEE International Conference on Big Data, Big Data 2017 - Boston, United States
Duration: 11.12.2017 - 14.12.2017
Conference number: 5
https://cci.drexel.edu/bigdata/bigdata2017/

Bibliographical note

Publisher Copyright:
© 2017 IEEE.
