FAIR.ReD: Semantic knowledge graph infrastructure for the life sciences

Research output: Journal contributions, Conference abstract in journal, Research

Authors

  • Lars Vogt
  • Sören Auer
  • Thomas Bartolomaeus
  • Pier Luigi Buttigieg
  • Peter Grobe
  • Peter Michalik
  • Markus Stocker
  • Ricardo Usbeck
We would like to present FAIR Research Data: Semantic Knowledge Graph Infrastructure for the Life Sciences (in short, FAIR.ReD), a project initiative that is currently being evaluated for funding. FAIR.ReD is a software environment for developing data management solutions according to the FAIR (Findable, Accessible, Interoperable, Reusable; Wilkinson et al. 2016) data principles. It utilizes what we call a Data Sea Storage, which employs the idea of Data Lakes to decouple data storage from data access but modifies it by storing data in a semantically structured format, as either semantic graphs or semantic tables, instead of storing them in their native form. Storage follows a top-down approach, resulting in a standardized storage model. This allows data to be shared across all FAIR.ReD Knowledge Graph Applications (KGAs) connected to the same Sea, so that newly developed KGAs automatically have access to all contents in the Sea. In contrast, access and export of data follow a bottom-up approach that allows the specification of additional data models to meet the varying domain-specific and programmatic needs for accessing structured data. The FAIR.ReD engine enables bidirectional data conversion between the two storage models and any additional data model, which will substantially reduce conversion workload for data-rich institutes (Fig. 1). Moreover, with the possibility to store data in semantic tables, FAIR.ReD provides high-performance storage for incoming data streams such as sensory data. FAIR.ReD KGAs are modularly organized. Modules can be edited using the FAIR.ReD editor and combined to form coherent KGAs. The editor allows domain experts to develop their own modules and KGAs without requiring any programming experience, thus also allowing smaller projects and individual researchers to build their own FAIR data management solution.

Contents from FAIR.ReD KGAs can be published under a Creative Commons license as documents, micropublications, or nanopublications, each receiving its own DOI. A publication life cycle is implemented in FAIR.ReD and allows published contents to be updated with corrections or additions without overwriting the originally published version. Together with the fact that data and metadata are semantically structured and machine-readable, this means that all contents from FAIR.ReD KGAs will comply with the FAIR Guiding Principles. Because all FAIR.ReD KGAs provide access to semantic knowledge graphs in both a human-readable and a machine-readable version, FAIR.ReD seamlessly integrates the complex RDF (Resource Description Framework) world with a more intuitively comprehensible presentation of data in the form of data entry forms, charts, and tables.

Guided by use cases, the FAIR.ReD environment will be developed using semantic programming, where the source code of an application is stored in its own ontology. The set of source code ontologies of a KGA and its modules provides the steering logic for running the KGA. With this clear separation of steering logic from interpretation logic, semantic programming follows the idea of separating the main layers of an application, analogous to the separation of interpretation logic and presentation logic. Each KGA and module is specified exactly in this way, and their source code ontologies are stored in the Data Sea. Thus, all data and metadata are semantically transparent and so is the data management application itself, which substantially improves their sustainability on all levels of data processing and storing.
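The bidirectional conversion between the standardized graph storage and an additional access model can be illustrated with a minimal sketch. It is not FAIR.ReD code: the abstract does not describe the engine's API, so the triple representation, the `ex:` predicates, the `SPECIMEN_MODEL` mapping, and both function names are hypothetical, and the sketch assumes one value per subject and predicate.

```python
from typing import Dict, List, Set, Tuple

# A triple is (subject, predicate, object); plain strings stand in for IRIs.
Triple = Tuple[str, str, str]

# Hypothetical access model: column name -> predicate that holds its value.
SPECIMEN_MODEL: Dict[str, str] = {
    "species": "ex:hasSpecies",
    "length_mm": "ex:hasLengthMm",
}

# Hypothetical contents of the shared store, kept as a set of triples.
SEA: Set[Triple] = {
    ("ex:specimen1", "ex:hasSpecies", "Araneus diadematus"),
    ("ex:specimen1", "ex:hasLengthMm", "12"),
    ("ex:specimen2", "ex:hasSpecies", "Pholcus phalangioides"),
    ("ex:specimen2", "ex:hasLengthMm", "8"),
}


def graph_to_rows(triples: Set[Triple], model: Dict[str, str]) -> List[Dict[str, str]]:
    """Export: project stored triples into flat records defined by the model."""
    by_subject: Dict[str, Dict[str, str]] = {}
    for subject, predicate, obj in triples:
        by_subject.setdefault(subject, {})[predicate] = obj
    rows = []
    for subject in sorted(by_subject):
        row = {"id": subject}
        for column, predicate in model.items():
            if predicate in by_subject[subject]:
                row[column] = by_subject[subject][predicate]
        rows.append(row)
    return rows


def rows_to_graph(rows: List[Dict[str, str]], model: Dict[str, str]) -> Set[Triple]:
    """Import: turn flat records back into triples using the same model."""
    triples: Set[Triple] = set()
    for row in rows:
        for column, predicate in model.items():
            if column in row:
                triples.add((row["id"], predicate, row[column]))
    return triples


rows = graph_to_rows(SEA, SPECIMEN_MODEL)
round_trip = rows_to_graph(rows, SPECIMEN_MODEL)
```

Because the mapping is declared once and used in both directions, adding a further access model in this sketch only requires another dictionary, not another pair of converters.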
Original language: English
Article number: e37206
Journal: Biodiversity Information Science and Standards
Volume: 3
Number of pages: 3
ISSN: 2535-0897
Publication status: Published - 19.06.2019
Externally published: Yes
