Biomedical Entity Linking with Triple-aware Pre-Training

Research output: Contributions to collected editions/works › Article in conference proceedings › Research

Standard

Biomedical Entity Linking with Triple-aware Pre-Training. / Yan, Xi; Möller, Cedric; Usbeck, Ricardo.
Conference XXX. 2023.

APA

Yan, X., Möller, C., & Usbeck, R. (2023). Biomedical Entity Linking with Triple-aware Pre-Training. Manuscript in preparation. In Conference XXX. https://doi.org/10.48550/arXiv.2308.14429

Vancouver

Yan X, Möller C, Usbeck R. Biomedical Entity Linking with Triple-aware Pre-Training. In: Conference XXX. 2023. doi: 10.48550/arXiv.2308.14429

BibTeX

@inbook{70dae52eff184dcab552ec79040de9c7,
title = "Biomedical Entity Linking with Triple-aware Pre-Training",
abstract = "Linking biomedical entities is an essential aspect in biomedical natural language processing tasks, such as text mining and question answering. However, a difficulty of linking the biomedical entities using current large language models (LLM) trained on a general corpus is that biomedical entities are scarcely distributed in texts and therefore have been rarely seen during training by the LLM. At the same time, those LLMs are not aware of high level semantic connection between different biomedical entities, which are useful in identifying similar concepts in different textual contexts. To cope with aforementioned problems, some recent works focused on injecting knowledge graph information into LLMs. However, former methods either ignore the relational knowledge of the entities or lead to catastrophic forgetting. Therefore, we propose a novel framework to pre-train the powerful generative LLM by a corpus synthesized from a KG. In the evaluations we are unable to confirm the benefit of including synonym, description or relational information.",
keywords = "cs.CL, cs.AI, Informatics",
author = "Xi Yan and Cedric M{\"o}ller and Ricardo Usbeck",
year = "2023",
month = aug,
day = "28",
doi = "10.48550/arXiv.2308.14429",
language = "English",
booktitle = "Conference XXX",

}

RIS

TY - CHAP

T1 - Biomedical Entity Linking with Triple-aware Pre-Training

AU - Yan, Xi

AU - Möller, Cedric

AU - Usbeck, Ricardo

PY - 2023/8/28

Y1 - 2023/8/28

N2 - Linking biomedical entities is an essential aspect in biomedical natural language processing tasks, such as text mining and question answering. However, a difficulty of linking the biomedical entities using current large language models (LLM) trained on a general corpus is that biomedical entities are scarcely distributed in texts and therefore have been rarely seen during training by the LLM. At the same time, those LLMs are not aware of high level semantic connection between different biomedical entities, which are useful in identifying similar concepts in different textual contexts. To cope with aforementioned problems, some recent works focused on injecting knowledge graph information into LLMs. However, former methods either ignore the relational knowledge of the entities or lead to catastrophic forgetting. Therefore, we propose a novel framework to pre-train the powerful generative LLM by a corpus synthesized from a KG. In the evaluations we are unable to confirm the benefit of including synonym, description or relational information.

AB - Linking biomedical entities is an essential aspect in biomedical natural language processing tasks, such as text mining and question answering. However, a difficulty of linking the biomedical entities using current large language models (LLM) trained on a general corpus is that biomedical entities are scarcely distributed in texts and therefore have been rarely seen during training by the LLM. At the same time, those LLMs are not aware of high level semantic connection between different biomedical entities, which are useful in identifying similar concepts in different textual contexts. To cope with aforementioned problems, some recent works focused on injecting knowledge graph information into LLMs. However, former methods either ignore the relational knowledge of the entities or lead to catastrophic forgetting. Therefore, we propose a novel framework to pre-train the powerful generative LLM by a corpus synthesized from a KG. In the evaluations we are unable to confirm the benefit of including synonym, description or relational information.

KW - cs.CL

KW - cs.AI

KW - Informatics

U2 - 10.48550/arXiv.2308.14429

DO - 10.48550/arXiv.2308.14429

M3 - Article in conference proceedings

BT - Conference XXX

ER -
