N3 - A collection of datasets for named entity recognition and disambiguation in the NLP interchange format

Research output: Contributions to collected editions/works > Article in conference proceedings > Research > peer-review

Authors

  • Michael Röder
  • Ricardo Usbeck
  • Sebastian Hellmann
  • Daniel Gerber
  • Andreas Both

Extracting Linked Data following the Semantic Web principle from unstructured sources has become a key challenge for scientific research. Named Entity Recognition and Disambiguation are two basic operations in this extraction process. One step towards the realization of the Semantic Web vision and the development of highly accurate tools is the availability of data for validating the quality of processes for Named Entity Recognition and Disambiguation as well as for algorithm tuning. This article presents three novel, manually curated and annotated corpora (N3). All of them are based on a free license and stored in the NLP Interchange Format to leverage the Linked Data character of our datasets.

Original language: English
Title of host publication: Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014
Editors: Nicoletta Calzolari, Khalid Choukri, Sara Goggi, Thierry Declerck, Joseph Mariani, Bente Maegaard, Asuncion Moreno, Jan Odijk, Helene Mazo, Stelios Piperidis, Hrafn Loftsson
Number of pages: 5
Place of Publication: Reykjavik, Iceland
Publisher: European Language Resources Association (ELRA)
Publication date: 05.2014
Pages: 3529-3533
ISBN (electronic): 9782951740884
Publication status: Published - 05.2014
Externally published: Yes
Event: 9th International Conference on Language Resources and Evaluation, LREC 2014 - Reykjavik, Iceland
Duration: 26.05.2014 - 31.05.2014
Conference number: 9
http://www.lrec-conf.org/proceedings/lrec2014/index.html

Bibliographical note

We thank Luise Erfurth and Didier Cherix for helping us create annotations of the datasets and Jens Lehmann for his feedback. Special thanks go to news.de for allowing us to use their articles. Parts of this work were supported by the ESF and the Free State of Saxony.

ACL materials are Copyright © 1963–2023
