N3 - A collection of datasets for named entity recognition and disambiguation in the NLP interchange format

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Authors

  • Michael Röder
  • Ricardo Usbeck
  • Sebastian Hellmann
  • Daniel Gerber
  • Andreas Both

Extracting Linked Data following the Semantic Web principle from unstructured sources has become a key challenge for scientific research. Named Entity Recognition and Disambiguation are two basic operations in this extraction process. One step towards the realization of the Semantic Web vision and the development of highly accurate tools is the availability of data for validating the quality of processes for Named Entity Recognition and Disambiguation as well as for algorithm tuning. This article presents three novel, manually curated and annotated corpora (N3). All of them are based on a free license and stored in the NLP Interchange Format to leverage the Linked Data character of our datasets.

Original language: English
Title of host publication: Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014
Editors: Nicoletta Calzolari, Khalid Choukri, Sara Goggi, Thierry Declerck, Joseph Mariani, Bente Maegaard, Asuncion Moreno, Jan Odijk, Helene Mazo, Stelios Piperidis, Hrafn Loftsson
Number of pages: 5
Place of Publication: Reykjavik, Iceland
Publisher: European Language Resources Association (ELRA)
Publication date: 05.2014
Pages: 3529-3533
ISBN (electronic): 9782951740884
Publication status: Published - 05.2014
Externally published: Yes
Event: 9th International Conference on Language Resources and Evaluation, LREC 2014 - Reykjavik, Iceland
Duration: 26.05.2014 - 31.05.2014
Conference number: 9
http://www.lrec-conf.org/proceedings/lrec2014/index.html

Bibliographical note

We thank Luise Erfurth and Didier Cherix for helping us create annotations of the datasets and Jens Lehmann for his feedback. Special thanks go to news.de for allowing us to use their articles. Parts of this work were supported by the ESF and the Free State of Saxony.

ACL materials are Copyright © 1963–2023
