Cross-document coreference resolution using latent features
Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review
In recent years, entity detection approaches that combine named entity recognition and entity linking have been used to detect mentions of RDF resources from a given reference knowledge base in unstructured data. In this paper, we address the problem of assigning a single URI to named entities that stand for the same real-world object across documents but are not yet available in the reference knowledge base. This task is known as cross-document coreference resolution and has been addressed by many approaches in the past. We present a preliminary study of a novel take on the task based on latent features derived from matrix factorizations combined with parameter-free graph clustering. We study the influence of different parameters (window size, rank, hardening) on our approach by comparing the F-measures we achieve on the N3 benchmark. Our results suggest that using latent features leads to higher F-measures, with an increase of up to 20.5% on datasets of the N3 collection.
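To make the pipeline named in the abstract more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that context windows are turned into a mention-by-term count matrix, that the matrix factorization is a truncated SVD, and that clustering is done by connected components over a cosine-similarity graph. That last step is a simple stand-in that needs a threshold, unlike the parameter-free graph clustering the paper refers to. All function names, the `threshold` parameter, and the toy data are hypothetical.

```python
import numpy as np


def context_matrix(docs, mentions, window):
    """Count the terms within +/- `window` tokens of each mention.

    docs: list of token lists; mentions: list of (doc_index, token_index).
    Returns a (num_mentions x vocabulary_size) count matrix.
    """
    vocab = {}
    rows = []
    for doc_id, pos in mentions:
        tokens = docs[doc_id]
        lo = max(0, pos - window)
        hi = min(len(tokens), pos + window + 1)
        context = tokens[lo:pos] + tokens[pos + 1:hi]  # skip the mention itself
        rows.append([vocab.setdefault(t, len(vocab)) for t in context])
    matrix = np.zeros((len(mentions), len(vocab)))
    for i, term_ids in enumerate(rows):
        for j in term_ids:
            matrix[i, j] += 1.0
    return matrix


def latent_features(matrix, rank):
    """Assumed factorization: truncated SVD, keeping `rank` latent dimensions."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :rank] * s[:rank]


def cluster(features, threshold):
    """Stand-in clustering: connected components of the graph that links
    mentions whose cosine similarity is at least `threshold`."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.where(norms == 0, 1.0, norms)
    sim = unit @ unit.T
    parent = list(range(len(features)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            if sim[i, j] >= threshold:
                parent[find(i)] = find(j)
    return [find(i) for i in range(len(features))]


# Toy data: four mentions of "jaguar", two about cars and two about animals.
docs = [
    ["jaguar", "car", "engine", "speed"],
    ["jaguar", "engine", "car", "dealer"],
    ["jaguar", "cat", "jungle", "prey"],
    ["jaguar", "jungle", "cat", "habitat"],
]
mentions = [(0, 0), (1, 0), (2, 0), (3, 0)]
labels = cluster(latent_features(context_matrix(docs, mentions, window=2), rank=2), threshold=0.5)
```

In this toy setting the two car mentions end up in one cluster and the two animal mentions in another; each cluster would then receive a single URI. The `window` and `rank` arguments correspond to two of the parameters the abstract says were studied.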
| Original language | English | 
|---|---|
| Title of host publication | Linked Data for Information Extraction 2014: Proceedings of the Second International Workshop on Linked Data for Information Extraction (LD4IE 2014), Riva del Garda, Italy, October 20, 2014 | 
| Editors | Anna Lisa Gentile, Ziqi Zhang, Claudia d'Amato, Heiko Paulheim | 
| Number of pages | 12 | 
| Volume | 1267 | 
| Publisher | Sun Site Central Europe (RWTH Aachen University) | 
| Publication date | 15.10.2014 | 
| Pages | 33-44 | 
| Publication status | Published - 15.10.2014 | 
| Externally published | Yes | 
| Event | 2nd International Workshop on Linked Data for Information Extraction, LD4IE 2014, co-located with the 13th International Semantic Web Conference, ISWC 2014; Riva del Garda, Italy; Duration: 20.10.2014 → …; http://iswc2014.semanticweb.org/index.html | 
Bibliographical note
European Science Foundation
- Informatics
