Cross-document coreference resolution using latent features
Publication: Contributions to collected editions › Articles in conference proceedings › Research › peer-reviewed
In recent years, entity detection approaches that combine named entity recognition and entity linking have been used to detect mentions of RDF resources from a given reference knowledge base in unstructured data. In this paper, we address the problem of assigning a single URI to named entities that stand for the same real-world object across documents but are not yet available in the reference knowledge base. This task is known as cross-document coreference resolution and has been addressed by a variety of approaches in the past. We present a preliminary study of a novel take on the task, based on latent features derived from matrix factorizations combined with parameter-free graph clustering. We study the influence of different parameters (window size, rank, hardening) on our approach by comparing the F-measures we achieve on the N3 benchmark. Our results suggest that using latent features leads to higher F-measures, with an increase of up to 20.5% on datasets of the N3 collection.
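The pipeline named in the abstract (context window, low-rank factorization, hardening, graph clustering) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the toy documents, the use of a truncated SVD as the matrix factorization, the reading of "hardening" as thresholding cosine similarities into 0/1 edges, and the use of connected components as a simple stand-in for the parameter-free graph clustering.

```python
import numpy as np

def context_matrix(docs, mentions, window):
    """Rows are mentions, columns are vocabulary tokens; each cell counts
    how often a token occurs within `window` positions of the mention."""
    vocab = sorted({tok for doc in docs for tok in doc})
    col = {tok: j for j, tok in enumerate(vocab)}
    X = np.zeros((len(mentions), len(vocab)))
    for i, (d, pos) in enumerate(mentions):
        doc = docs[d]
        lo, hi = max(0, pos - window), min(len(doc), pos + window + 1)
        for k in range(lo, hi):
            if k != pos:  # skip the mention token itself
                X[i, col[doc[k]]] += 1
    return X

def latent_features(X, rank):
    """Assumed factorization: truncated SVD, keeping the top `rank` factors."""
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :rank] * s[:rank]

def harden(F, threshold):
    """Assumed meaning of 'hardening': cosine similarity cut to 0/1 edges."""
    norms = np.linalg.norm(F, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for empty contexts
    G = F / norms
    return (G @ G.T) >= threshold

def connected_components(A):
    """Stand-in clustering (not the paper's algorithm): label each
    connected component of the hardened graph with an integer."""
    n = len(A)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            u = stack.pop()
            for v in range(n):
                if A[u][v] and labels[v] == -1:
                    labels[v] = current
                    stack.append(v)
        current += 1
    return labels

# Hypothetical toy data: three documents, one mention each as (doc index, token index).
docs = [
    ["president", "obama", "visited", "berlin"],
    ["president", "obama", "met", "merkel"],
    ["engine", "jaguar", "car", "fast"],
]
mentions = [(0, 1), (1, 1), (2, 1)]

X = context_matrix(docs, mentions, window=1)
F = latent_features(X, rank=2)
labels = connected_components(harden(F, threshold=0.5))
print(labels)  # [0, 0, 1]
```

In this toy run the two "obama" mentions share a context token and end up in one cluster, which would then receive a single URI, while the unrelated mention stays separate. The window size, rank, and threshold correspond loosely to the parameters the abstract says were studied; the values used here are arbitrary.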
Original language | English |
---|---|
Title | Linked Data for Information Extraction 2014: Proceedings of the Second International Workshop on Linked Data for Information Extraction (LD4IE 2014), Riva del Garda, Italy, October 20, 2014 |
Editors | Anna Lisa Gentile, Ziqi Zhang, Claudia d'Amato, Heiko Paulheim |
Number of pages | 12 |
Volume | 1267 |
Publisher | Sun Site Central Europe (RWTH Aachen University) |
Publication date | 15.10.2014 |
Pages | 33-44 |
Publication status | Published - 15.10.2014 |
Externally published | Yes |
Event | 2nd International Workshop on Linked Data for Information Extraction, LD4IE 2014, co-located with the 13th International Semantic Web Conference, ISWC 2014 - Riva del Garda, Italy. Duration: 20.10.2014 → … http://iswc2014.semanticweb.org/index.html |
- Computer science