Cross-document coreference resolution using latent features

Publication: Contributions to collected editions; Articles in conference proceedings; Research; Peer-reviewed

Authors

In recent years, entity detection approaches that combine named entity recognition and entity linking have been used to detect mentions of RDF resources from a given reference knowledge base in unstructured data. In this paper, we address the problem of assigning a single URI to named entities that stand for the same real-world object across documents but are not yet available in the reference knowledge base. This task is known as cross-document coreference resolution and has been addressed by many approaches in the past. We present a preliminary study of a novel take on the task, based on latent features derived from matrix factorizations combined with parameter-free graph clustering. We study the influence of different parameters (window size, rank, hardening) on our approach by comparing the F-measures we achieve on the N3 benchmark. Our results suggest that using latent features leads to higher F-measures, with an increase of up to 20.5% on datasets of the N3 collection.
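
As a rough illustration of the general idea named in the abstract (latent features from a matrix factorization, followed by graph-based grouping of mentions), the following minimal sketch may help. It is not the authors' implementation and makes several assumptions: the factorization is a truncated SVD of a mention-by-context-word count matrix, the `window` and `rank` arguments only loosely mirror the parameters mentioned above, the hardening step is not modeled, and the parameter-free graph clustering of the paper is replaced by plain connected components over a thresholded cosine-similarity graph (which does need a `threshold`). All function names are hypothetical.

```python
import numpy as np


def context_matrix(docs, mentions, window):
    """Count the words within `window` tokens of each mention.

    docs: list of token lists; mentions: list of (doc_index, token_position).
    Returns a (num_mentions x vocabulary_size) count matrix.
    """
    vocab = {}
    rows = []
    for doc_id, pos in mentions:
        tokens = docs[doc_id]
        lo, hi = max(0, pos - window), min(len(tokens), pos + window + 1)
        counts = {}
        for i in range(lo, hi):
            if i == pos:
                continue  # skip the mention token itself
            j = vocab.setdefault(tokens[i], len(vocab))
            counts[j] = counts.get(j, 0) + 1
        rows.append(counts)
    matrix = np.zeros((len(rows), len(vocab)))
    for r, counts in enumerate(rows):
        for j, c in counts.items():
            matrix[r, j] = c
    return matrix


def latent_features(matrix, rank):
    """Assumed factorization: truncated SVD, keeping `rank` components."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :rank] * s[:rank]


def cluster(features, threshold):
    """Stand-in clustering: connected components of the graph that links
    mentions whose cosine similarity is at least `threshold`."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.where(norms == 0, 1.0, norms)
    sim = unit @ unit.T
    n = len(features)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a in range(n):
        for b in range(a + 1, n):
            if sim[a, b] >= threshold:
                parent[find(a)] = find(b)
    return [find(i) for i in range(n)]


# Toy data: the first two documents mention the same person under
# different surface forms; the third mentions someone else.
docs = [
    ["Merkel", "chancellor", "germany", "berlin"],
    ["A._Merkel", "chancellor", "germany", "berlin"],
    ["Jordan", "basketball", "chicago", "bulls"],
]
mentions = [(0, 0), (1, 0), (2, 0)]

features = latent_features(context_matrix(docs, mentions, window=3), rank=2)
labels = cluster(features, threshold=0.9)
```

In this toy setup, mentions that end up with the same label would be assigned the same (new) URI. The threshold is an artifact of the stand-in clustering step, not something the abstract describes.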

Original language: English
Title: Linked Data for Information Extraction 2014: Proceedings of the Second International Workshop on Linked Data for Information Extraction (LD4IE 2014), Riva del Garda, Italy, October 20, 2014
Editors: Anna Lisa Gentile, Ziqi Zhang, Claudia d'Amato, Heiko Paulheim
Number of pages: 12
Volume: 1267
Publisher: Sun Site Central Europe (RWTH Aachen University)
Publication date: 15.10.2014
Pages: 33-44
Publication status: Published - 15.10.2014
Externally published: Yes
Event: 2nd International Workshop on Linked Data for Information Extraction, LD4IE 2014, Co-located with the 13th International Semantic Web Conference, ISWC 2014 - Riva del Garda, Italy
Duration: 20.10.2014 → …
http://iswc2014.semanticweb.org/index.html