Cross-document coreference resolution using latent features
Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review
Standard
Linked Data for Information Extraction 2014: Proceedings of the Second International Workshop on Linked Data for Information Extraction (LD4IE 2014), Riva del Garda, Italy, October 20, 2014. ed. / Anna Lisa Gentile; Ziqi Zhang; Claudia d'Amato; Heiko Paulheim. Vol. 1267 Sun Site Central Europe (RWTH Aachen University), 2014. p. 33-44 (CEUR Workshop Proceedings; Vol. 1267).
RIS
TY - CHAP
T1 - Cross-document coreference resolution using latent features
AU - Ngonga Ngomo, Axel Cyrille
AU - Röder, Michael
AU - Usbeck, Ricardo
N1 - European Science Foundation
PY - 2014/10/15
Y1 - 2014/10/15
N2 - In recent years, entity detection approaches that combine named entity recognition and entity linking have been used to detect mentions of RDF resources from a given reference knowledge base in unstructured data. In this paper, we address the problem of assigning a single URI to named entities that stand for the same real-world object across documents but are not yet available in the reference knowledge base. This task is known as cross-document coreference resolution and has been addressed by many approaches in the past. We present a preliminary study of a novel take on the task based on latent features derived from matrix factorizations combined with parameter-free graph clustering. We study the influence of different parameters (window size, rank, hardening) on our approach by comparing the F-measures we achieve on the N3 benchmark. Our results suggest that using latent features leads to higher F-measures, with an increase of up to 20.5% on datasets of the N3 collection.
KW - Informatics
UR - http://www.scopus.com/inward/record.url?scp=84939863962&partnerID=8YFLogxK
M3 - Article in conference proceedings
AN - SCOPUS:84939863962
VL - 1267
T3 - CEUR Workshop Proceedings
SP - 33
EP - 44
BT - Linked Data for Information Extraction 2014.
A2 - Gentile, Anna Lisa
A2 - Zhang, Ziqi
A2 - d'Amato, Claudia
A2 - Paulheim, Heiko
PB - Sun Site Central Europe (RWTH Aachen University)
T2 - 2nd International Workshop on Linked Data for Information Extraction, LD4IE 2014, Co-located with the 13th International Semantic Web Conference, ISWC 2014
Y2 - 20 October 2014
ER -