DeFacto - Temporal and multilingual deep fact validation
Research output: Journal contributions › Journal articles › Research › peer-review
In: Journal of Web Semantics, Vol. 35, 01.12.2015, p. 85-101.
RIS
TY - JOUR
T1 - DeFacto - Temporal and multilingual deep fact validation
AU - Gerber, Daniel
AU - Esteves, Diego
AU - Lehmann, Jens
AU - Bühmann, Lorenz
AU - Usbeck, Ricardo
AU - Ngonga Ngomo, Axel Cyrille
AU - Speck, René
N1 - Publisher Copyright: © 2015 Elsevier B.V.
PY - 2015/12/1
Y1 - 2015/12/1
N2 - One of the main tasks when creating and maintaining knowledge bases is to validate facts and provide sources for them in order to ensure correctness and traceability of the provided knowledge. So far, this task is often addressed by human curators in a three-step process: issuing appropriate keyword queries for the statement to check using standard search engines, retrieving potentially relevant documents, and screening those documents for relevant content. The drawbacks of this process are manifold. Most importantly, it is very time-consuming, as the experts have to carry out several search processes and must often read several documents. In this article, we present DeFacto (Deep Fact Validation), an algorithm able to validate facts by finding trustworthy sources for them on the Web. DeFacto aims to provide an effective way of validating facts by supplying the user with relevant excerpts of web pages as well as useful additional information, including a score for the confidence DeFacto has in the correctness of the input fact. To achieve this goal, DeFacto collects and combines evidence from web pages written in several languages. In addition, DeFacto provides support for facts with a temporal scope, i.e., it can estimate in which time frame a fact was valid. Because the automatic evaluation of facts has received little attention so far, generic benchmarks for evaluating such frameworks were not previously available. We thus also present a generic evaluation framework for fact checking and make it publicly available.
KW - Fact validation
KW - NLP
KW - Provenance
KW - Web of Data
KW - Informatics
KW - Business informatics
UR - http://www.scopus.com/inward/record.url?scp=84948698827&partnerID=8YFLogxK
U2 - 10.1016/j.websem.2015.08.001
DO - 10.1016/j.websem.2015.08.001
M3 - Journal articles
AN - SCOPUS:84948698827
VL - 35
SP - 85
EP - 101
JO - Journal of Web Semantics
JF - Journal of Web Semantics
SN - 1570-8268
ER -