Learning shortest paths in word graphs
Publication: Contributions to collected editions › Articles in conference proceedings › Research › peer-reviewed
The vast amount of information on the Web drives the need
for aggregation and summarisation techniques. We study event extraction
as a text summarisation task over redundant sentences, which is also
known as sentence compression. Given a set of sentences describing the
same event, we aim at generating a summary that is (i) a single sentence,
(ii) simply structured and easily understandable, and (iii) minimal
in terms of the number of words/tokens. Existing approaches for sentence
compression are often based on finding the shortest path in a word graph
that is spanned by the related input sentences. These approaches, however,
deploy manually crafted heuristics for edge weights and lack theoretical
justification. In this paper, we cast sentence compression as a structured
prediction problem. Edges of the compression graph are represented by
features drawn from adjacent nodes so that the corresponding weights are
learned by a generalised linear model. Decoding is performed in polynomial
time by a generalised shortest path algorithm using loss-augmented
inference. We report on preliminary results on artificial and real-world
data.
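
The decoding idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the node-merging rule (identical lowercased tokens share a node), the two edge features, the hand-set weights, and the names `build_word_graph`, `edge_features` and `compress` are all assumptions made for illustration. In the paper the weights are learned by a generalised linear model; here they are fixed and assumed non-negative, so plain Dijkstra search stands in for the generalised shortest path algorithm.

```python
import heapq
from collections import defaultdict

START, END = "<s>", "</s>"  # artificial sentence-boundary nodes


def build_word_graph(sentences):
    """Count in how many sentences each word bigram occurs.

    Simplifying assumption: identical lowercased tokens share one node.
    """
    counts = defaultdict(int)
    for sentence in sentences:
        tokens = [START] + sentence.lower().split() + [END]
        for edge in set(zip(tokens, tokens[1:])):  # once per sentence
            counts[edge] += 1
    return counts


def edge_features(count, n_sentences):
    """Hypothetical features: constant per-edge cost and bigram rarity."""
    return [1.0, 1.0 - count / n_sentences]


def compress(sentences, weights=(1.0, 1.0)):
    """Return the words on the cheapest START-to-END path.

    Edge cost is the dot product of weights and features; the weights
    are hand-set here and must be non-negative for Dijkstra to be valid.
    """
    counts = build_word_graph(sentences)
    adjacency = defaultdict(list)
    for (u, v), count in counts.items():
        feats = edge_features(count, len(sentences))
        cost = sum(w * f for w, f in zip(weights, feats))
        adjacency[u].append((v, cost))

    best = {START: 0.0}
    previous = {}
    queue = [(0.0, START)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == END:
            break
        if dist > best[node]:
            continue  # stale queue entry
        for successor, cost in adjacency[node]:
            candidate = dist + cost
            if candidate < best.get(successor, float("inf")):
                best[successor] = candidate
                previous[successor] = node
                heapq.heappush(queue, (candidate, successor))

    # Walk back from END to START to recover the sentence.
    words = []
    node = previous[END]
    while node != START:
        words.append(node)
        node = previous[node]
    return " ".join(reversed(words))


if __name__ == "__main__":
    print(compress([
        "heavy storm hit the coast on monday",
        "storm hit the northern coast",
    ]))  # prints "storm hit the coast"
```

In this toy example the output is a sentence that appears in neither input: the shortest path starts along the second sentence and switches to the first at the shared nodes. Edges that occur in more sentences are cheaper, which is one plausible way a feature could encode redundancy.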
Original language | English |
---|---|
Title | Knowledge Discovery, Data Mining and Machine Learning (KDML-2013) |
Editors | Andreas Henrich, Hans-Christian Sperker |
Number of pages | 4 |
Place of publication | Bamberg |
Publisher | Lehrstuhl für Medieninformatik - Universität Bamberg |
Publication date | 2014 |
Pages | 113-116 |
Publication status | Published - 2014 |
Externally published | Yes |
Event | Lernen, Wissen und Adaptivität - LWA 2013, Bamberg, Germany. Duration: 07.10.2013 → 09.10.2013. http://www.minf.uni-bamberg.de/lwa2013/ |
- Computer science
- Business informatics