Approximate tree kernels
Research output: Journal contributions › Journal articles › Research › peer-review
Standard
In: Journal of Machine Learning Research, Vol. 11, 02.2010, p. 555-580.
Research output: Journal contributions › Journal articles › Research › peer-review
Harvard
APA
Vancouver
Bibtex
}
RIS
TY - JOUR
T1 - Approximate tree kernels
AU - Rieck, Konrad
AU - Krueger, Tammo
AU - Brefeld, Ulf
AU - Müller, Klaus Robert
PY - 2010/2
Y1 - 2010/2
N2 - Convolution kernels for trees provide simple means for learning with tree-structured data. The computation time of tree kernels is quadratic in the size of the trees, since all pairs of nodes need to be compared. Thus, large parse trees, obtained from HTML documents or structured network data, render convolution kernels inapplicable. In this article, we propose an effective approximation technique for parse tree kernels. The approximate tree kernels (ATKs) limit kernel computation to a sparse subset of relevant subtrees and discard redundant structures, such that training and testing of kernel-based learning methods are significantly accelerated. We devise linear programming approaches for identifying such subsets for supervised and unsupervised learning tasks, respectively. Empirically, the approximate tree kernels attain run-time improvements up to three orders of magnitude while preserving the predictive accuracy of regular tree kernels. For unsupervised tasks, the approximate tree kernels even lead to more accurate predictions by identifying relevant dimensions in feature space.
AB - Convolution kernels for trees provide simple means for learning with tree-structured data. The computation time of tree kernels is quadratic in the size of the trees, since all pairs of nodes need to be compared. Thus, large parse trees, obtained from HTML documents or structured network data, render convolution kernels inapplicable. In this article, we propose an effective approximation technique for parse tree kernels. The approximate tree kernels (ATKs) limit kernel computation to a sparse subset of relevant subtrees and discard redundant structures, such that training and testing of kernel-based learning methods are significantly accelerated. We devise linear programming approaches for identifying such subsets for supervised and unsupervised learning tasks, respectively. Empirically, the approximate tree kernels attain run-time improvements up to three orders of magnitude while preserving the predictive accuracy of regular tree kernels. For unsupervised tasks, the approximate tree kernels even lead to more accurate predictions by identifying relevant dimensions in feature space.
KW - Approximation
KW - Convolution kernels
KW - Kernel methods
KW - Tree kernels
KW - Informatics
KW - Business informatics
UR - http://www.scopus.com/inward/record.url?scp=77949506401&partnerID=8YFLogxK
M3 - Journal articles
AN - SCOPUS:77949506401
VL - 11
SP - 555
EP - 580
JO - Journal of Machine Learning Research
JF - Journal of Machine Learning Research
SN - 1532-4435
ER -