Approximate tree kernels

Research output: Journal contributionsJournal articlesResearchpeer-review

Authors

  • Konrad Rieck
  • Tammo Krueger
  • Ulf Brefeld
  • Klaus Robert Müller

Convolution kernels for trees provide simple means for learning with tree-structured data. The computation time of tree kernels is quadratic in the size of the trees, since all pairs of nodes need to be compared. Thus, large parse trees, obtained from HTML documents or structured network data, render convolution kernels inapplicable. In this article, we propose an effective approximation technique for parse tree kernels. The approximate tree kernels (ATKs) limit kernel computation to a sparse subset of relevant subtrees and discard redundant structures, such that training and testing of kernel-based learning methods are significantly accelerated. We devise linear programming approaches for identifying such subsets for supervised and unsupervised learning tasks, respectively. Empirically, the approximate tree kernels attain run-time improvements up to three orders of magnitude while preserving the predictive accuracy of regular tree kernels. For unsupervised tasks, the approximate tree kernels even lead to more accurate predictions by identifying relevant dimensions in feature space.

Original languageEnglish
JournalJournal of Machine Learning Research
Volume11
Pages (from-to)555-580
Number of pages26
ISSN1532-4435
Publication statusPublished - 02.2010
Externally publishedYes

Documents

Links

Recently viewed

Publications

  1. Mathematical Modeling for Robot 3D Laser Scanning in Complete Darkness Environments to Advance Pipeline Inspection
  2. Introducing split orders and optimizing operational policies in robotic mobile fulfillment systems
  3. Metrics for Experimentation Programs: Categories, Benefits and Challenges
  4. Scholarly Question Answering Using Large Language Models in the NFDI4DataScience Gateway
  5. Application of design of experiments for laser shock peening process optimization
  6. A survey of empirical studies using transaction level data on exports and imports
  7. A Wavelet Packet Algorithm for Online Detection of Pantograph Vibrations
  8. Processing of CSR communication: insights from the ELM
  9. Experimentally established correlation of friction surfacing process temperature and deposit geometry
  10. Guest Editorial - ''Econometrics of Anonymized Micro Data''
  11. Performance Saga: Interview 01
  12. Active learning for network intrusion detection
  13. A Lyapunov based PI controller with an anti-windup scheme for a purification process of potable water
  14. Embarrassment as a public vs. private emotion and symbolic coping behaviour
  15. Intraspecific trait variation increases species diversity in a trait-based grassland model
  16. »HOW TO MAKE YOUR OWN SAMPLES«
  17. Imaginary practices as the nexus between continuity and disruptive change
  18. Polar Coordinates and Interactive Learning
  19. Developing a sustainable platform for entity annotation benchmarks
  20. Meta-Image – a collaborative environment for the image discourse
  21. Gaining deep leverage? Reflecting and shaping real-world lab impacts through leverage points