ℓp-norm multiple kernel learning

Research output: Journal contributions, Journal articles, Research, peer-review

Standard

ℓp-norm multiple kernel learning. / Kloft, Marius; Brefeld, Ulf; Sonnenburg, Sören et al.
In: Journal of Machine Learning Research, Vol. 12, 03.2011, p. 953-997.

Vancouver

Kloft M, Brefeld U, Sonnenburg S, Zien A. ℓp-norm multiple kernel learning. Journal of Machine Learning Research. 2011 Mar;12:953-997.

Bibtex

@article{1c0bf91874a94030a4f767b1adcafcde,
title = "ℓp-norm multiple kernel learning",
abstract = "Learning linear combinations of multiple kernels is an appealing strategy when the right choice of features is unknown. Previous approaches to multiple kernel learning (MKL) promote sparse kernel combinations to support interpretability and scalability. Unfortunately, this ℓ1-norm MKL is rarely observed to outperform trivial baselines in practical applications. To allow for robust kernel mixtures that generalize well, we extend MKL to arbitrary norms. We devise new insights on the connection between several existing MKL formulations and develop two efficient interleaved optimization strategies for arbitrary norms, that is ℓp-norms with p ≥ 1. This interleaved optimization is much faster than the commonly used wrapper approaches, as demonstrated on several data sets. A theoretical analysis and an experiment on controlled artificial data shed light on the appropriateness of sparse, non-sparse and ℓ∞-norm MKL in various scenarios. Importantly, empirical applications of ℓp-norm MKL to three real-world problems from computational biology show that non-sparse MKL achieves accuracies that surpass the state-of-the-art. Data sets, source code to reproduce the experiments, implementations of the algorithms, and further information are available at http://doc.ml.tu-berlin.de/nonsparse-mkl/.",
keywords = "Bioinformatics, Block coordinate descent, Convex conjugate, Generalization bounds, Large scale optimization, Learning kernels, Multiple kernel learning, Non-sparse, Rademacher complexity, Support vector machine, Informatics",
author = "Marius Kloft and Ulf Brefeld and S{\"o}ren Sonnenburg and Alexander Zien",
year = "2011",
month = mar,
language = "English",
volume = "12",
pages = "953--997",
journal = "Journal of Machine Learning Research",
issn = "1532-4435",
publisher = "MIT Press",

}

RIS

TY - JOUR

T1 - ℓp-norm multiple kernel learning

AU - Kloft, Marius

AU - Brefeld, Ulf

AU - Sonnenburg, Sören

AU - Zien, Alexander

PY - 2011/3

Y1 - 2011/3

N2 - Learning linear combinations of multiple kernels is an appealing strategy when the right choice of features is unknown. Previous approaches to multiple kernel learning (MKL) promote sparse kernel combinations to support interpretability and scalability. Unfortunately, this ℓ1-norm MKL is rarely observed to outperform trivial baselines in practical applications. To allow for robust kernel mixtures that generalize well, we extend MKL to arbitrary norms. We devise new insights on the connection between several existing MKL formulations and develop two efficient interleaved optimization strategies for arbitrary norms, that is ℓp-norms with p ≥ 1. This interleaved optimization is much faster than the commonly used wrapper approaches, as demonstrated on several data sets. A theoretical analysis and an experiment on controlled artificial data shed light on the appropriateness of sparse, non-sparse and ℓ∞-norm MKL in various scenarios. Importantly, empirical applications of ℓp-norm MKL to three real-world problems from computational biology show that non-sparse MKL achieves accuracies that surpass the state-of-the-art. Data sets, source code to reproduce the experiments, implementations of the algorithms, and further information are available at http://doc.ml.tu-berlin.de/nonsparse-mkl/.

AB - Learning linear combinations of multiple kernels is an appealing strategy when the right choice of features is unknown. Previous approaches to multiple kernel learning (MKL) promote sparse kernel combinations to support interpretability and scalability. Unfortunately, this ℓ1-norm MKL is rarely observed to outperform trivial baselines in practical applications. To allow for robust kernel mixtures that generalize well, we extend MKL to arbitrary norms. We devise new insights on the connection between several existing MKL formulations and develop two efficient interleaved optimization strategies for arbitrary norms, that is ℓp-norms with p ≥ 1. This interleaved optimization is much faster than the commonly used wrapper approaches, as demonstrated on several data sets. A theoretical analysis and an experiment on controlled artificial data shed light on the appropriateness of sparse, non-sparse and ℓ∞-norm MKL in various scenarios. Importantly, empirical applications of ℓp-norm MKL to three real-world problems from computational biology show that non-sparse MKL achieves accuracies that surpass the state-of-the-art. Data sets, source code to reproduce the experiments, implementations of the algorithms, and further information are available at http://doc.ml.tu-berlin.de/nonsparse-mkl/.

KW - Bioinformatics

KW - Block coordinate descent

KW - Convex conjugate

KW - Generalization bounds

KW - Large scale optimization

KW - Learning kernels

KW - Multiple kernel learning

KW - Non-sparse

KW - Rademacher complexity

KW - Support vector machine

KW - Informatics

UR - http://www.scopus.com/inward/record.url?scp=79955848223&partnerID=8YFLogxK

M3 - Journal articles

AN - SCOPUS:79955848223

VL - 12

SP - 953

EP - 997

JO - Journal of Machine Learning Research

JF - Journal of Machine Learning Research

SN - 1532-4435

ER -