ℓp-norm multiple kernel learning
Research output: Journal contributions › Journal articles › Research › peer-review
In: Journal of Machine Learning Research, Vol. 12, 03.2011, p. 953-997.
RIS
TY - JOUR
T1 - ℓp-norm multiple kernel learning
AU - Kloft, Marius
AU - Brefeld, Ulf
AU - Sonnenburg, Sören
AU - Zien, Alexander
PY - 2011/3
Y1 - 2011/3
N2 - Learning linear combinations of multiple kernels is an appealing strategy when the right choice of features is unknown. Previous approaches to multiple kernel learning (MKL) promote sparse kernel combinations to support interpretability and scalability. Unfortunately, this ℓ1-norm MKL is rarely observed to outperform trivial baselines in practical applications. To allow for robust kernel mixtures that generalize well, we extend MKL to arbitrary norms. We devise new insights on the connection between several existing MKL formulations and develop two efficient interleaved optimization strategies for arbitrary norms, that is ℓp-norms with p ≥ 1. This interleaved optimization is much faster than the commonly used wrapper approaches, as demonstrated on several data sets. A theoretical analysis and an experiment on controlled artificial data shed light on the appropriateness of sparse, non-sparse and ℓ∞-norm MKL in various scenarios. Importantly, empirical applications of ℓp-norm MKL to three real-world problems from computational biology show that non-sparse MKL achieves accuracies that surpass the state-of-the-art. Data sets, source code to reproduce the experiments, implementations of the algorithms, and further information are available at http://doc.ml.tu-berlin.de/nonsparse-mkl/.
KW - Bioinformatics
KW - Block coordinate descent
KW - Convex conjugate
KW - Generalization bounds
KW - Large scale optimization
KW - Learning kernels
KW - Multiple kernel learning
KW - Non-sparse
KW - Rademacher complexity
KW - Support vector machine
KW - Informatics
UR - http://www.scopus.com/inward/record.url?scp=79955848223&partnerID=8YFLogxK
M3 - Journal articles
AN - SCOPUS:79955848223
VL - 12
SP - 953
EP - 997
JO - Journal of Machine Learning Research
JF - Journal of Machine Learning Research
SN - 1532-4435
ER -