We present a novel multilabel/ranking algorithm that works in partial-information settings. The algorithm is based on second-order descent methods and relies on upper-confidence bounds to trade off exploration and exploitation. We analyze this algorithm in a partially adversarial setting, where covariates can be adversarial but multilabel probabilities are governed by (generalized) linear models. We show O(T^{1/2} log T) regret bounds, which improve in several ways on existing results. We test the effectiveness of our upper-confidence scheme by contrasting it against full-information baselines on diverse real-world multilabel data sets, often obtaining comparable performance.
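To make the upper-confidence idea concrete, here is a minimal sketch of how a learner might rank labels when it only observes feedback on the labels it outputs. It is not the algorithm from the paper: it swaps in per-label ridge regression (maintained with the Sherman-Morrison formula) for the second-order descent update, and it omits the generalized linear link function. The class name `UCBMultilabelSketch` and the parameters `alpha` and `threshold` are hypothetical, chosen only for illustration.

```python
import math


def matvec(M, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * vj for m, vj in zip(row, v)) for row in M]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


class UCBMultilabelSketch:
    """Illustrative only: ridge regression per label plus a confidence width.

    Assumption: this stands in for the paper's second-order descent update
    and ignores the generalized linear link function.
    """

    def __init__(self, dim, n_labels, alpha=1.0, threshold=0.5):
        self.alpha = alpha          # hypothetical exploration weight
        self.threshold = threshold  # hypothetical cut-off for outputting a label
        # One inverse correlation matrix (starting at identity) per label.
        self.A_inv = [
            [[1.0 if r == c else 0.0 for c in range(dim)] for r in range(dim)]
            for _ in range(n_labels)
        ]
        self.b = [[0.0] * dim for _ in range(n_labels)]

    def rank(self, x):
        """Return labels whose upper-confidence score exceeds the threshold,
        ordered by decreasing score."""
        scored = []
        for label, (A_inv, b) in enumerate(zip(self.A_inv, self.b)):
            estimate = dot(matvec(A_inv, b), x)                  # exploitation
            width = self.alpha * math.sqrt(dot(x, matvec(A_inv, x)))  # exploration
            ucb = estimate + width
            if ucb > self.threshold:
                scored.append((ucb, label))
        scored.sort(key=lambda pair: -pair[0])
        return [label for _, label in scored]

    def update(self, x, label, y):
        """Bandit feedback: update only a label that was actually output.
        y is 1.0 if the label was relevant, 0.0 otherwise."""
        A_inv = self.A_inv[label]
        Ax = matvec(A_inv, x)
        denom = 1.0 + dot(x, Ax)
        # Sherman-Morrison rank-one update of the inverse matrix.
        for r in range(len(x)):
            for c in range(len(x)):
                A_inv[r][c] -= Ax[r] * Ax[c] / denom
        self.b[label] = [bi + y * xi for bi, xi in zip(self.b[label], x)]
```

In this sketch, a label that has rarely been tried keeps a wide confidence interval and so stays in the ranking (exploration), while a label repeatedly observed to be irrelevant sees its interval shrink and eventually drops out (exploitation).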
|Publication date:||2014|
|Title:||On multilabel classification and ranking with bandit feedback|
|Journal:||JOURNAL OF MACHINE LEARNING RESEARCH|
|ISI identifier:||WOS:000344638400003|
|Scopus identifier:||2-s2.0-84907374351|
|Keywords:||Contextual bandits; Generalized linear; Online learning; Ranking; Regret bounds; Structured prediction|
|Publication type:||Journal article|