On multilabel classification and ranking with bandit feedback

We present a novel multilabel/ranking algorithm that works in partial-information settings. The algorithm is based on second-order descent methods and relies on upper-confidence bounds to trade off exploration and exploitation. We analyze this algorithm in a partially adversarial setting, where covariates can be adversarial but multilabel probabilities are governed by (generalized) linear models. We show O(T^{1/2} log T) regret bounds, which improve in several ways on existing results. We test the effectiveness of our upper-confidence scheme by contrasting it against full-information baselines on diverse real-world multilabel data sets, often obtaining comparable performance.
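To make the idea of combining a second-order update with upper-confidence bounds under bandit feedback more concrete, here is a minimal Python sketch. It is not the authors' algorithm: it assumes a plain linear model per label (ridge regression) rather than a generalized linear one, and the class name `UCBMultilabel`, the parameters `alpha` and `threshold`, and the rule "output every label whose optimistic score reaches the threshold" are all hypothetical choices made for illustration.

```python
import numpy as np


class UCBMultilabel:
    """Hypothetical sketch: one ridge-regression estimator per label plus a UCB bonus."""

    def __init__(self, n_labels, dim, alpha=1.0, threshold=0.5):
        self.alpha = alpha          # assumed exploration weight
        self.threshold = threshold  # assumed cut-off on the optimistic score
        # Per-label inverse correlation matrix (identity at start) and bias vector.
        self.A_inv = np.stack([np.eye(dim) for _ in range(n_labels)])
        self.b = np.zeros((n_labels, dim))

    def predict(self, x):
        # Estimated score and confidence width for each label.
        w = np.einsum("kij,kj->ki", self.A_inv, self.b)
        mean = w @ x
        width = np.sqrt(np.einsum("i,kij,j->k", x, self.A_inv, x))
        ucb = mean + self.alpha * width
        # Output the labels whose optimistic score reaches the threshold.
        return np.flatnonzero(ucb >= self.threshold)

    def update(self, x, chosen, feedback):
        # Bandit feedback: only the labels that were output reveal their relevance.
        for k, y in zip(chosen, feedback):
            Ax = self.A_inv[k] @ x
            # Sherman-Morrison rank-one update of the inverse (second-order step).
            self.A_inv[k] -= np.outer(Ax, Ax) / (1.0 + x @ Ax)
            self.b[k] += y * x
```

In this sketch, a label that has rarely been output keeps a wide confidence interval and is therefore still selected (exploration), while a label that has repeatedly proved irrelevant sees its interval shrink and eventually drops out of the prediction (exploitation).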

Gentile, Claudio; Orabona, F.

2014-01-01

Year: 2014
Journal: Journal of Machine Learning Research
Web of Science ID: WOS:000344638400003
Scopus ID: 2-s2.0-84907374351
Keywords: Contextual bandits; Generalized linear; Online learning; Ranking; Regret bounds; Structured prediction
All authors: Gentile, Claudio; Orabona, F.
Publication type: Journal article

Use this identifier to cite or link to this document: https://hdl.handle.net/11383/1959521