
Instituto de Ingeniería Matemática y Computacional

Facultad de Matemáticas - Escuela de Ingeniería

News

The Instituto de Ingeniería Matemática y Computacional (IMC) cordially greets you and invites you to this week's seminar.

Nishant Mehta, Department of Computer Science, University of Victoria.

Wednesday, April 17, 2024, 1:40 p.m. (In person in the auditorium of Edificio San Agustín; a Zoom link is available upon email request.)

ABSTRACT

The sleeping bandits problem is a variant of the classical multi-armed bandit problem. In each of a sequence of rounds: an adversary selects a set of arms (actions) which are available (unavailable arms are "asleep"), the learning algorithm (Learner) then pulls an arm, and finally the adversary sets the loss of each available arm. Learner suffers the loss of the selected arm and observes only that arm's loss. A standard performance measure for this type of problem with changing action sets is the per-action regret: the learning algorithm wishes, for all arms simultaneously, to have cumulative loss not much larger than the cumulative loss of the arm when considering only those rounds in which the arm was available. For this problem, we present the first algorithms which enjoy low per-action regret against reactive (i.e., adaptive) adversaries; moreover, the regret guarantees are near-optimal, unlike previous results which were far from optimal even for oblivious adversaries. Along the way, we will review the simpler, full-information version of the problem, known as sleeping experts. Time permitting, we will also mention various extensions of our results, including (i) new results for bandits with side information from sleeping experts; (ii) guarantees for the adaptive regret and tracking regret for standard (non-sleeping) bandits. This is joint work with my PhD student Quan Nguyen, who should be credited for nearly all the results in this work.
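For readers unfamiliar with the setting, the following is a minimal, purely illustrative Python sketch of the interaction protocol and the per-action regret bookkeeping described in the abstract. The random (oblivious) environment and the uniform-random learner are placeholder assumptions for the sake of a runnable example; they are not the near-optimal algorithm presented in the talk.

```python
import random

def sleeping_bandit_demo(num_arms=3, num_rounds=1000, seed=0):
    """Toy simulation of the sleeping-bandits protocol.

    Illustrative only: the adversary here is a simple random one and the
    learner pulls uniformly among the awake arms, not the algorithm from
    the talk.
    """
    rng = random.Random(seed)
    learner_loss_on = [0.0] * num_arms  # learner's loss, summed over rounds where arm a was awake
    arm_loss = [0.0] * num_arms         # arm a's own loss over those same rounds

    for _ in range(num_rounds):
        # Adversary chooses which arms are available ("awake") this round.
        awake = [a for a in range(num_arms) if rng.random() < 0.7]
        if not awake:
            continue
        # Learner pulls one of the awake arms (here: uniformly at random).
        pulled = rng.choice(awake)
        # Adversary sets losses for the awake arms; learner observes only
        # the loss of the arm it pulled.
        losses = {a: rng.random() for a in awake}
        for a in awake:
            learner_loss_on[a] += losses[pulled]
            arm_loss[a] += losses[a]

    # Per-action regret: learner's cumulative loss minus the arm's cumulative
    # loss, restricted to rounds in which that arm was awake.
    return [learner_loss_on[a] - arm_loss[a] for a in range(num_arms)]

if __name__ == "__main__":
    print(sleeping_bandit_demo())
```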

 

Seminar: Nishant Mehta