The multi-armed bandit (MAB) model is one of the most classical models for studying decision-making in an uncertain environment. In this model, a player chooses one of K possible arms of a bandit machine to play at each time step, and the chosen arm returns a random reward to the player, potentially drawn from a specific unknown distribution. The player's goal is to collect as much reward as possible during the process. Despite its simplicity, the MAB model offers an excellent playground for studying the trade-off between exploration and exploitation and for designing effective algorithms for sequential decision-making under uncertainty. Although many asymptotically optimal algorithms have been established, the finite-time behavior of the stochastic dynamics of the MAB model appears much more challenging to analyze because the decisions made and the rewards collected are intertwined. In this paper, we employ techniques from statistical physics to analyze the MAB model, which facilitates the characterization of the distribution of cumulative regret, the central quantity of interest for an MAB algorithm, at short finite times, as well as the intricate dynamical behavior of the model. Our analytical results, in good agreement with simulations, point to the emergence of an interesting multimodal regret distribution, with large regrets resulting from excess exploitation of sub-optimal arms due to an initially unlucky output from the optimal one.
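The setup described in the abstract can be made concrete with a small simulation. The sketch below is illustrative only: the abstract does not say which algorithm or reward distribution the paper analyzes, so the choice of UCB1, unit-variance Gaussian rewards, the arm means, the horizon, and the names `run_ucb1` and `regret_samples` are all assumptions. It collects the cumulative regret of many independent short runs, which is the kind of empirical distribution one would compare against an analytical prediction.

```python
import math
import random


def run_ucb1(means, horizon, rng):
    """Play one episode of UCB1 (an assumed algorithm, not necessarily the
    paper's) on Gaussian arms and return the cumulative pseudo-regret."""
    k = len(means)
    counts = [0] * k      # number of pulls per arm
    sums = [0.0] * k      # total observed reward per arm
    best = max(means)     # mean reward of the optimal arm
    regret = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1   # pull every arm once to initialise the estimates
        else:
            # Pick the arm with the largest upper confidence bound.
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = rng.gauss(means[arm], 1.0)  # assumed unit-variance rewards
        counts[arm] += 1
        sums[arm] += reward
        regret += best - means[arm]          # expected loss of this pull
    return regret


def regret_samples(means, horizon, n_runs, seed=0):
    """Cumulative regret of n_runs independent episodes (seeded, repeatable)."""
    rng = random.Random(seed)
    return [run_ucb1(means, horizon, rng) for _ in range(n_runs)]


# Hypothetical two-armed instance with a short horizon.
samples = regret_samples(means=[0.5, 0.0], horizon=50, n_runs=2000)
```

A histogram of `samples` gives an empirical finite-time regret distribution for this particular assumed setup. In the two-armed case each value is simply the gap between the means multiplied by the number of pulls of the sub-optimal arm.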


Source
http://dx.doi.org/10.1063/5.0120076

