| dc.contributor.advisor | Simchi-Levi, David | |
| dc.contributor.author | Ai, Rui | |
| dc.date.accessioned | 2025-11-05T19:33:20Z | |
| dc.date.available | 2025-11-05T19:33:20Z | |
| dc.date.issued | 2025-05 | |
| dc.date.submitted | 2025-07-16T16:02:27.376Z | |
| dc.identifier.uri | https://hdl.handle.net/1721.1/163541 | |
| dc.description.abstract | The independence axiom (IA) proposed by von Neumann and Morgenstern [50] is the cornerstone of expected utility theory. However, some empirical experiments show that the IA is often violated in the real world. We propose a new kind of multi-armed bandit problem in which the expectation of outcomes may influence the agent’s utility, which we call expectation-dependent multi-armed bandits, and use it to rationalize the choices of agents in Machina’s paradox, where the IA does not hold. We design provably efficient algorithms with low minimax regret and show that their dependence on the time horizon T matches the corresponding regret lower bounds, establishing statistical optimality. Furthermore, as we are the first to consider bandits whose utility depends on both reality and expectation, this work provides a bridge between machine learning and economic behavior theory, shedding light on how to interpret some counterintuitive economic scenarios, such as the bounded rationality explored by Zhang et al. [54]. | |
| dc.publisher | Massachusetts Institute of Technology | |
| dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) | |
| dc.rights | Copyright retained by author(s) | |
| dc.rights.uri | https://creativecommons.org/licenses/by-nc-nd/4.0/ | |
| dc.title | Problem-Independent Regrets on Expectation-Dependent Multi-Armed Bandits | |
| dc.type | Thesis | |
| dc.description.degree | S.M. | |
| dc.contributor.department | Massachusetts Institute of Technology. Institute for Data, Systems, and Society | |
| dc.identifier.orcid | https://orcid.org/0009-0005-9262-0630 | |
| mit.thesis.degree | Master | |
| thesis.degree.name | Master of Science in Social and Engineering Systems | |