MIT Libraries logoDSpace@MIT

MIT
View Item 
  • DSpace@MIT Home
  • MIT Libraries
  • MIT Theses
  • Doctoral Theses
  • View Item
  • DSpace@MIT Home
  • MIT Libraries
  • MIT Theses
  • Doctoral Theses
  • View Item
JavaScript is disabled for your browser. Some features of this site may not work without it.

On the Sample Efficiency of Data-Driven Decision Making

Author(s)
Qian, Jian
Thumbnail
DownloadThesis PDF (1.964Mb)
Advisor
Rakhlin, Alexander
Terms of use
In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/
Metadata
Show full item record
Abstract
This thesis studies the fundamental problem of decision making under uncertainty through the lens of statistical decision theory. We characterize the minimax risk, which captures the sample efficiency required for effective decision making across three key settings: offline estimation with batch data, online estimation with sequential data, and interactive decision making as exemplified by multi-armed bandits and reinforcement learning. The first part of the thesis develops novel algorithmic and theoretical tools to enhance decision making in these regimes and to bridge the gaps between them. We revisit logistic regression in the offline setting and provide guarantees without restrictive boundedness assumptions. We then propose meta-algorithms that reduce online estimation to offline estimation, enabling any offline estimator to be used effectively in online scenarios. Furthermore, we present general-purpose algorithms for interactive decision making problems by leveraging offline or online estimation techniques. The second part of the thesis introduces a unified approach to understanding the fundamental complexity of interactive decision making. We propose the Decision Making with Structured Observation (DMSO) framework, which encompasses bandits, reinforcement learning, and more general settings. Within this framework, we develop a new complexity measure—the Decision-Estimation Coefficient (DEC)—which captures both upper and lower bounds for minimax regret. DEC extends classical notions such as the modulus of continuity to interactive scenarios by introducing an adaptive variant of Le Cam’s method. Finally, we unify the three classical lower bound techniques—Le Cam’s method, Assouad’s lemma, and Fano’s inequality—through a generalized formulation that also incorporates the DEC, offering a comprehensive understanding of the minimax risk in decision making tasks.
Date issued
2025-05
URI
https://hdl.handle.net/1721.1/164162
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Doctoral Theses

Browse

All of DSpaceCommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsThis CollectionBy Issue DateAuthorsTitlesSubjects

My Account

Login

Statistics

OA StatisticsStatistics by CountryStatistics by Department
MIT Libraries
PrivacyPermissionsAccessibilityContact us
MIT
Content created by the MIT Libraries, CC BY-NC unless otherwise noted. Notify us about copyright concerns.