Show simple item record

dc.contributor.advisor  Perakis, Georgia
dc.contributor.author  Tsiourvas, Asterios
dc.date.accessioned  2026-01-12T19:41:24Z
dc.date.available  2026-01-12T19:41:24Z
dc.date.issued  2025-09
dc.date.submitted  2025-08-21T14:15:15.444Z
dc.identifier.uri  https://hdl.handle.net/1721.1/164513
dc.description.abstract  In recent years, deep learning has emerged as a powerful tool for data-driven decision-making. However, its adoption in high-stakes applications is often constrained by challenges related to interpretability, fairness, and generalization in structured or complex environments. This thesis develops new optimization methodologies to enhance the realism, structure-awareness, and interpretability of deep learning models in decision-making tasks. We begin, in Chapter 2, by addressing the challenge of optimizing trained neural networks for data-driven decision-making. Although neural networks can encode rich representations of preferences or outcomes, directly optimizing their outputs can be computationally intractable and may produce unrealistic prescriptions. We introduce scalable algorithms that leverage the piecewise-linear structure of ReLU networks, reducing the original hard-to-solve mixed-integer program to a collection of tractable linear programs. To ensure realism, we introduce constraints that restrict decisions to lie on the data manifold. We then extend this framework to any differentiable neural network or MIP-expressible model and show that it scales to networks with millions of parameters.
In Chapter 3, we focus on decision-making under observational data. First, we study personalized treatment recommendations under discrete treatments. We introduce the Prescriptive ReLU (P-ReLU) network, a piecewise-linear model that partitions the input space into polyhedral regions, assigns treatments uniformly within each region, and can be translated into an equivalent interpretable decision tree. We demonstrate that P-ReLU achieves strong prescriptive accuracy and easily accommodates structural and prescriptive constraints. Next, we consider the problem of large language model (LLM) routing, where a query must be dynamically routed to the best model under competing metrics such as accuracy and cost. We develop a causal, end-to-end approach that learns routing policies directly from logged observational data, directly minimizing decision-making regret. Finally, we tackle the problem of generating realistic, manifold-aligned counterfactual explanations. We present a MIP formulation that explicitly enforces manifold alignment by reformulating the highly nonlinear Local Outlier Factor (LOF) metric as a set of mixed-integer constraints. To address the computational challenge, we leverage the geometry of the network and propose an efficient decomposition scheme that reduces the initial hard-to-solve problem to a series of significantly smaller, easier-to-solve problems. We further extend this framework to any differentiable neural network or MIP-expressible machine learning model.
In Chapter 4, we focus on structured machine learning. We first address hierarchical time series forecasting, where predictions must be both accurate and consistent with the aggregation structure of the hierarchy. While prior methods rely on fixed projection matrices, we propose learning the optimal oblique projection directly from data. The proposed end-to-end approach jointly trains the forecasting model and the projection layer, significantly improving accuracy and coherence. Next, we study the problem of building a highly expressive, interpretable, and fair machine learning model. We propose Neural-Informed Decision Trees (NIDTs), a model that combines the predictive power of neural networks with the inherent interpretability of decision trees. NIDTs use axis-aligned splits on dataset features to form transparent decision paths and, at each leaf, apply a linear predictor based on both the original features and neural embeddings from a task-specific network to capture nonlinearities. To generate NIDTs, we develop a decomposition training scheme that supports direct integration of fairness constraints via a constrained convex optimization problem solved at each leaf.
Finally, in Chapter 5, we address fairness and efficiency in emergency department (ED) operations, where prolonged length of stay (LOS) has been linked to adverse outcomes such as increased mortality and a higher risk of hospital-acquired infections. We focus on the patient prioritization and placement aspects of ED operations to improve throughput and reduce wait times. We propose a novel MIP-based predictive-prescriptive framework that decomposes predicted LOS into actionable components, enabling a more granular and operationally meaningful model of ED dynamics. Fairness considerations are explicitly incorporated into the formulation, and a sampling-based solution approach addresses uncertainty. Our method increases ED throughput by 50–100% and reduces average wait time by 50–75%, depending on current utilization levels, while achieving near-optimal performance relative to a clairvoyant oracle. This work was conducted in collaboration with a major U.S. academic medical center. To facilitate practical implementation, we also design an interpretable metamodel that approximates the predictive-prescriptive algorithm with high fidelity. Together, these contributions provide a unified perspective on deep learning for reliable decision-making, grounded in optimization and encompassing interpretability, structure-awareness, and causal reasoning, well suited to high-stakes operational environments.
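A toy illustration of the linear-region idea behind Chapter 2 (this is a sketch of the general technique, not the thesis's actual algorithm): once a ReLU network's activation pattern is fixed, the network is affine on the corresponding polyhedral region, so optimizing its output over that region is a plain linear program. The one-hidden-layer network weights below are random placeholders, and `scipy.optimize.linprog` stands in for a full-scale solver.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

# A tiny "trained" one-hidden-layer ReLU network (placeholder weights).
W1, b1 = rng.normal(size=(4, 2)), rng.normal(size=4)   # hidden layer
w2, b2 = rng.normal(size=4), 0.0                       # scalar output

def forward(x):
    return w2 @ np.maximum(W1 @ x + b1, 0.0) + b2

# Pick an anchor point; its activation pattern fixes one linear region.
x0 = np.array([0.5, -0.3])
active = (W1 @ x0 + b1 > 0)            # which ReLUs are "on" at x0

# On this region the network is affine: f(x) = c @ x + const.
c = W1[active].T @ w2[active]
const = w2[active] @ b1[active] + b2

# Region constraints: W1[i] @ x + b1[i] >= 0 if active, <= 0 otherwise,
# rewritten in A_ub @ x <= b_ub form.
sign = np.where(active, -1.0, 1.0)
A_ub = sign[:, None] * W1
b_ub = -sign * b1

# Maximize f over the region (a box keeps the LP bounded).
res = linprog(-c, A_ub=A_ub, b_ub=b_ub, bounds=[(-1, 1)] * 2)
x_star = res.x

# The LP optimum agrees with the true network inside this region.
assert res.success
assert np.isclose(forward(x_star), c @ x_star + const)
```

Enumerating (or branching over) activation patterns with binaries recovers the global mixed-integer formulation; solving per-region LPs as above is what makes the piecewise-linear structure tractable.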
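Similarly, a minimal sketch of the hierarchical-forecasting reconciliation setting from Chapter 4, assuming a toy two-leaf hierarchy: classical approaches reconcile incoherent base forecasts with a fixed projection built from the summing matrix S, whereas the thesis learns the mapping (the G matrix below) end-to-end, making the projection oblique and data-driven. Only the fixed OLS baseline is shown here for concreteness.

```python
import numpy as np

# Toy 3-series hierarchy: total = A + B. S maps the bottom-level
# series to the full hierarchy (rows: total, A, B).
S = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

# Incoherent base forecasts for (total, A, B): total != A + B.
y_hat = np.array([10.0, 4.0, 5.0])

# Fixed OLS reconciliation: P = S (S^T S)^{-1} S^T. The thesis learns
# G jointly with the forecaster instead of fixing it analytically.
G_ols = np.linalg.solve(S.T @ S, S.T)   # (S^T S)^{-1} S^T
y_rec = S @ (G_ols @ y_hat)

# Reconciled forecasts are coherent: the total equals A + B.
assert np.isclose(y_rec[0], y_rec[1] + y_rec[2])
```

Replacing `G_ols` with a trainable matrix inside the forecasting model, as described in the abstract, lets accuracy and coherence be optimized jointly rather than in a separate post-processing step.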
dc.publisher  Massachusetts Institute of Technology
dc.rights  In Copyright - Educational Use Permitted
dc.rights  Copyright retained by author(s)
dc.rights.uri  https://rightsstatements.org/page/InC-EDU/1.0/
dc.title  Optimization in Deep Learning: Structured, Realistic and Interpretable Learning for Decision-Making
dc.type  Thesis
dc.description.degree  Ph.D.
dc.contributor.department  Massachusetts Institute of Technology. Operations Research Center
dc.contributor.department  Sloan School of Management
dc.identifier.orcid  https://orcid.org/0000-0002-2979-6300
mit.thesis.degree  Doctoral
thesis.degree.name  Doctor of Philosophy

