DSpace@MIT


Probabilistic modeling and Bayesian inference via triangular transport

Author(s)
Baptista, Ricardo Miguel
Download: Thesis PDF (16.67 MB)
Advisor
Marzouk, Youssef
Terms of use
In Copyright - Educational Use Permitted. Copyright MIT. http://rightsstatements.org/page/InC-EDU/1.0/
Abstract
Probabilistic modeling and Bayesian inference in non-Gaussian settings are pervasive challenges for science and engineering applications. Transportation of measure provides a principled framework for treating non-Gaussianity and for generalizing many methods that rest on Gaussian assumptions. A transport map deterministically couples a simple reference distribution (e.g., a standard Gaussian) to a complex target distribution via a bijective transformation. Finding such a map enables efficient sampling from the target distribution and immediate access to its density. Triangular maps comprise a general class of transports that are attractive from the perspectives of analysis, modeling, and computation. This thesis: (1) develops a general representation for monotone triangular maps, and adaptive methodologies for estimating such maps (and their associated pushforward densities) from samples; (2) uses triangular maps and their compositions to perform Bayesian computation in likelihood-free settings, including new ensemble methods for nonlinear filtering; and (3) proposes parameter and data dimension reduction techniques with error guarantees for high-dimensional inverse problems.

The first part of the thesis explores the use of triangular transport maps for density estimation and for learning probabilistic graphical models. To construct triangular maps, we represent monotone functions as smooth transformations of unconstrained (non-monotone) functions. We show how certain structural choices for these transformations lead to smooth optimization problems with no spurious local minima, i.e., where all local minima are global minima. Given samples, we then propose an adaptive algorithm that estimates maps with sparse variable dependence. We demonstrate how this framework enables joint and conditional density estimation across a range of sample sizes, and how it can explicitly learn the Markov properties of a continuous non-Gaussian distribution.
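As a rough illustration of the monotone representation described above, the following sketch builds a function that is monotone in its last argument by integrating a positive "rectifier" of an unconstrained function's derivative. The base function `f`, the softplus rectifier, and the midpoint quadrature here are simplified, illustrative choices, not the thesis's exact construction.

```python
import numpy as np

def softplus(z):
    # Positive rectifier mapping any real value to (0, inf),
    # which guarantees a strictly positive integrand below.
    return np.log1p(np.exp(-np.abs(z))) + np.maximum(z, 0.0)

def monotone_component(x_prev, x_k, f, n_quad=64):
    """Monotone-in-x_k component built from an unconstrained function f:
    S(x_prev, x_k) = f(x_prev, 0) + integral_0^{x_k} g(df/dt(x_prev, t)) dt,
    approximated by the midpoint rule. Because g = softplus > 0, the
    derivative of S with respect to x_k is positive, so S is increasing."""
    t = (np.arange(n_quad) + 0.5) / n_quad * x_k  # midpoint nodes on [0, x_k]
    eps = 1e-4
    # Finite-difference estimate of df/dt at the quadrature nodes.
    dfdt = (f(x_prev, t + eps) - f(x_prev, t - eps)) / (2 * eps)
    return f(x_prev, 0.0) + np.sum(softplus(dfdt)) * (x_k / n_quad)

# An unconstrained (non-monotone) base function, purely for illustration:
f = lambda x_prev, t: np.sin(x_prev) * t + 0.1 * t**3 - t

# The resulting component is increasing in its last argument:
vals = [monotone_component(0.7, xk, f) for xk in (-1.0, 0.0, 1.0, 2.0)]
assert all(a < b for a, b in zip(vals, vals[1:]))
```

Because the monotonicity constraint is absorbed into the representation, the underlying `f` can be optimized without constraints, which is what makes smooth, well-behaved training objectives possible.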
To this end, we introduce a consistent estimator for the Markov structure based on integrated Hessian information from the log-density. We then propose an iterative algorithm for learning sparse graphical models by exploiting a corresponding sparsity structure in triangular maps. A core advantage of triangular maps is that their components expose conditionals of the target distribution. Hence, learning a map that depends on both parameters and observations enables efficient sampling from the posterior distribution in a Bayesian inference problem. Crucially, this can be done without evaluating the likelihood function, which is often inaccessible or computationally prohibitive in scientific applications (as with forward models given by stochastic partial differential equations, which we consider here).

In the second part of this thesis, we propose and analyze a specific composition of transport maps that directly transforms prior samples into posterior samples. We show that this approach, termed the stochastic map (SM) algorithm, improves over other transport-based methods for conditional sampling by reducing the bias and variance of the associated posterior approximation. We then use the SM algorithm to sequentially estimate the state of a chaotic dynamical system given online observations, a nonlinear filtering problem known in geophysical applications as “data assimilation” (DA). We show that when the SM algorithm is restricted to linear maps, it reduces to the ensemble Kalman filter (EnKF), a workhorse algorithm for DA; with nonlinear updates, however, the SM algorithm substantially improves on the performance of the EnKF in challenging regimes.

Finally, we extend the use of transport for high-dimensional inference problems by developing a joint dimension reduction strategy for parameters and observations.
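To make the linear-map special case concrete, here is a minimal stochastic EnKF analysis step written as a linear update applied to each ensemble member. The toy linear-Gaussian example and all variable names are illustrative, not the thesis's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def enkf_update(X, Y, y_obs):
    """One stochastic EnKF analysis step, i.e., a linear map applied
    to every ensemble member.
    X: (n, d) forecast states; Y: (n, m) perturbed predicted
    observations; y_obs: (m,) observed data."""
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    C_xy = Xc.T @ Yc / (n - 1)       # state-observation sample covariance
    C_yy = Yc.T @ Yc / (n - 1)       # observation sample covariance
    K = C_xy @ np.linalg.inv(C_yy)   # Kalman gain
    return X + (y_obs - Y) @ K.T     # move each member toward the data

# Toy scalar example: x ~ N(0, 1), y = x + noise with noise ~ N(0, 0.25).
# For an observation y* = 1, the exact posterior mean is 1 / (1 + 0.25) = 0.8.
n = 5000
x = rng.normal(size=(n, 1))
y = x + 0.5 * rng.normal(size=(n, 1))
x_post = enkf_update(x, y, np.array([1.0]))
```

The analysis ensemble mean should land close to the exact posterior mean of 0.8. Replacing this linear update with a learned nonlinear (triangular) map is what distinguishes the SM algorithm from the EnKF.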
We identify relevant low-dimensional projections of these variables by minimizing an information-theoretic upper bound on the error in the posterior approximation. We show that this approach reduces to canonical correlation analysis in the linear-Gaussian setting, while outperforming standard dimension reduction strategies in a variety of nonlinear and non-Gaussian inference problems.
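In the linear-Gaussian case mentioned above, the reduction coincides with canonical correlation analysis (CCA). A standard sample-based CCA computation, via an SVD of the whitened cross-covariance, can be sketched as follows; the toy data and function names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def cca_directions(X, Y):
    """Canonical directions between samples X (n, dx) and Y (n, dy),
    computed from an SVD of the whitened cross-covariance matrix."""
    n = X.shape[0]
    Xc, Yc = X - X.mean(0), Y - Y.mean(0)
    Cxx, Cyy = Xc.T @ Xc / (n - 1), Yc.T @ Yc / (n - 1)
    Cxy = Xc.T @ Yc / (n - 1)
    Wx = np.linalg.inv(np.linalg.cholesky(Cxx)).T  # whitening: Wx' Cxx Wx = I
    Wy = np.linalg.inv(np.linalg.cholesky(Cyy)).T
    U, s, Vt = np.linalg.svd(Wx.T @ Cxy @ Wy)
    # Columns of A, B project X, Y onto maximally correlated directions;
    # s holds the canonical correlations in decreasing order.
    return Wx @ U, Wy @ Vt.T, s

# Toy problem: only the first coordinate of x is informed by the data y.
n, dx, dy = 4000, 5, 3
x = rng.normal(size=(n, dx))
y = np.outer(x[:, 0], np.ones(dy)) + 0.1 * rng.normal(size=(n, dy))
A, B, corrs = cca_directions(x, y)
```

Here the leading canonical correlation is near one while the rest are near zero, so a single projection of the parameters and of the data captures essentially all of the dependence; truncating to the dominant directions is the linear analogue of the dimension reduction the thesis develops for nonlinear, non-Gaussian problems.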
Date issued
2022-05
URI
https://hdl.handle.net/1721.1/145049
Department
Massachusetts Institute of Technology. Department of Aeronautics and Astronautics
Publisher
Massachusetts Institute of Technology

Collections
  • Computational Science & Engineering Doctoral Theses (CSE PhD & Dept-CSE PhD)
  • Doctoral Theses
