DSpace@MIT


Optimization Techniques for Trustworthy 3D Object Understanding

Author(s)
Shaikewitz, Lorenzo Franceschini
Advisor
Carlone, Luca
Terms of use
In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
Autonomous machines require reliable 3D object understanding to interpret and interact with their environment. In this thesis, we consider two tightly coupled 3D object understanding problems. Shape estimation seeks a consistent 3D model of an object given sensor data and some set of priors. Pose estimation seeks an estimate of the object’s position and orientation relative to an invariant shape frame. In general, these problems are non-convex and thus difficult to solve. We present algorithms that nonetheless solve shape and pose estimation efficiently and with assurances in terms of optimality, uncertainty, or latency.
We begin in the multi-frame tracking setting, where we propose the certifiably optimal estimator CAST⋆ for simultaneous shape estimation and object tracking. CAST⋆ uses 3D keypoint measurements extracted from an RGB-D image sequence and phrases the estimation as fixed-lag smoothing. Temporal constraints enforce rigidity and continuous motion. Despite the non-convexity of this problem, we solve it to certifiable optimality using a small-size semidefinite relaxation. We also present a compatibility-based outlier rejection scheme, and evaluate the proposed approach on synthetic and real data.
Next, we focus on estimating the pose of an object given its shape and a single RGB image (no depth). Assuming only bounded noise on 2D keypoint measurements (e.g., from conformal prediction), we derive an estimator for the most likely object pose which uses a semidefinite relaxation to initialize a local solver. We pair this with an efficient uncertainty estimation routine which relies on a generalization of the S-Lemma to propagate keypoint uncertainty to high-probability translation and rotation bounds. The high-probability bounds hold regardless of the accuracy of the pose estimate, and are reasonably tight when tested on the LineMOD-Occluded dataset.
Lastly, we propose a sub-millisecond solution to simultaneous estimation of object shape and pose from a single RGB-D image. Our approach converts the first-order optimality conditions of the non-convex optimization problem to a nonlinear eigenproblem in the quaternion representation of orientation. We use self-consistent field iteration to efficiently arrive at a local stationary point, finding solutions more than an order of magnitude faster than Gauss-Newton or on-manifold local solvers on synthetically generated data.
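As a rough illustration of the self-consistent field (SCF) idea mentioned in the abstract, the sketch below solves a toy nonlinear eigenproblem H(q) q = λ q over unit 4-vectors (the same dimension as a unit quaternion). It is not the thesis's formulation: the state-dependent matrix H(q) = A + alpha * diag(q_i^2), the matrix A_EXAMPLE, the parameter alpha, and all function names are assumptions chosen only so that the example is small and runs with the Python standard library. Each SCF step freezes H at the current iterate, solves the resulting ordinary eigenproblem (here by power iteration), and repeats until the eigenvector stops changing.

```python
import math

# Hypothetical symmetric positive-definite matrix, chosen only for illustration.
A_EXAMPLE = [
    [4.0, 1.0, 0.0, 0.0],
    [1.0, 3.0, 1.0, 0.0],
    [0.0, 1.0, 2.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def dominant_eigvec(M, v0, iters=500):
    """Power iteration: unit eigenvector of the largest-magnitude eigenvalue of M."""
    v = normalize(v0)
    for _ in range(iters):
        v = normalize(matvec(M, v))
    return v

def H(q, A, alpha):
    """Toy state-dependent matrix H(q) = A + alpha * diag(q_i^2) (an assumption)."""
    n = len(q)
    return [[A[i][j] + (alpha * q[i] ** 2 if i == j else 0.0) for j in range(n)]
            for i in range(n)]

def scf(A, alpha, q0, tol=1e-12, max_iter=100):
    """SCF iteration: repeatedly replace q by the dominant eigenvector of H(q)."""
    q = normalize(q0)
    for k in range(max_iter):
        q_new = dominant_eigvec(H(q, A, alpha), q)
        # q and -q are the same eigenvector; keep the sign consistent across steps.
        if dot(q_new, q) < 0:
            q_new = [-x for x in q_new]
        change = max(abs(a - b) for a, b in zip(q_new, q))
        q = q_new
        if change < tol:
            break
    return q, k + 1

def residual(q, A, alpha):
    """Return (lambda, max-norm of H(q) q - lambda q) with lambda = q^T H(q) q."""
    Hq = matvec(H(q, A, alpha), q)
    lam = dot(q, Hq)
    return lam, max(abs(h - lam * x) for h, x in zip(Hq, q))

if __name__ == "__main__":
    q, iters = scf(A_EXAMPLE, 0.2, [1.0, 0.0, 0.0, 0.0])
    lam, res = residual(q, A_EXAMPLE, 0.2)
    print("iterations:", iters, "lambda:", lam, "residual:", res)
```

In this toy setting the nonlinearity is weak relative to the spectral gap of A_EXAMPLE, which is why the plain fixed-point iteration settles quickly; SCF is not guaranteed to converge for stronger nonlinearities, and the result is only a stationary point of the frozen-matrix iteration.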
Date issued
2025-05
URI
https://hdl.handle.net/1721.1/163014
Department
Massachusetts Institute of Technology. Department of Aeronautics and Astronautics
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses
