There have been many recent advances in machine learning, resulting in models that have had major impact across a variety of disciplines. Some of the best-performing models are black boxes, which are not directly interpretable by humans. However, in applications such as health care it is vital to use interpretable models so that we can understand why a model makes its predictions and ensure that using it to inform decision making will not unexpectedly harm the people it is meant to help. This raises the question of whether a trade-off between predictive accuracy and interpretability exists, and, if so, how we can improve the performance of interpretable models to reduce it.
In the first chapter, we show that optimal decision trees are equivalent in modeling power to neural networks. Specifically, given a neural network (feedforward, convolutional, or recurrent), we construct a decision tree with hyperplane splits that has identical in-sample performance. Building on previous research showing that, given a decision tree, one can construct a feedforward neural network with the same in-sample performance, we prove that the two methods are equivalent. We further compare decision trees and neural networks empirically on data from the UCI Machine Learning Repository and find that they have comparable performance.
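To make the network-to-tree direction concrete, the following is a minimal sketch for a one-hidden-layer ReLU network; it is a toy illustration under assumed details, not the construction proved in the chapter. Branching on the sign of each hidden unit's pre-activation gives a tree with hyperplane splits, and on each leaf the ReLU activation pattern is fixed, so the network is affine there and its prediction is reproduced exactly.

```python
# Toy sketch (not the thesis's construction): a one-hidden-layer ReLU
# network evaluated as a decision tree with hyperplane splits. Each
# split tests the sign of one hidden unit's pre-activation; at a leaf
# the activation pattern is fixed, so the network is a fixed affine map.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 2)), rng.normal(size=3)  # 3 hidden ReLU units, 2-d input
w2, b2 = rng.normal(size=3), rng.normal()             # linear output layer

def network(x):
    return w2 @ np.maximum(W1 @ x + b1, 0.0) + b2

def tree_predict(x, unit=0, pattern=()):
    """Descend a depth-3 tree whose splits are the hyperplanes W1[unit] @ x + b1[unit] = 0."""
    if unit == len(b1):
        # Leaf: the ReLU pattern is fixed, so zero out inactive units and apply the affine map.
        mask = np.array(pattern, dtype=float)
        return (w2 * mask) @ (W1 @ x + b1) + b2
    right = W1[unit] @ x + b1[unit] > 0.0             # hyperplane split
    return tree_predict(x, unit + 1, pattern + (right,))

x = rng.normal(size=2)
assert np.isclose(network(x), tree_predict(x))        # identical predictions
```

The chapter's result covers feedforward, convolutional, and recurrent networks; this toy version only illustrates why hyperplane (rather than axis-aligned) splits are the natural split type for matching a network exactly.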
In the second chapter, we propose a new machine learning method called Optimal Predictive Clustering (OPC). The method uses optimization with strong warm starts to simultaneously cluster data points and learn cluster-specific logistic regression models, and it is designed to combine strong predictive performance, scalability, and interpretability. We then empirically compare OPC to a wide variety of other methods, such as Optimal Regression Trees with Linear Predictors (ORT-L) and XGBoost. We find that our method performs on par with cutting-edge interpretable methods, and that adding it to an ensemble of methods achieves the best out-of-sample performance across all models.
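As a rough illustration of the predictive-clustering structure (a simple alternating heuristic under assumed details, not the thesis's optimization formulation or warm-start scheme), one can iterate between assigning each point to the cluster whose logistic model fits it best and refitting each cluster's model:

```python
# Heuristic sketch of predictive clustering: a partition of the data
# plus one logistic regression per cluster, fit by alternation.
# This approximates the idea only; OPC as described solves the joint
# problem via optimization with strong warm starts.
import numpy as np
from sklearn.linear_model import LogisticRegression

def predictive_clustering(X, y, k=2, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    assign = rng.integers(k, size=len(X))   # crude warm start: random clusters
    models = [None] * k
    for _ in range(iters):
        for c in range(k):                  # refit each cluster-specific model
            idx = assign == c
            if idx.sum() > 1 and np.unique(y[idx]).size > 1:
                models[c] = LogisticRegression().fit(X[idx], y[idx])
        # per-point logistic loss under each cluster's model (labels assumed in {0, 1})
        losses = np.column_stack([
            -m.predict_log_proba(X)[np.arange(len(X)), y] if m is not None
            else np.full(len(X), np.inf)
            for m in models])
        new = losses.argmin(axis=1)         # reassign points to best-fitting cluster
        if np.array_equal(new, assign):
            break
        assign = new
    return models, assign

# Toy usage: two regimes with opposite decision rules.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 2))
y = np.where(X[:, 0] > 0, X[:, 1] > 0, X[:, 1] < 0).astype(int)
models, assign = predictive_clustering(X, y, k=2)
```

The alternation above can stall in poor local optima, which is precisely the failure mode that a joint optimization with strong warm starts, as the abstract describes, is meant to avoid.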
In the third chapter, we predict one-year transplant outcomes on lung, liver, and kidney data to investigate whether predicted post-transplant outcomes should be included in the allocation systems for organs other than lungs. We find that the models do not differentiate between one-year graft survival and failure effectively enough to be useful components of the organ allocation process. We then theorize about possible reasons for this failure, including the transplant procedure itself having a large effect on the one-year graft outcome and the potential need for additional data, such as genetic information.