dc.contributor.advisor | Guttag, John V. | |
dc.contributor.advisor | Matton, Katherine | |
dc.contributor.advisor | Abulnaga, S. Mazdak | |
dc.contributor.author | Balaji, Purvaja | |
dc.date.accessioned | 2025-04-14T14:08:59Z | |
dc.date.available | 2025-04-14T14:08:59Z | |
dc.date.issued | 2025-02 | |
dc.date.submitted | 2025-04-03T14:06:11.325Z | |
dc.identifier.uri | https://hdl.handle.net/1721.1/159150 | |
dc.description.abstract | Phonotraumatic vocal hyperfunction (PVH) is a vocal disorder characterized by damaged vocal folds from excessive or abusive voice use. Clinical assessment of PVH relies on time-consuming videostroboscopy examination, which poses challenges for large-scale clinical studies. We address the need for more efficient clinical assessment tools by proposing deep learning approaches for automatically detecting PVH severity from stroboscopic images. One of the main challenges in building deep learning models for this task is a lack of labeled stroboscopy data. Motivated by this challenge, we explore two approaches: direct classification and segmentation-then-classification. In the segmentation-then-classification approach, we first train a model to segment the glottis, a clinically relevant part of the vocal fold anatomy. Then, we use the predicted segmentation along with the stroboscopic image as inputs into a classification model. This approach helps to guide the model towards key anatomical features. We achieve up to 0.53 accuracy in four-class PVH severity prediction with the direct classification approach. Incorporating glottal segmentations improves the accuracy to 0.64, underscoring the value of providing anatomically informed segmentations when assessing PVH severity. By creating an automated PVH severity tool, our work has the potential to help clinicians more efficiently monitor disease progression and to facilitate large-scale screening, thereby contributing to improved patient care. | |
dc.publisher | Massachusetts Institute of Technology | |
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) | |
dc.rights | Copyright retained by author(s) | |
dc.rights.uri | https://creativecommons.org/licenses/by-nc-nd/4.0/ | |
dc.title | Deep Learning-Based Classification of Phonotraumatic Vocal Hyperfunction Severity from Stroboscopic Images | |
dc.type | Thesis | |
dc.description.degree | M.Eng. | |
dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | |
mit.thesis.degree | Master | |
thesis.degree.name | Master of Engineering in Electrical Engineering and Computer Science | |