Show simple item record

dc.contributor.advisorGuttag, John V.
dc.contributor.advisorMatton, Katherine
dc.contributor.advisorAbulnaga, S. Mazdak
dc.contributor.authorBalaji, Purvaja
dc.date.accessioned2025-04-14T14:08:59Z
dc.date.available2025-04-14T14:08:59Z
dc.date.issued2025-02
dc.date.submitted2025-04-03T14:06:11.325Z
dc.identifier.urihttps://hdl.handle.net/1721.1/159150
dc.description.abstractPhonotraumatic vocal hyperfunction (PVH) is a vocal disorder characterized by damaged vocal folds from excessive or abusive voice use. Clinical assessment of PVH relies on timeconsuming videostroboscopy examination, which poses challenges for large-scale clinical studies. We address the need for more efficient clinical assessment tools by proposing deep learning approaches for automatically detecting PVH severity from stroboscopic images. One of the main challenges in building deep learning models for this task is a lack of labeled stroboscopy data. Motivated by this challenge, we explore two approaches: direct classification and segmentation-then-classification. In the segmentation-then-classification approach, we first train a model to segment the glottis, a clinically relevant part of the vocal fold anatomy. Then, we use the predicted segmentation along with the stroboscopic image as inputs into a classification model. This approach helps to guide the model towards key anatomical features. We achieve up to 0.53 accuracy in four-class PVH severity prediction with the direct classification approach. Incorporating glottal segmentations improves the accuracy to 0.64, underscoring the value of providing anatomically-informed segmentations when assessing PVH severity. By creating an automated PVH severity tool, our work has the potential to help clinicians more efficiently monitor disease progression and to facilitate large-scale screening, thereby contributing to improved patient care.
dc.publisherMassachusetts Institute of Technology
dc.rightsAttribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
dc.rightsCopyright retained by author(s)
dc.rights.urihttps://creativecommons.org/licenses/by-nc-nd/4.0/
dc.titleDeep Learning-Based Classification of Phonotraumatic Vocal Hyperfunction Severity from Stroboscopic Images
dc.typeThesis
dc.description.degreeM.Eng.
dc.contributor.departmentMassachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degreeMaster
thesis.degree.nameMaster of Engineering in Electrical Engineering and Computer Science


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record