Deep Learning-Based Classification of Phonotraumatic Vocal Hyperfunction Severity from Stroboscopic Images

Balaji, Purvaja

dc.contributor.advisor	Guttag, John V.
dc.contributor.advisor	Matton, Katherine
dc.contributor.advisor	Abulnaga, S. Mazdak
dc.contributor.author	Balaji, Purvaja
dc.date.accessioned	2025-04-14T14:08:59Z
dc.date.available	2025-04-14T14:08:59Z
dc.date.issued	2025-02
dc.date.submitted	2025-04-03T14:06:11.325Z
dc.identifier.uri	https://hdl.handle.net/1721.1/159150
dc.description.abstract	Phonotraumatic vocal hyperfunction (PVH) is a vocal disorder characterized by damaged vocal folds from excessive or abusive voice use. Clinical assessment of PVH relies on time-consuming videostroboscopy examination, which poses challenges for large-scale clinical studies. We address the need for more efficient clinical assessment tools by proposing deep learning approaches for automatically detecting PVH severity from stroboscopic images. One of the main challenges in building deep learning models for this task is a lack of labeled stroboscopy data. Motivated by this challenge, we explore two approaches: direct classification and segmentation-then-classification. In the segmentation-then-classification approach, we first train a model to segment the glottis, a clinically relevant part of the vocal fold anatomy. Then, we use the predicted segmentation along with the stroboscopic image as inputs into a classification model. This approach helps to guide the model towards key anatomical features. We achieve up to 0.53 accuracy in four-class PVH severity prediction with the direct classification approach. Incorporating glottal segmentations improves the accuracy to 0.64, underscoring the value of providing anatomically informed segmentations when assessing PVH severity. By creating an automated PVH severity tool, our work has the potential to help clinicians more efficiently monitor disease progression and to facilitate large-scale screening, thereby contributing to improved patient care.
dc.publisher	Massachusetts Institute of Technology
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
dc.rights	Copyright retained by author(s)
dc.rights.uri	https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.title	Deep Learning-Based Classification of Phonotraumatic Vocal Hyperfunction Severity from Stroboscopic Images
dc.type	Thesis
dc.description.degree	M.Eng.
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree	Master
thesis.degree.name	Master of Engineering in Electrical Engineering and Computer Science

Files in this item

Name: balaji-pbalaji-meng-eecs-2025- ...
Size: 5.263Mb
Format: PDF
Description: Thesis PDF

This item appears in the following Collection(s)

Graduate Theses
