| dc.contributor.advisor | Shattuck-Hufnagel, Stefanie |  | 
| dc.contributor.author | Park, Janette H. |  | 
| dc.date.accessioned | 2025-10-06T17:38:30Z |  | 
| dc.date.available | 2025-10-06T17:38:30Z |  | 
| dc.date.issued | 2025-05 |  | 
| dc.date.submitted | 2025-06-23T14:03:13.330Z |  | 
| dc.identifier.uri | https://hdl.handle.net/1721.1/162991 |  | 
| dc.description.abstract | This study presents a framework for the automatic detection of the eight landmark acoustic cues in human speech. Landmarks are key articulatory events, produced as a result of minimal vocal tract constriction (e.g., vowels and glides) or closures and releases in the oral region (e.g., nasal, fricative, and stop consonants). A complete landmark detection system is a key step towards an overarching speech analysis system that relies on lexical acoustic cues, as landmarks guide the identification of other acoustic cues in speech. In the proposed framework, the acoustic properties of each of the eight landmark cues are modeled by extracting speech-related measurements and training Gaussian Mixture Models (GMMs). To remove the effects of speaker variability and different recording environments, methods for normalizing speech-related measurements are proposed and evaluated. For a new speech signal, the normalized speech-related measurements are extracted at each time frame and evaluated against the eight trained GMMs to compute the likelihood of each landmark. Using Bayes’ Theorem, the posterior probabilities are calculated to determine the most probable landmark (or absence thereof) at each time frame. The system’s performance is evaluated by comparing the detected landmarks to the manually labeled ground truth landmark annotations. |  | 
| dc.publisher | Massachusetts Institute of Technology |  | 
| dc.rights | In Copyright - Educational Use Permitted |  | 
| dc.rights | Copyright retained by author(s) |  | 
| dc.rights.uri | https://rightsstatements.org/page/InC-EDU/1.0/ |  | 
| dc.title | Automatic Detection of Landmark Acoustic Cues in Human Speech |  | 
| dc.type | Thesis |  | 
| dc.description.degree | M.Eng. |  | 
| dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science |  | 
| mit.thesis.degree | Master |  | 
| thesis.degree.name | Master of Engineering in Electrical Engineering and Computer Science |  |
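
The classification step described in the abstract (per-class GMM likelihoods combined through Bayes' Theorem to pick the most probable landmark, or its absence, at each frame) can be illustrated with a minimal sketch. This is not the thesis's implementation: the class labels `landmark_A`, `landmark_B`, and `none`, the one-dimensional toy measurements, the diagonal-covariance GMM parameters, the priors, and the choice to model "no landmark" as its own class are all assumptions made for illustration, and only three classes are shown rather than eight.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def gmm_log_likelihood(frame, gmm):
    """Log-likelihood of one measurement vector under a diagonal-covariance GMM.

    gmm is a list of (weight, means, variances) tuples, one per component
    (hypothetical representation chosen for this sketch).
    """
    component_terms = []
    for weight, means, variances in gmm:
        log_pdf = sum(
            -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
            for x, mu, var in zip(frame, means, variances)
        )
        component_terms.append(math.log(weight) + log_pdf)
    return log_sum_exp(component_terms)


def posteriors(frame, models, priors):
    """Bayes' Theorem: P(class | frame) is proportional to P(frame | class) * P(class)."""
    log_joint = {
        label: gmm_log_likelihood(frame, gmm) + math.log(priors[label])
        for label, gmm in models.items()
    }
    log_evidence = log_sum_exp(list(log_joint.values()))
    return {label: math.exp(v - log_evidence) for label, v in log_joint.items()}


# Toy, hand-picked parameters (assumptions, not trained values).
models = {
    "landmark_A": [(1.0, [0.0], [1.0])],
    "landmark_B": [(0.6, [5.0], [1.0]), (0.4, [7.0], [2.0])],
    "none": [(1.0, [-5.0], [4.0])],
}
priors = {"landmark_A": 0.1, "landmark_B": 0.1, "none": 0.8}

# One normalized measurement vector per time frame (toy values).
frames = [[0.2], [5.5], [-4.0]]

frame_posteriors = [posteriors(f, models, priors) for f in frames]
decoded = [max(p, key=p.get) for p in frame_posteriors]
print(decoded)
```

Working in the log domain keeps the computation stable when likelihoods are very small, which is common once measurement vectors have more than a few dimensions.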