Show simple item record

dc.contributor.advisor: Tomaso Poggio
dc.contributor.author: Rifkin, Ryan
dc.contributor.author: Bouvrie, Jake
dc.contributor.author: Schutte, Ken
dc.contributor.author: Chikkerur, Sharat
dc.contributor.author: Kouh, Minjoon
dc.contributor.author: Ezzat, Tony
dc.contributor.author: Poggio, Tomaso
dc.contributor.other: Center for Biological and Computational Learning (CBCL)
dc.date.accessioned: 2007-02-01T18:26:47Z
dc.date.available: 2007-02-01T18:26:47Z
dc.date.issued: 2007-02-01
dc.identifier.other: MIT-CSAIL-TR-2007-007
dc.identifier.other: CBCL-266
dc.identifier.uri: http://hdl.handle.net/1721.1/35835
dc.description.abstract: A preliminary set of experiments is described in which a biologically inspired computer vision system (Serre, Wolf et al. 2005; Serre 2006; Serre, Oliva et al. 2006; Serre, Wolf et al. 2006) designed for visual object recognition was applied to the task of phonetic classification. During learning, the system processed 2-D wideband magnitude spectrograms directly as images, producing a set of 2-D spectro-temporal patch dictionaries at different spectro-temporal positions, orientations, and scales, and of varying complexity. During testing, features were computed by comparing the stored patches with patches from novel spectrograms. Classification was performed using a regularized least squares classifier (Rifkin, Yeo et al. 2003; Rifkin, Schutte et al. 2007) trained on the features computed by the system. On a 20-class TIMIT vowel classification task, the model features achieved a best result of 58.74% error, compared to 48.57% error for state-of-the-art MFCC-based features used with the same classifier. This suggests that hierarchical, feed-forward, spectro-temporal patch-based architectures may be useful for phonetic analysis.
dc.format.extent: 16 p.
dc.format.extent: 2265616 bytes
dc.format.extent: 383591 bytes
dc.format.mimetype: application/postscript
dc.format.mimetype: application/pdf
dc.language.iso: en_US
dc.relation.ispartofseries: Massachusetts Institute of Technology Computer Science and Artificial Intelligence Laboratory
dc.relation.isreplacedby: http://hdl.handle.net/1721.1/36865
dc.relation.uri: http://hdl.handle.net/1721.1/36865
dc.subject: phonetic classification
dc.subject: hierarchical models
dc.subject: regularized least-squares
dc.subject: spectrotemporal patches
dc.title: Phonetic Classification Using Hierarchical, Feed-forward, Spectro-temporal Patch-based Architectures

