
dc.contributor.author: Gan, Yulu
dc.contributor.author: Poggio, Tomaso
dc.date.accessioned: 2023-09-19T15:59:39Z
dc.date.available: 2023-09-19T15:59:39Z
dc.date.issued: 2023-09-18
dc.identifier.uri: https://hdl.handle.net/1721.1/152178
dc.description.abstract: While the Transformer architecture has made a substantial impact in the field of machine learning, it is unclear what purpose each component serves in the overall architecture. Heterogeneous nonlinear circuits such as multi-layer ReLU networks are interleaved with layers of softmax units. Here we introduce a homogeneous architecture based on Hyper Radial Basis Function (HyperBF) units. Evaluations on CIFAR10, CIFAR100, and Tiny ImageNet demonstrate performance comparable to standard vision transformers. [en_US]
dc.description.sponsorship: This material is based upon work supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216. [en_US]
dc.publisher: Center for Brains, Minds and Machines (CBMM) [en_US]
dc.relation.ispartofseries: CBMM Memo;143
dc.title: A Homogeneous Transformer Architecture [en_US]
dc.type: Article [en_US]
dc.type: Technical Report [en_US]
dc.type: Working Paper [en_US]
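
The abstract above refers to Hyper Radial Basis Function (HyperBF) units. As a minimal sketch only, not the authors' code from the memo: a HyperBF unit in the sense of Poggio and Girosi computes Gaussian radial activations under a learned metric W, f(x) = sum_i c_i * exp(-||W(x - t_i)||^2). The class name, parameter shapes, and use of PyTorch here are illustrative assumptions.

import torch
import torch.nn as nn

class HyperBF(nn.Module):
    """Sketch of a HyperBF layer: Gaussian RBF units with a learned metric.

    Assumed form (not taken from the memo's implementation):
        f(x) = sum_i c_i * exp(-||W (x - t_i)||^2)
    """

    def __init__(self, in_dim: int, num_centers: int, out_dim: int):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_centers, in_dim))  # t_i
        self.W = nn.Parameter(torch.eye(in_dim))                       # learned metric
        self.coeffs = nn.Parameter(torch.randn(num_centers, out_dim))  # c_i

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_dim) -> diff: (batch, num_centers, in_dim)
        diff = x.unsqueeze(1) - self.centers.unsqueeze(0)
        # squared distance to each center under the learned metric W
        d2 = (diff @ self.W.T).pow(2).sum(-1)   # (batch, num_centers)
        activations = torch.exp(-d2)            # Gaussian radial units
        return activations @ self.coeffs        # (batch, out_dim)

# Example usage: out = HyperBF(64, 16, 64)(torch.randn(8, 64))  # shape (8, 64)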

