dc.contributor.author | Gan, Yulu | |
dc.contributor.author | Poggio, Tomaso | |
dc.date.accessioned | 2023-09-19T15:59:39Z | |
dc.date.available | 2023-09-19T15:59:39Z | |
dc.date.issued | 2023-09-18 | |
dc.identifier.uri | https://hdl.handle.net/1721.1/152178 | |
dc.description.abstract | While the Transformer architecture has made a substantial impact in the field of machine learning, it is unclear what purpose each component serves in the overall architecture. In standard Transformers, heterogeneous nonlinear circuits such as multi-layer ReLU networks are interleaved with layers of softmax units. We introduce here a homogeneous architecture based on Hyper Radial Basis Function (HyperBF) units. Evaluations on CIFAR10, CIFAR100, and Tiny ImageNet demonstrate a performance comparable to standard vision transformers. | en_US |
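For readers unfamiliar with the HyperBF unit the abstract refers to, below is a minimal sketch assuming the classical HyperBF form f(x) = Σ_a c_a K(‖x − t_a‖_W), a Gaussian kernel over a learnable weighted norm ‖x − t‖²_W = (x − t)ᵀWᵀW(x − t). The names `centers`, `W`, and `coeffs` are illustrative assumptions, not the memo's implementation.

```python
# Sketch of a HyperBF unit: a mixture of Gaussian radial basis functions
# evaluated under a learnable metric W. Parameter names are hypothetical.
import torch
import torch.nn as nn

class HyperBF(nn.Module):
    def __init__(self, in_dim: int, num_centers: int, out_dim: int):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_centers, in_dim))  # t_a
        self.W = nn.Parameter(torch.eye(in_dim))                       # learnable metric
        self.coeffs = nn.Linear(num_centers, out_dim, bias=False)      # c_a

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_dim); distance of each input to each center under W
        diff = (x.unsqueeze(1) - self.centers.unsqueeze(0)) @ self.W.T  # (batch, K, in_dim)
        dist2 = (diff ** 2).sum(-1)                                     # ||x - t_a||_W^2
        return self.coeffs(torch.exp(-dist2))                           # Gaussian kernel mix

# usage: y = HyperBF(in_dim=64, num_centers=16, out_dim=64)(torch.randn(8, 64))
```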
dc.description.sponsorship | This material is based upon work supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216. | en_US |
dc.publisher | Center for Brains, Minds and Machines (CBMM) | en_US |
dc.relation.ispartofseries | CBMM Memo;143 | |
dc.title | A Homogeneous Transformer Architecture | en_US |
dc.type | Article | en_US |
dc.type | Technical Report | en_US |
dc.type | Working Paper | en_US |