A Homogeneous Transformer Architecture

Author(s)
Gan, Yulu; Poggio, Tomaso
Download: CBMM-Memo-143.pdf (1.067 MB)
Additional downloads
CBMM Memo 143 v2 (10/21/2024) (1.100 MB)
Abstract
While the Transformer architecture has made a substantial impact in the field of machine learning, it is unclear what purpose each component serves in the overall architecture. Heterogeneous nonlinear circuits such as multi-layer ReLU networks are interleaved with layers of softmax units. We introduce here a homogeneous architecture based on Hyper Radial Basis Function (HyperBF) units. Evaluations on CIFAR10, CIFAR100, and Tiny ImageNet demonstrate a performance comparable to standard vision transformers.
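
The abstract names HyperBF units as the homogeneous building block but gives no implementation details. As a rough illustration only (not the memo's architecture), here is a minimal PyTorch sketch of a classic HyperBF layer in the Poggio-Girosi sense: each unit is a Gaussian of a learned weighted distance to a trainable center. All names here (HyperBF, centers, W, readout) and the layer sizes are assumptions for illustration.

```python
# Minimal sketch of a Gaussian HyperBF layer, assuming the classic
# formulation f(x) = sum_i c_i * exp(-||W (x - t_i)||^2) with learnable
# centers t_i, shared weighting matrix W, and coefficients c_i.
# This is illustrative; the memo's exact layer design is not specified
# in the abstract.
import torch
import torch.nn as nn

class HyperBF(nn.Module):
    """One layer of Gaussian HyperBF units with a shared learned metric W."""

    def __init__(self, in_dim: int, n_centers: int, out_dim: int):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(n_centers, in_dim))  # centers t_i
        self.W = nn.Parameter(torch.eye(in_dim))                     # learned weighting matrix
        self.readout = nn.Linear(n_centers, out_dim, bias=False)     # coefficients c_i

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_dim); compute ||W (x - t_i)||^2 for every center.
        diff = x.unsqueeze(1) - self.centers.unsqueeze(0)  # (batch, n_centers, in_dim)
        d2 = (diff @ self.W.T).pow(2).sum(dim=-1)          # squared weighted distances
        phi = torch.exp(-d2)                               # Gaussian unit activations
        return self.readout(phi)                           # linear combination of units

# Usage sketch: apply the layer to a batch of 8 token embeddings of width 64.
layer = HyperBF(in_dim=64, n_centers=128, out_dim=64)
y = layer(torch.randn(8, 64))  # y has shape (8, 64)
```

In a homogeneous transformer of the kind the abstract describes, layers of this sort would replace the heterogeneous mix of ReLU MLPs and softmax units; how the memo arranges them is detailed in the paper itself, not here.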
Date issued
2023-09-18
URI
https://hdl.handle.net/1721.1/152178
Publisher
Center for Brains, Minds and Machines (CBMM)
Series/Report no.
CBMM Memo; 143

Collections
  • CBMM Memo Series
