Browsing CBMM Memo Series by Title

Scene Graph Parsing as Dependency Parsing

Wang, Yu-Siang; Liu, Chenxi; Zeng, Xiaohui; Yuille, Alan L. (Center for Brains, Minds and Machines (CBMM), 2018-05-10)

In this paper, we study the problem of parsing structured knowledge graphs from textual descriptions. In particular, we consider the scene graph representation that considers objects together with their attributes and ...

The Secrets of Salient Object Segmentation

Li, Yin; Hou, Xiaodi; Koch, Christof; Rehg, James M.; Yuille, Alan L. (Center for Brains, Minds and Machines (CBMM), arXiv, 2014-06-13)

In this paper we provide an extensive evaluation of fixation prediction and salient object segmentation algorithms as well as statistics of major datasets. Our analysis identifies serious design flaws of existing salient ...

Seeing is Worse than Believing: Reading People’s Minds Better than Computer-Vision Methods Recognize Actions

Barbu, Andrei; Barrett, Daniel P.; Chen, Wei; Narayanaswamy, Siddharth; Xiong, Caiming; e.a. (2015-12-10)

We had human subjects perform a one-out-of-six class action recognition task from video stimuli while undergoing functional magnetic resonance imaging (fMRI). Support-vector machines (SVMs) were trained on the recovered ...

Seeing What You’re Told: Sentence-Guided Activity Recognition In Video

Siddharth, Narayanaswamy; Barbu, Andrei; Siskind, Jeffrey Mark (Center for Brains, Minds and Machines (CBMM), arXiv, 2014-05-29)

We present a system that demonstrates how the compositional structure of events, in concert with the compositional structure of language, can interplay with the underlying focusing mechanisms in video action recognition, ...

Semantic Part Segmentation using Compositional Model combining Shape and Appearance

Wang, Jianyu; Yuille, Alan L. (Center for Brains, Minds and Machines (CBMM), arXiv, 2015-06-08)

In this paper, we study the problem of semantic part segmentation for animals. This is more challenging than standard object detection, object segmentation and pose estimation tasks because semantic parts of animals often ...

Sensitivity to Timing and Order in Human Visual Cortex.

Singer, Jedediah M.; Madsen, Joseph R.; Anderson, William S.; Kreiman, Gabriel (Center for Brains, Minds and Machines (CBMM), arXiv, 2014-04-25)

Visual recognition takes a small fraction of a second and relies on the cascade of signals along the ventral visual stream. Given the rapid path through multiple processing steps between photoreceptors and higher visual ...

SGD and Weight Decay Provably Induce a Low-Rank Bias in Deep Neural Networks

Galanti, Tomer; Siegel, Zachary; Gupte, Aparna; Poggio, Tomaso (Center for Brains, Minds and Machines (CBMM), 2023-02-14)

In this paper, we study the bias of Stochastic Gradient Descent (SGD) to learn low-rank weight matrices when training deep ReLU neural networks. Our results show that training neural networks with mini-batch SGD and weight ...

SGD Noise and Implicit Low-Rank Bias in Deep Neural Networks

Galanti, Tomer; Poggio, Tomaso (Center for Brains, Minds and Machines (CBMM), 2022-03-28)

We analyze deep ReLU neural networks trained with mini-batch stochastic gradient descent and weight decay. We prove that the source of the SGD noise is an implicit low rank constraint across all of the weight matrices within ...

Simultaneous whole‐animal 3D imaging of neuronal activity using light‐field microscopy

Prevedel, Robert; Yoon, Young-Gyu; Hoffman, Maximilian; Pak, Nikita; Wetzstein, Gordon; e.a. (Center for Brains, Minds and Machines (CBMM), 2014-05-18)

High-speed, large-scale three-dimensional (3D) imaging of neuronal activity poses a major challenge in neuroscience. Here we demonstrate simultaneous functional imaging of neuronal activity at single-neuron resolution in ...

Single units in a deep neural network functionally correspond with neurons in the brain: preliminary results

Arend, Luke; Han, Yena; Schrimpf, Martin; Bashivan, Pouya; Kar, Kohitij; e.a. (Center for Brains, Minds and Machines (CBMM), 2018-11-02)

Deep neural networks have been shown to predict neural responses in higher visual cortex. The mapping from the model to a neuron in the brain occurs through a linear combination of many units in the model, leaving open the ...

Single-Shot Object Detection with Enriched Semantics

Zhang, Zhishuai; Qiao, Siyuan; Xie, Cihang; Shen, Wei; Wang, Bo; e.a. (Center for Brains, Minds and Machines (CBMM), 2018-06-19)

We propose a novel single shot object detection network named Detection with Enriched Semantics (DES). Our motivation is to enrich the semantics of object detection features within a typical deep detector, by a semantic ...

Skip Connections Increase the Capacity of Associative Memories in Variable Binding Mechanisms

Xie, Yi; Li, Yichen; Rangamani, Akshay (Center for Brains, Minds and Machines (CBMM), 2023-06-27)

The flexibility of intelligent behavior is fundamentally attributed to the ability to separate and assign structural information from content in sensory inputs. Variable binding is the atomic computation that underlies ...

Social Interactions as Recursive MDPs

Tejwani, Ravi; Kuo, Yen-Ling; Shu, Tianmin; Katz, Boris; Barbu, Andrei (Center for Brains, Minds and Machines (CBMM), Conference on Robot Learning (CoRL), 2021-11-08)

While machines and robots must interact with humans, providing them with social skills has been a largely overlooked topic. This is mostly a consequence of the fact that tasks such as navigation, command following, and ...

Spatiotemporal interpretation features in the recognition of dynamic images

Ben-Yosef, Guy; Kreiman, Gabriel; Ullman, Shimon (Center for Brains, Minds and Machines (CBMM), 2018-11-21)

Objects and their parts can be visually recognized and localized from purely spatial information in static images and also from purely temporal information as in the perception of biological motion. Cortical regions have ...

Spoken ObjectNet: A Bias-Controlled Spoken Caption Dataset

Palmer, Ian; Rouditchenko, Andrew; Barbu, Andrei; Katz, Boris; Glass, James (Center for Brains, Minds and Machines (CBMM), The 22nd Annual Conference of the International Speech Communication Association (Interspeech), 2021-08-30)

Visually-grounded spoken language datasets can enable models to learn cross-modal correspondences with very weak supervision. However, modern audio-visual datasets contain biases that undermine the real-world performance ...

Stable Foundations for Learning: a foundational framework for learning theory in both the classical and modern regime.

Poggio, Tomaso (Center for Brains, Minds and Machines (CBMM), 2020-03-25)

We consider here the class of supervised learning algorithms known as Empirical Risk Minimization (ERM). The classical theory by Vapnik and others characterizes universal consistency of ERM in the classical regime in which ...

Streaming Normalization: Towards Simpler and More Biologically-plausible Normalizations for Online and Recurrent Learning

Liao, Qianli; Kawaguchi, Kenji; Poggio, Tomaso (Center for Brains, Minds and Machines (CBMM), arXiv, 2016-10-19)

We systematically explored a spectrum of normalization algorithms related to Batch Normalization (BN) and propose a generalized formulation that simultaneously solves two major limitations of BN: (1) online learning and ...
