Browsing CBMM Memo Series by Title

Parsing Occluded People by Flexible Compositions

Chen, Xianjie; Yuille, Alan L. (Center for Brains, Minds and Machines (CBMM), arXiv, 2015-06-01)

This paper presents an approach to parsing humans when there is significant occlusion. We model humans using a graphical model which has a tree structure building on recent work [32, 6] and exploit the connectivity prior ...

Parsing Semantic Parts of Cars Using Graphical Models and Segment Appearance Consistency

Lu, Wenhao; Lian, Xiaochen; Yuille, Alan L. (Center for Brains, Minds and Machines (CBMM), arXiv, 2014-06-13)

This paper addresses the problem of semantic part parsing (segmentation) of cars, i.e., assigning every pixel within the car to one of the parts (e.g., body, window, lights, license plates and wheels). We formulate this as a ...

PCA as a defense against some adversaries

Gupte, Aparna; Banburski, Andrzej; Poggio, Tomaso (Center for Brains, Minds and Machines (CBMM), 2022-03-30)

Neural network classifiers are known to be highly vulnerable to adversarial perturbations in their inputs. Under the hypothesis that adversarial examples lie outside of the sub-manifold of natural images, previous work has ...

PHASE: PHysically-grounded Abstract Social Events for Machine Social Perception

Netanyahu, Aviv; Shu, Tianmin; Katz, Boris; Barbu, Andrei; Tenenbaum, Joshua B. (Center for Brains, Minds and Machines (CBMM), The Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI), 2021, 2021-03-19)

The ability to perceive and reason about social interactions in the context of physical environments is core to human social intelligence and human-machine cooperation. However, no prior dataset or benchmark has ...

Predicting Actions Before They Occur

Vaziri-Pashkam, Maryam; Cormiea, Sarah; Nakayama, Ken (Center for Brains, Minds and Machines (CBMM), 2015-10-26)

Humans are experts at reading others’ actions in social contexts. They efficiently process others’ movements in real-time to predict intended goals. Here we designed a two-person reaching task to investigate real-time body ...

Probing the compositionality of intuitive functions

Schulz, Eric; Tenenbaum, Joshua B.; Duvenaud, David; Speekenbrink, Maarten; Gershman, Samuel J. (Center for Brains, Minds and Machines (CBMM), 2016-05-26)

How do people learn about complex functional structure? Taking inspiration from other areas of cognitive science, we propose that this is accomplished by harnessing compositionality: complex structure is decomposed into ...

Reconstructing Native Language Typology from Foreign Language Usage

Berzak, Yevgeni; Reichart, Roi; Katz, Boris (Center for Brains, Minds and Machines (CBMM), arXiv, 2014-04-25)

Linguists and psychologists have long been studying cross-linguistic transfer, the influence of native language properties on linguistic performance in a foreign language. In this work we provide empirical evidence for ...

Recurrent Multimodal Interaction for Referring Image Segmentation

Liu, Chenxi; Lin, Zhe; Shen, Xiaohui; Yang, Jimei; Lu, Xin; et al. (Center for Brains, Minds and Machines (CBMM), 2018-05-10)

In this paper we are interested in the problem of image segmentation given natural language descriptions, i.e. referring expressions. Existing works tackle this problem by first modeling images and sentences independently ...

Representation Learning in Sensory Cortex: a theory

Anselmi, Fabio; Poggio, Tomaso (Center for Brains, Minds and Machines (CBMM), 2014-11-14)

We review and apply a computational theory of the feedforward path of the ventral stream in visual cortex based on the hypothesis that its main function is the encoding of invariant representations of images. A key ...

A Review of Relational Machine Learning for Knowledge Graphs

Nickel, Maximilian; Murphy, Kevin; Tresp, Volker; Gabrilovich, Evgeniy (Center for Brains, Minds and Machines (CBMM), arXiv, 2015-03-23)

Relational machine learning studies methods for the statistical analysis of relational, or graph-structured, data. In this paper, we provide a review of how such statistical models can be “trained” on large knowledge graphs, ...

Robust Estimation of 3D Human Poses from a Single Image

Wang, Chunyu; Wang, Yizhou; Lin, Zhouchen; Yuille, Alan L.; Gao, Wen (Center for Brains, Minds and Machines (CBMM), arXiv, 2014-06-10)

Human pose estimation is a key step to action recognition. We propose a method of estimating 3D human poses from a single image, which works in conjunction with an existing 2D pose/joint detector. 3D pose estimation is ...

A role for recurrent processing in object completion: neurophysiological, psychophysical and computational evidence.

Tang, Hanlin; Buia, Calin; Madsen, Joseph R.; Anderson, William S.; Kreiman, Gabriel (Center for Brains, Minds and Machines (CBMM), arXiv, 2014-04-26)

Recognition of objects from partial information presents a significant challenge for theories of vision because it requires spatial integration and extrapolation from prior knowledge. We combined neurophysiological recordings ...

Scene Graph Parsing as Dependency Parsing

Wang, Yu-Siang; Liu, Chenxi; Zeng, Xiaohui; Yuille, Alan L. (Center for Brains, Minds and Machines (CBMM), 2018-05-10)

In this paper, we study the problem of parsing structured knowledge graphs from textual descriptions. In particular, we consider the scene graph representation that considers objects together with their attributes and ...

The Secrets of Salient Object Segmentation

Li, Yin; Hou, Xiaodi; Koch, Christof; Rehg, James M.; Yuille, Alan L. (Center for Brains, Minds and Machines (CBMM), arXiv, 2014-06-13)

In this paper we provide an extensive evaluation of fixation prediction and salient object segmentation algorithms as well as statistics of major datasets. Our analysis identifies serious design flaws of existing salient ...

Seeing is Worse than Believing: Reading People’s Minds Better than Computer-Vision Methods Recognize Actions

Barbu, Andrei; Barrett, Daniel P.; Chen, Wei; Narayanaswamy, Siddharth; Xiong, Caiming; et al. (2015-12-10)

We had human subjects perform a one-out-of-six class action recognition task from video stimuli while undergoing functional magnetic resonance imaging (fMRI). Support-vector machines (SVMs) were trained on the recovered ...

Seeing What You’re Told: Sentence-Guided Activity Recognition In Video

Siddharth, Narayanaswamy; Barbu, Andrei; Siskind, Jeffrey Mark (Center for Brains, Minds and Machines (CBMM), arXiv, 2014-05-29)

We present a system that demonstrates how the compositional structure of events, in concert with the compositional structure of language, can interplay with the underlying focusing mechanisms in video action recognition, ...

Semantic Part Segmentation using Compositional Model combining Shape and Appearance

Wang, Jianyu; Yuille, Alan L. (Center for Brains, Minds and Machines (CBMM), arXiv, 2015-06-08)

In this paper, we study the problem of semantic part segmentation for animals. This is more challenging than standard object detection, object segmentation and pose estimation tasks because semantic parts of animals often ...

Sensitivity to Timing and Order in Human Visual Cortex.

Singer, Jedediah M.; Madsen, Joseph R.; Anderson, William S.; Kreiman, Gabriel (Center for Brains, Minds and Machines (CBMM), arXiv, 2014-04-25)

Visual recognition takes a small fraction of a second and relies on the cascade of signals along the ventral visual stream. Given the rapid path through multiple processing steps between photoreceptors and higher visual ...

SGD and Weight Decay Provably Induce a Low-Rank Bias in Deep Neural Networks

Galanti, Tomer; Siegel, Zachary; Gupte, Aparna; Poggio, Tomaso (Center for Brains, Minds and Machines (CBMM), 2023-02-14)

In this paper, we study the bias of Stochastic Gradient Descent (SGD) to learn low-rank weight matrices when training deep ReLU neural networks. Our results show that training neural networks with mini-batch SGD and weight ...

SGD Noise and Implicit Low-Rank Bias in Deep Neural Networks

Galanti, Tomer; Poggio, Tomaso (Center for Brains, Minds and Machines (CBMM), 2022-03-28)

We analyze deep ReLU neural networks trained with mini-batch stochastic gradient descent and weight decay. We prove that the source of the SGD noise is an implicit low-rank constraint across all of the weight matrices within ...