Visual Processing in Brains and Models II
Contributed Talk Session: Friday, August 15, 12:00 – 1:00 pm, Room C1.04
Quantifying the Role of Perceived Curvature in the Processing of Natural Object Images
Talk 1, 12:00 pm – Laura Mai Stoinski1, Diego Garcia Cerdas2, Florian P. Mahner1, Parsa Yousefi, Martin N Hebart3; 1Max Planck Institute for Human Cognitive and Brain Sciences, 2University of Amsterdam, 3Justus Liebig Universität Gießen
Presenter: Laura Mai Stoinski
Curvature has been suggested as a fundamental organizational dimension of object responses. Despite its prominence, there is no consensus on how to define this measure for naturalistic object images. Here, we aimed to quantify the perceived curvature of natural images, clarify its relationship to spatial and temporal patterns of brain activity, and identify which features in an image contribute to perceived curvature. To address this, we collected extensive curvature ratings of 27,961 natural images and tested how well they explain neural responses compared to computed curvature measures. Leveraging large-scale fMRI and MEG datasets, we found that perceived curvature best explained broad occipitotemporal patterns in fMRI data and was decodable across an extended time period in MEG. To identify the object features contributing to people’s perception of curvature, we used an image-generative approach based on deep neural networks, which suggested that people considered the curvature of more global object contours in their judgements. Given the apparent validity of perceived curvature, we offer an image-computable model to quantify perceived curvature for novel images. Together, our results highlight the importance of perceived curvature as a mid-level summary statistic and provide an approach for the automated quantification of perceived curvature in natural object images.
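A minimal sketch of the comparison logic described above: given per-image curvature scores and voxel responses, compare the cross-validated variance explained by perceived versus computed curvature. All arrays below are random placeholders standing in for the ratings and fMRI betas; this illustrates the analysis logic only and is not the authors' code.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_images, n_voxels = 1000, 50                    # placeholder sizes
perceived = rng.normal(size=(n_images, 1))       # behavioral curvature ratings (placeholder)
computed = rng.normal(size=(n_images, 1))        # e.g., a contour-based curvature measure
voxels = rng.normal(size=(n_images, n_voxels))   # fMRI responses (placeholder)

def cv_r2(predictor, responses, folds=5):
    """Cross-validated R^2 of a single curvature predictor, per voxel."""
    model = RidgeCV(alphas=np.logspace(-3, 3, 13))
    return np.array([
        cross_val_score(model, predictor, responses[:, v],
                        cv=folds, scoring="r2").mean()
        for v in range(responses.shape[1])
    ])

r2_perceived = cv_r2(perceived, voxels)
r2_computed = cv_r2(computed, voxels)
print(f"voxels better explained by perceived curvature: "
      f"{(r2_perceived > r2_computed).mean():.0%}")
```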
Shared high-dimensional latent structure in the neural and mental representations of objects
Talk 2, 12:10 pm – Raj Magesh Gauthaman1, Michael Bonner1; 1Johns Hopkins University
Presenter: Raj Magesh Gauthaman
Recent work has demonstrated that visual cortex representations of natural scenes are high-dimensional, with a power-law spectrum of stimulus-related variance. However, the statistical structure of the mental representations underlying visual behavior remains unknown — is there a limited subset of latent dimensions that fully captures human behavior on a visual task? Here, we investigate the dimensionality of visual object representations in the human mind and brain by analyzing behavioral and fMRI responses from the large-scale THINGS-data collection using spectral decomposition methods. First, we find that neural representations of objects have a high-dimensional power-law structure throughout visual cortex, replicating previous findings for natural scenes. Next, we show that mental representations of objects, inferred directly from human similarity judgments, have an underlying power-law covariance spectrum, consistent with the power-law structure observed in neural representations of these stimuli. Finally, we show that the dimensionality of shared mental and neural representations increases systematically over stages of visual processing from V1 to hV4 to LOC. Our results suggest that a shared high-dimensional latent structure underlies both mental and neural representations of objects.
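The power-law claim is testable with a simple spectral analysis: compute the eigenspectrum of the stimulus-by-unit response covariance and fit its slope in log-log space. The sketch below runs on synthetic data with a built-in 1/rank spectrum to show the estimation; real analyses typically use cross-validated (e.g., split-half) spectra to isolate stimulus-related variance, which this toy version omits.

```python
import numpy as np

rng = np.random.default_rng(0)
n_stimuli, n_units = 2000, 500
# Placeholder responses with a built-in ~1/rank eigenspectrum for illustration.
col_sd = np.arange(1, n_units + 1) ** -0.5
responses = rng.normal(size=(n_stimuli, n_units)) * col_sd

# Eigenvalues of the response covariance, via singular values of the centered data.
responses -= responses.mean(axis=0)
eigvals = np.linalg.svd(responses, compute_uv=False) ** 2 / n_stimuli

ranks = np.arange(1, len(eigvals) + 1)
sel = slice(10, 300)  # fit over mid-range ranks, away from edge effects
alpha, _ = np.polyfit(np.log(ranks[sel]), np.log(eigvals[sel]), 1)
print(f"estimated power-law exponent: {alpha:.2f}")  # close to -1 here by construction
```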
A 7T fMRI dataset of synthetic images for out-of-distribution modeling of vision
Talk 3, 12:20 pm – Alessandro Thomas Gifford1, Radoslaw Martin Cichy1, Thomas Naselaris2, Kendrick Kay3; 1Freie Universität Berlin, 2University of Minnesota, Minneapolis, 3University of Minnesota - Twin Cities
Presenter: Alessandro Thomas Gifford
Large-scale visual neural datasets such as the Natural Scenes Dataset (NSD) are boosting NeuroAI research by enabling computational models of the brain with performance beyond what was possible just a decade ago. However, these datasets lack out-of-distribution (OOD) components, which are crucial for the development of more robust models. Here, we address this limitation by releasing NSD-synthetic, a dataset consisting of 7T fMRI responses from the eight NSD subjects for 284 carefully controlled synthetic images. We show that NSD-synthetic’s fMRI responses are OOD with respect to NSD, that brain encoding models exhibit reduced performance when tested OOD on NSD-synthetic compared to when tested in-distribution (ID) on NSD, and that OOD tests on NSD-synthetic reveal differences between encoding models that ID tests do not detect: specifically, self-supervised deep neural networks better explain neural responses than their task-supervised counterparts. These results showcase how NSD-synthetic enables OOD generalization tests that facilitate the development of more robust models of visual processing and the formulation of more accurate theories of human vision.
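The ID-versus-OOD comparison reduces to fitting an encoding model on NSD and scoring it both on held-out NSD trials and on NSD-synthetic. A hedged sketch with placeholder arrays follows (in practice the features would come from a DNN applied to the images, and the responses would be the corresponding fMRI betas); it shows the evaluation logic only, not the released analysis code.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
feat_nsd = rng.normal(size=(5000, 300))   # DNN features of NSD images (placeholder)
resp_nsd = rng.normal(size=(5000, 100))   # fMRI responses to NSD images (placeholder)
feat_syn = rng.normal(size=(284, 300))    # features of the 284 synthetic images
resp_syn = rng.normal(size=(284, 100))    # responses to the synthetic images

# Fit the encoding model in-distribution, hold out some NSD trials for ID testing.
X_tr, X_te, y_tr, y_te = train_test_split(feat_nsd, resp_nsd, random_state=0)
model = Ridge(alpha=1.0).fit(X_tr, y_tr)

def mean_voxel_corr(pred, true):
    """Average Pearson correlation between predicted and measured voxel responses."""
    pz = (pred - pred.mean(0)) / pred.std(0)
    tz = (true - true.mean(0)) / true.std(0)
    return (pz * tz).mean(0).mean()

print(f"ID  (NSD held-out):  r = {mean_voxel_corr(model.predict(X_te), y_te):.3f}")
print(f"OOD (NSD-synthetic): r = {mean_voxel_corr(model.predict(feat_syn), resp_syn):.3f}")
```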
Encoding of Fixation-Specific Visual Information: No Evidence of Information Carry-Over between Fixations
Talk 4, 12:30 pm – Carmen Amme1, Philip Sulewski2, Malin Braatz1, Peter König1, Tim C Kietzmann1; 1Universität Osnabrück, 2Institute of Cognitive Science, Universität Osnabrück
Presenter: Carmen Amme
Humans make multiple eye movements each second to sample visual information from different locations in space. This information is integrated to form a single, coherent percept. Here, we investigate whether fixation-specific visual information is encoded in the corresponding neural data, and whether this information is carried over to the subsequent fixation. We successfully encoded fixation-specific neural responses using deep neural network features extracted from fixation patches, with encoding performance peaking at around 125 ms after fixation onset in occipital sensors. Encoding in source space revealed peak performance along the left dorsal visual pathway at 100 ms. Incorporating model representations from both the previous and current fixation patches did not improve encoding performance, suggesting no carry-over of visual information between fixations. By demonstrating the feasibility of encoding naturalistic stimulus features during active vision in humans, we open new avenues for investigating how the brain constructs coherent percepts despite processing visual information in discrete, fixation-specific fragments.
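The carry-over test amounts to asking whether features of the previous fixation patch add predictive power beyond features of the current patch. A minimal sketch on placeholder data follows; the simulated response here depends only on the current patch by construction, so the combined model should show no reliable gain, mirroring the null result reported above. This is an illustration of the logic, not the authors' MEG pipeline.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_fix, n_feat = 3000, 100
feat_current = rng.normal(size=(n_fix, n_feat))   # DNN features, current fixation patch
feat_previous = rng.normal(size=(n_fix, n_feat))  # DNN features, previous fixation patch
# Simulated response for one sensor/time bin: driven by the current patch only.
response = feat_current @ rng.normal(size=n_feat) + rng.normal(size=n_fix)

model = RidgeCV(alphas=np.logspace(-2, 4, 13))
r2_current = cross_val_score(model, feat_current, response,
                             cv=5, scoring="r2").mean()
r2_both = cross_val_score(model, np.hstack([feat_current, feat_previous]),
                          response, cv=5, scoring="r2").mean()
print(f"current only: {r2_current:.3f}   current+previous: {r2_both:.3f}")
# No improvement from the combined model would suggest no carry-over.
```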
A hierarchy of spatial predictions across human visual cortex during natural vision
Talk 5, 12:40 pm – Wieger H. Scheurer1, Micha Heilbron2; 1Donders Institute for Brain, Cognition and Behaviour, 2University of Amsterdam
Presenter: Wieger H. Scheurer
The predictive processing framework posits that the brain constantly compares incoming sensory signals with self-generated predictions. However, evidence for prediction in sensory cortex mostly comes from artificial paradigms with simple, highly predictable stimuli, leaving it unclear whether the reported sensory prediction effects generalise to perception more broadly. Here, we probe predictions in naturalistic perception, analysing high-resolution 7T functional magnetic resonance imaging (fMRI) responses of human participants viewing tens of thousands of natural scenes. We use deep generative models to quantify the inherent spatial predictability of image patches and relate the resulting estimates to brain activity. Our results reveal robust and widespread predictability modulations of BOLD responses across the visual cortex. Higher visual areas were sensitive to higher-level predictability, forming a prediction hierarchy. Effects were stronger in voxels with higher-eccentricity receptive fields, aligning with predictive coding and Bayesian theories. These results demonstrate the ubiquity of prediction in vision and inform neurocomputational models of predictive coding and self-supervised learning in the brain.
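To illustrate the pipeline, the sketch below estimates the spatial predictability of an image patch from its surround and then relates per-image predictability scores to (placeholder) BOLD responses. The study uses deep generative models for the predictability estimates; a simple ridge-regression inpainter on random placeholder images stands in here purely to show the logic, not to reproduce the method.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_images, size = 500, 16
images = rng.normal(size=(n_images, size, size))  # placeholder image patches

# Target: central 4x4 region; context: all surrounding pixels.
mask = np.zeros((size, size), dtype=bool)
mask[6:10, 6:10] = True
context = images[:, ~mask]   # flattened surround pixels, shape (n_images, 240)
target = images[:, mask]     # flattened center pixels, shape (n_images, 16)

# Fit the inpainter on half the images, score predictability on the rest.
half = n_images // 2
inpainter = Ridge(alpha=10.0).fit(context[:half], target[:half])
err = ((inpainter.predict(context[half:]) - target[half:]) ** 2).mean(axis=1)
predictability = -err        # low reconstruction error = high predictability

# Relate per-image predictability to (placeholder) voxel responses.
bold = rng.normal(size=n_images - half)
slope = np.polyfit(predictability, bold, 1)[0]
print(f"predictability modulation (slope on BOLD): {slope:.3f}")
```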