Visual Processing in Brains and Models II
Contributed Talk Session: Friday, August 15, 12:00 – 1:00 pm, Room C1.04
Quantifying the Role of Perceived Curvature in the Processing of Natural Object Images
Talk 1, 12:00 pm – Laura Mai Stoinski1, Diego Garcia Cerdas2, Florian P. Mahner1, Parsa Yousefi, Martin N Hebart3; 1Max Planck Institute for Human Cognitive and Brain Sciences, 2University of Amsterdam, 3Justus Liebig Universität Gießen
Presenter: Laura Mai Stoinski
Curvature has been suggested as a fundamental organizational dimension of object responses. Despite its prominence, there is no consensus on how to define this measure for naturalistic object images. Here, we aimed to quantify the perceived curvature of natural images, clarify its relationship to spatial and temporal patterns of brain activity, and identify which features in an image contribute to perceived curvature. To address this, we collected extensive curvature ratings of 27,961 natural images and tested how well they explain neural responses compared to computed curvature measures. Leveraging large-scale fMRI and MEG datasets, we found that perceived curvature best explained broad occipitotemporal patterns in fMRI data and was decodable across an extended time period in MEG. To identify the object features contributing to people’s perception of curvature, we used an image-generative approach based on deep neural networks, which suggested that people considered the curvature of more global object contours in their judgements. Given the apparent validity of perceived curvature, we offer an image-computable model to quantify perceived curvature for novel images. Together, our results highlight the importance of perceived curvature as a mid-level summary statistic and provide an approach for the automated quantification of perceived curvature in natural object images.
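A minimal sketch of the comparison logic described above: given per-image curvature scores and voxel responses, compare the cross-validated variance explained by perceived versus computed curvature. All arrays below are random placeholders standing in for the ratings and fMRI betas; this illustrates the analysis logic only and is not the authors' code.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_images, n_voxels = 1000, 50                    # placeholder sizes
perceived = rng.normal(size=(n_images, 1))       # behavioral curvature ratings (placeholder)
computed = rng.normal(size=(n_images, 1))        # e.g., a contour-based curvature measure
voxels = rng.normal(size=(n_images, n_voxels))   # fMRI responses (placeholder)

def cv_r2(predictor, responses, folds=5):
    """Cross-validated R^2 of a single curvature predictor, per voxel."""
    model = RidgeCV(alphas=np.logspace(-3, 3, 13))
    return np.array([
        cross_val_score(model, predictor, responses[:, v],
                        cv=folds, scoring="r2").mean()
        for v in range(responses.shape[1])
    ])

r2_perceived = cv_r2(perceived, voxels)
r2_computed = cv_r2(computed, voxels)
print(f"voxels better explained by perceived curvature: "
      f"{(r2_perceived > r2_computed).mean():.0%}")
```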
Shared high-dimensional latent structure in the neural and mental representations of objects
Talk 2, 12:10 pm – Raj Magesh Gauthaman1, Michael Bonner1; 1Johns Hopkins University
Presenter: Raj Magesh Gauthaman
Recent work has demonstrated that visual cortex representations of natural scenes are high-dimensional, with a power-law spectrum of stimulus-related variance. However, the statistical structure of the mental representations underlying visual behavior remains unknown — is there a limited subset of latent dimensions that fully captures human behavior on a visual task? Here, we investigate the dimensionality of visual object representations in the human mind and brain by analyzing behavioral and fMRI responses from the large-scale THINGS-data collection using spectral decomposition methods. First, we find that neural representations of objects have a high-dimensional power-law structure throughout visual cortex, replicating previous findings for natural scenes. Next, we show that mental representations of objects, inferred directly from human similarity judgments, have an underlying power-law covariance spectrum, consistent with the power-law structure observed in neural representations of these stimuli. Finally, we show that the dimensionality of shared mental and neural representations increases systematically over stages of visual processing from V1 to hV4 to LOC. Our results suggest that a shared high-dimensional latent structure underlies both mental and neural representations of objects.
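The power-law claim is testable with a simple spectral analysis: compute the eigenspectrum of the stimulus-by-unit response covariance and fit its slope in log-log space. The sketch below runs on synthetic data with a built-in 1/rank spectrum to show the estimation; real analyses typically use cross-validated (e.g., split-half) spectra to isolate stimulus-related variance, which this toy version omits.

```python
import numpy as np

rng = np.random.default_rng(0)
n_stimuli, n_units = 2000, 500
# Placeholder responses with a built-in ~1/rank eigenspectrum for illustration.
col_sd = np.arange(1, n_units + 1) ** -0.5
responses = rng.normal(size=(n_stimuli, n_units)) * col_sd

# Eigenvalues of the response covariance, via singular values of the centered data.
responses -= responses.mean(axis=0)
eigvals = np.linalg.svd(responses, compute_uv=False) ** 2 / n_stimuli

ranks = np.arange(1, len(eigvals) + 1)
sel = slice(10, 300)  # fit over mid-range ranks, away from edge effects
alpha, _ = np.polyfit(np.log(ranks[sel]), np.log(eigvals[sel]), 1)
print(f"estimated power-law exponent: {alpha:.2f}")  # close to -1 here by construction
```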
A 7T fMRI dataset of synthetic images for out-of-distribution modeling of vision
Talk 3, 12:20 pm – Alessandro Thomas Gifford1, Radoslaw Martin Cichy1, Thomas Naselaris2, Kendrick Kay3; 1Freie Universität Berlin, 2University of Minnesota, Minneapolis, 3University of Minnesota - Twin Cities
Presenter: Alessandro Thomas Gifford
Large-scale visual neural datasets such as the Natural Scenes Dataset (NSD) are boosting NeuroAI research by enabling computational models of the brain with performance beyond what was possible just a decade ago. However, these datasets lack out-of-distribution (OOD) components, which are crucial for the development of more robust models. Here, we address this limitation by releasing NSD-synthetic, a dataset consisting of 7T fMRI responses from the eight NSD subjects for 284 carefully controlled synthetic images. We show that NSD-synthetic’s fMRI responses are OOD with respect to NSD, that brain encoding models exhibit reduced performance when tested OOD on NSD-synthetic compared to when tested in-distribution (ID) on NSD, and that OOD tests on NSD-synthetic reveal differences between encoding models that ID tests do not detect: specifically, self-supervised deep neural networks better explain neural responses than their task-supervised counterparts. These results showcase how NSD-synthetic enables OOD generalization tests that facilitate the development of more robust models of visual processing and the formulation of more accurate theories of human vision.
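The ID-versus-OOD comparison reduces to fitting an encoding model on NSD and scoring it both on held-out NSD trials and on NSD-synthetic. A hedged sketch with placeholder arrays follows (in practice the features would come from a DNN applied to the images, and the responses would be the corresponding fMRI betas); it shows the evaluation logic only, not the released analysis code.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
feat_nsd = rng.normal(size=(5000, 300))   # DNN features of NSD images (placeholder)
resp_nsd = rng.normal(size=(5000, 100))   # fMRI responses to NSD images (placeholder)
feat_syn = rng.normal(size=(284, 300))    # features of the 284 synthetic images
resp_syn = rng.normal(size=(284, 100))    # responses to the synthetic images

# Fit the encoding model in-distribution, hold out some NSD trials for ID testing.
X_tr, X_te, y_tr, y_te = train_test_split(feat_nsd, resp_nsd, random_state=0)
model = Ridge(alpha=1.0).fit(X_tr, y_tr)

def mean_voxel_corr(pred, true):
    """Average Pearson correlation between predicted and measured voxel responses."""
    pz = (pred - pred.mean(0)) / pred.std(0)
    tz = (true - true.mean(0)) / true.std(0)
    return (pz * tz).mean(0).mean()

print(f"ID  (NSD held-out):  r = {mean_voxel_corr(model.predict(X_te), y_te):.3f}")
print(f"OOD (NSD-synthetic): r = {mean_voxel_corr(model.predict(feat_syn), resp_syn):.3f}")
```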
Encoding of Fixation-Specific Visual Information: No Evidence of Information Carry-Over between Fixations
Talk 4, 12:30 pm – Carmen Amme1, Philip Sulewski2, Malin Braatz1, Peter König1, Tim C Kietzmann1; 1Universität Osnabrück, 2Institute of Cognitive Science, Universität Osnabrück
Presenter: Carmen Amme
Humans make multiple eye movements each second to sample visual information from different locations in space. This information is integrated to form a single, coherent percept. Here, we investigate whether fixation-specific visual information is encoded in the corresponding neural data, and whether this information is carried over to the subsequent fixation. We successfully encoded fixation-specific neural responses using deep neural network features extracted from fixation patches, with encoding performance peaking at around 125 ms after fixation onset in occipital sensors. Encoding in source space revealed peak performance along the left dorsal visual pathway at 100 ms. Incorporating model representations from both the previous and current fixation patches did not improve encoding performance, suggesting no carry-over of visual information between fixations. By demonstrating the feasibility of encoding naturalistic stimulus features during active vision in humans, we open new avenues for investigating how the brain constructs coherent percepts despite processing visual information in discrete, fixation-specific fragments.
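The carry-over test amounts to asking whether features of the previous fixation patch add predictive power beyond features of the current patch. A minimal sketch on placeholder data follows; the simulated response here depends only on the current patch by construction, so the combined model should show no reliable gain, mirroring the null result reported above. This is an illustration of the logic, not the authors' MEG pipeline.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_fix, n_feat = 3000, 100
feat_current = rng.normal(size=(n_fix, n_feat))   # DNN features, current fixation patch
feat_previous = rng.normal(size=(n_fix, n_feat))  # DNN features, previous fixation patch
# Simulated response for one sensor/time bin: driven by the current patch only.
response = feat_current @ rng.normal(size=n_feat) + rng.normal(size=n_fix)

model = RidgeCV(alphas=np.logspace(-2, 4, 13))
r2_current = cross_val_score(model, feat_current, response,
                             cv=5, scoring="r2").mean()
r2_both = cross_val_score(model, np.hstack([feat_current, feat_previous]),
                          response, cv=5, scoring="r2").mean()
print(f"current only: {r2_current:.3f}   current+previous: {r2_both:.3f}")
# No improvement from the combined model would suggest no carry-over.
```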
A hierarchy of spatial predictions across human visual cortex during natural vision
Talk 5, 12:40 pm – Wieger H. Scheurer1, Micha Heilbron2; 1Donders Institute for Brain, Cognition and Behaviour, 2University of Amsterdam
Presenter: Wieger H. Scheurer
The predictive processing framework posits that the brain constantly compares incoming sensory signals with self-generated predictions. However, evidence for prediction in sensory cortex mostly comes from artificial paradigms with simple, highly predictable stimuli, leaving it unclear whether the reported sensory prediction effects generalise to perception more broadly. Here, we probe predictions in naturalistic perception, analysing high-resolution 7T functional magnetic resonance imaging (fMRI) responses of human participants viewing tens of thousands of natural scenes. We use deep generative models to quantify the inherent spatial predictability of image patches and relate the resulting estimates to brain activity. Our results reveal robust and widespread predictability modulations of BOLD responses across the visual cortex. Higher visual areas were sensitive to higher-level predictability, forming a prediction hierarchy. Effects were stronger in voxels with higher-eccentricity receptive fields, aligning with predictive coding and Bayesian theories. These results demonstrate the ubiquity of prediction in vision and inform neurocomputational models of predictive coding and self-supervised learning in the brain.
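To illustrate the pipeline, the sketch below estimates the spatial predictability of an image patch from its surround and then relates per-image predictability scores to (placeholder) BOLD responses. The study uses deep generative models for the predictability estimates; a simple ridge-regression inpainter on random placeholder images stands in here purely to show the logic, not to reproduce the method.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_images, size = 500, 16
images = rng.normal(size=(n_images, size, size))  # placeholder image patches

# Target: central 4x4 region; context: all surrounding pixels.
mask = np.zeros((size, size), dtype=bool)
mask[6:10, 6:10] = True
context = images[:, ~mask]   # flattened surround pixels, shape (n_images, 240)
target = images[:, mask]     # flattened center pixels, shape (n_images, 16)

# Fit the inpainter on half the images, score predictability on the rest.
half = n_images // 2
inpainter = Ridge(alpha=10.0).fit(context[:half], target[:half])
err = ((inpainter.predict(context[half:]) - target[half:]) ** 2).mean(axis=1)
predictability = -err        # low reconstruction error = high predictability

# Relate per-image predictability to (placeholder) voxel responses.
bold = rng.normal(size=n_images - half)
slope = np.polyfit(predictability, bold, 1)[0]
print(f"predictability modulation (slope on BOLD): {slope:.3f}")
```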