Audition and Language
Contributed Talk Session: Thursday, August 14, 11:00 am – 12:00 pm, Room C1.03
A Two-Dimensional Space of Linguistic Representations Shared Across Individuals
Talk 1, 11:00 am – Greta Tuckute1, Elizabeth J. Lee1, Yongtian Ou2, Evelina Fedorenko1, Kendrick Kay2; 1Massachusetts Institute of Technology, 2University of Minnesota - Twin Cities
Presenter: Greta Tuckute
Humans learn and use language in diverse ways, yet all typically developing individuals acquire at least one language and use it to communicate complex ideas. This fundamental ability raises a key question: Which dimensions of language processing are shared across brains, and how are these dimensions organized in the human cortex? To address these questions, we collected ultra-high-field (7T) fMRI data while eight participants listened to 200 linguistically diverse sentences. To identify the main components of variance in the sentence-evoked brain responses, we performed data decomposition and systematically tested which components generalize across individuals. Only two shared components emerged robustly, together accounting for about 32% of the explainable variance. Analysis of linguistic feature preferences showed that the first component corresponds to processing difficulty and the second to meaning abstractness. Both components are spatially distributed across frontal and temporal areas associated with language processing but, surprisingly, also extend into the ventral visual cortex. These findings reveal a low-dimensional, spatially structured representational basis for language processing shared across humans.
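A minimal sketch of the cross-subject generalization logic described above (not the authors' pipeline): decompose each subject's sentence-by-voxel response matrix with PCA and ask whether a component's sentence profile in a held-out subject matches the profile averaged over the remaining subjects. All array sizes and data are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_sentences, n_voxels, n_comp = 8, 200, 500, 10

# Placeholder data: in practice these would be sentence-evoked response
# estimates per voxel for each subject.
data = [rng.standard_normal((n_sentences, n_voxels)) for _ in range(n_subjects)]

def pca_scores(X, k):
    """Sentence-wise scores of the top-k principal components of X."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * S[:k]  # shape: (n_sentences, k)

scores = [pca_scores(X, n_comp) for X in data]

# Leave-one-subject-out test: correlate a component's sentence profile in the
# held-out subject with the corresponding profile averaged over the others.
# (A real analysis would also handle the sign and ordering ambiguities of PCA
# when matching components across subjects.)
for k in range(n_comp):
    rs = []
    for s in range(n_subjects):
        others = np.mean(
            [scores[o][:, k] for o in range(n_subjects) if o != s], axis=0)
        rs.append(np.corrcoef(scores[s][:, k], others)[0, 1])
    print(f"component {k}: mean cross-subject r = {np.mean(rs):.2f}")
```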
Disentangling of Spoken Words and Talker Identity in Human Auditory Cortex
Talk 2, 11:10 am – Akhil Bandreddi1, Dana Boebinger1, David Skrill1, Kirill V Nourski2, Matthew Howard, Christopher Garcia, Thomas Wychowski, Webster Pilcher, Samuel Victor Norman-Haignere3; 1University of Rochester, 2The University of Iowa, 3University of Rochester Medical Center
Presenter: Akhil Bandreddi
Complex natural sounds such as speech contain many different types of information, but recognizing these distinct information sources is computationally challenging because sounds with shared information differ widely in their acoustics. For example, variation across talkers makes it challenging to recognize the identity of a word, while variation in the acoustics of different words makes it challenging to recognize talker identity. How does the human auditory cortex disentangle word identity from talker identity such that each type of information can be coded invariant to acoustic variation in all other information sources? To address this question, we measured neural responses to a diverse set of 338 words spoken by 32 different talkers using spatiotemporally precise intracranial recordings from the human auditory cortex. We developed a simple set of model-free experimental metrics for quantifying representational disentangling of word and talker identity, both within individual electrodes as well as across different dimensions of the neural population response. We observed individual electrodes that showed a representation of words that was partially robust to acoustic variation in talker identity, but no electrodes or brain regions showed a robust representation of talker identity. However, at the population level, we observed distinct dimensions of the neural response that nearly exclusively reflected either words or talker identity and were completely invariant to acoustic variation in the non-target dimension. These results suggest that while there is partial specialization for talker-robust word identity in localized brain regions, robust disentangling is accomplished at the population level, with distinct representations of words and talker identity mapped to distinct dimensions of the neural code for speech.
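One way to picture a model-free disentangling metric of this kind (an assumed illustration, not the authors' metric): for a single electrode or population dimension, word tuning is talker-robust if the word profile estimated from one half of the talkers correlates with the profile estimated from the other half, and analogously for talker tuning across word halves. The response matrix below is a placeholder.

```python
import numpy as np

rng = np.random.default_rng(1)
n_words, n_talkers = 338, 32

# Placeholder: resp[w, t] = response of one electrode (or one population
# dimension) to word w spoken by talker t.
resp = rng.standard_normal((n_words, n_talkers))

def split_half_r(matrix, axis):
    """Correlate profiles along `axis` estimated from two disjoint halves of
    the other axis (e.g., word profiles from two halves of the talkers)."""
    other = 1 - axis
    idx = rng.permutation(matrix.shape[other])
    half1, half2 = np.array_split(idx, 2)
    p1 = matrix.take(half1, axis=other).mean(axis=other)
    p2 = matrix.take(half2, axis=other).mean(axis=other)
    return np.corrcoef(p1, p2)[0, 1]

word_robustness   = split_half_r(resp, axis=0)  # word tuning across talker halves
talker_robustness = split_half_r(resp, axis=1)  # talker tuning across word halves
print(word_robustness, talker_robustness)
```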
Decomposition of uncertainty into dispersion and strength during speech processing
Talk 3, 11:20 am – Pierre Guilleminot1, Benjamin Morillon; 1Université d'Aix-Marseille
Presenter: Pierre Guilleminot
Speech comprehension relies on contextual predictions to minimize error between expected and incoming auditory input. This mechanism hinges on linguistic uncertainty, typically quantified using Shannon entropy, which captures the overall unpredictability of a given word based on prior context. However, Shannon entropy is only one member of the broader Rényi entropy family, which enables a distinction between uncertainty due to the strength of dominant predictions and that due to the dispersion across alternatives. Here, we investigated which entropy-based measure of uncertainty best reflects neural processing during speech listening. Using intracortical recordings from subjects listening to an audiobook, we computed the mutual information between neural responses and different uncertainty measures derived from a large language model. Our results reveal that, rather than Shannon entropy, the brain separately processes the strength and dispersion of predictions in distinct neural populations. This suggests a multidimensional representation of uncertainty. More generally, these findings highlight the need for a more refined definition of uncertainty in cognitive neuroscience.
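For reference, a minimal sketch of the Rényi entropy family mentioned above, applied to a hypothetical next-word probability distribution such as one produced by a language model. The alpha -> 1 limit recovers Shannon entropy; large alpha (min-entropy) is governed by the strength of the single dominant prediction, while alpha -> 0 reflects the dispersion, i.e., how many alternatives carry any probability at all. The example distribution is illustrative only.

```python
import numpy as np

def renyi_entropy(p, alpha):
    """Rényi entropy (in bits) of a discrete distribution p for a given alpha."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    if np.isclose(alpha, 1.0):                   # Shannon limit
        return float(-np.sum(p * np.log2(p)))
    return float(np.log2(np.sum(p ** alpha)) / (1.0 - alpha))

# Example: one strong candidate word plus 50 weak alternatives.
p = np.array([0.5] + [0.01] * 50)

shannon    = renyi_entropy(p, 1.0)   # overall unpredictability
strength   = -np.log2(p.max())       # min-entropy (alpha -> infinity)
dispersion = renyi_entropy(p, 0.0)   # Hartley entropy (alpha -> 0): log2 of #alternatives
print(shannon, strength, dispersion)
```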
Pupil-linked arousal tracks belief updating in a dynamic auditory environment
Talk 4, 11:30 am – Lars Kopel1, Evi van Gastel, Peter Murphy2, Simon van Gaal, Mototaka Suzuki, Jan Willem de Gee1; 1University of Amsterdam, 2National University of Ireland, Maynooth
Presenter: Lars Kopel
Bayesian theory prescribes that highly surprising outcomes should prompt rapid belief updating. Here we developed a novel passive belief updating protocol, suitable for humans and potentially mice. We observed multiple pupil-based signatures of belief updating: (i) pupil response magnitude was enhanced for stimuli that were unexpected given the recent history, and (ii) the relationship between pupil response magnitude and Bayesian change-point probability (surprise), derived from a normative belief updating model, reflected the participant’s belief about the volatility of the environment. We conclude that phasic arousal supports belief updating when encountering unexpected incoming information.
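As a generic illustration of the change-point probability (surprise) quantity referenced above (not necessarily the authors' model), a reduced Bayesian observer for a Gaussian environment compares how likely an observation is under the current belief versus under a change of the generative mean. The hazard rate, noise levels, and outcome range below are assumed values.

```python
import numpy as np
from scipy.stats import norm

def change_point_probability(x, belief_mean, belief_sd, noise_sd,
                             hazard=0.1, outcome_range=(0.0, 1.0)):
    """Posterior probability that observation x followed a change point."""
    lo, hi = outcome_range
    p_if_change = hazard * (1.0 / (hi - lo))          # outcome uninformative after a jump
    p_if_stay = (1.0 - hazard) * norm.pdf(
        x, loc=belief_mean, scale=np.hypot(belief_sd, noise_sd))
    return p_if_change / (p_if_change + p_if_stay)

# Highly surprising outcomes yield a change-point probability near 1 and
# should drive large belief updates (and, per the abstract, larger pupil
# responses).
print(change_point_probability(0.95, belief_mean=0.3, belief_sd=0.05, noise_sd=0.05))
```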
Auditory stimuli extend the temporal window of visual integration by modulating alpha-band oscillations
Talk 5, 11:40 am – Mengting Xu1, Biao Han2, Qi Chen, Lu Shen; 1Universiteit Gent, 2South China Normal University
Presenter: Mengting Xu
In multisensory environments, how inputs from different sensory modalities interact to shape perception is not fully understood. In this study, we investigated how auditory stimuli influence the temporal dynamics of visual processing using electroencephalography (EEG). We found that the presence of auditory stimuli led to a reduction in poststimulus alpha frequency, which positively correlated with a prolonged temporal window of visual integration. This was accompanied by a diminished predictive role of prestimulus alpha frequency and an enhanced predictive role of prestimulus alpha phase in shaping perceptual outcomes. To probe the underlying mechanisms, we developed a computational model that successfully replicated the core findings and revealed that auditory input extends the temporal window of visual integration by resetting alpha oscillations in the visual cortex, leading to a reduction in alpha frequency and an altered perception of visual events.
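A toy sketch of the mechanism described above (assumed, not the authors' model): two visual events count as integrated when both fall within the same alpha cycle; a concurrent sound is modeled as resetting the oscillation's phase so that stimulus onset is locked near the start of a cycle and as lowering alpha frequency, which lengthens the cycle and hence the integration window. The frequencies, jitter, and stimulus onset asynchrony are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

def integration_probability(soa, alpha_freq, phase_reset=False, n_trials=100_000):
    """P(two events separated by `soa` seconds fall in the same alpha cycle)."""
    cycle = 1.0 / alpha_freq
    if phase_reset:
        # sound resets the oscillation: onsets locked near the cycle start
        onset = np.abs(rng.normal(0.0, 0.005, n_trials))
    else:
        # no sound: stimulus onset arrives at a random phase of the cycle
        onset = rng.uniform(0.0, cycle, n_trials)
    return float(np.mean(onset + soa < cycle))

soa = 0.06  # 60 ms between the two visual events
print("visual only, 10.5 Hz alpha:", integration_probability(soa, 10.5))
print("with sound, reset + slowed to 9.5 Hz:",
      integration_probability(soa, 9.5, phase_reset=True))
```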