Audition and Language
Contributed Talk Session: Thursday, August 14, 11:00 am – 12:00 pm, Room C1.03
A Two-Dimensional Space of Linguistic Representations Shared Across Individuals
Talk 1, 11:00 am – Greta Tuckute1, Elizabeth J. Lee1, Yongtian Ou2, Evelina Fedorenko1, Kendrick Kay2; 1Massachusetts Institute of Technology, 2University of Minnesota - Twin Cities
Presenter: Greta Tuckute
Humans learn and use language in diverse ways, yet all typically developing individuals acquire at least one language and use it to communicate complex ideas. This fundamental ability raises a key question: Which dimensions of language processing are shared across brains, and how are these dimensions organized in the human cortex? To address these questions, we collected ultra-high-field (7T) fMRI data while eight participants listened to 200 linguistically diverse sentences. To identify the main components of variance in the sentence-evoked brain responses, we performed data decomposition and systematically tested which components generalize across individuals. Only two shared components emerged robustly, together accounting for about 32% of the explainable variance. Analysis of linguistic feature preferences showed that the first component corresponds to processing difficulty and the second to meaning abstractness. Both components are spatially distributed across frontal and temporal areas associated with language processing but, surprisingly, also extend into the ventral visual cortex. These findings reveal a low-dimensional, spatially structured representational basis for language processing shared across humans.
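A minimal sketch of the cross-subject generalization logic described above (not the authors' pipeline): decompose each subject's sentence-by-voxel response matrix with PCA and ask whether a component's sentence profile in a held-out subject matches the profile averaged over the remaining subjects. All array sizes and data are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_sentences, n_voxels, n_comp = 8, 200, 500, 10

# Placeholder data: in practice these would be sentence-evoked response
# estimates per voxel for each subject.
data = [rng.standard_normal((n_sentences, n_voxels)) for _ in range(n_subjects)]

def pca_scores(X, k):
    """Sentence-wise scores of the top-k principal components of X."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * S[:k]  # shape: (n_sentences, k)

scores = [pca_scores(X, n_comp) for X in data]

# Leave-one-subject-out test: correlate a component's sentence profile in the
# held-out subject with the corresponding profile averaged over the others.
# (A real analysis would also handle the sign and ordering ambiguities of PCA
# when matching components across subjects.)
for k in range(n_comp):
    rs = []
    for s in range(n_subjects):
        others = np.mean(
            [scores[o][:, k] for o in range(n_subjects) if o != s], axis=0)
        rs.append(np.corrcoef(scores[s][:, k], others)[0, 1])
    print(f"component {k}: mean cross-subject r = {np.mean(rs):.2f}")
```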
Disentangling of Spoken Words and Talker Identity in Human Auditory Cortex
Talk 2, 11:10 am – Akhil Bandreddi1, Dana Boebinger1, David Skrill1, Kirill V Nourski2, Matthew Howard, Christopher Garcia, Thomas Wychowski, Webster Pilcher, Samuel Victor Norman-Haignere3; 1University of Rochester, 2The University of Iowa, 3University of Rochester Medical Center
Presenter: Akhil Bandreddi
Complex natural sounds such as speech contain many different types of information, but recognizing these distinct information sources is computationally challenging because sounds with shared information differ widely in their acoustics. For example, variation across talkers makes it challenging to recognize the identity of a word, while variation in the acoustics of different words makes it challenging to recognize talker identity. How does the human auditory cortex disentangle word identity from talker identity such that each type of information can be coded invariant to acoustic variation in all other information sources? To address this question, we measured neural responses to a diverse set of 338 words spoken by 32 different talkers using spatiotemporally precise intracranial recordings from the human auditory cortex. We developed a simple set of model-free experimental metrics for quantifying representational disentangling of word and talker identity, both within individual electrodes as well as across different dimensions of the neural population response. We observed individual electrodes that showed a representation of words that was partially robust to acoustic variation in talker identity, but no electrodes or brain regions showed a robust representation of talker identity. However, at the population level, we observed distinct dimensions of the neural response that nearly exclusively reflected either words or talker identity and were completely invariant to acoustic variation in the non-target dimension. These results suggest that while there is partial specialization for talker-robust word identity in localized brain regions, robust disentangling is accomplished at the population level, with distinct representations of words and talker identity mapped to distinct dimensions of the neural code for speech.
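One way to picture a model-free disentangling metric of this kind (an assumed illustration, not the authors' metric): for a single electrode or population dimension, word tuning is talker-robust if the word profile estimated from one half of the talkers correlates with the profile estimated from the other half, and analogously for talker tuning across word halves. The response matrix below is a placeholder.

```python
import numpy as np

rng = np.random.default_rng(1)
n_words, n_talkers = 338, 32

# Placeholder: resp[w, t] = response of one electrode (or one population
# dimension) to word w spoken by talker t.
resp = rng.standard_normal((n_words, n_talkers))

def split_half_r(matrix, axis):
    """Correlate profiles along `axis` estimated from two disjoint halves of
    the other axis (e.g., word profiles from two halves of the talkers)."""
    other = 1 - axis
    idx = rng.permutation(matrix.shape[other])
    half1, half2 = np.array_split(idx, 2)
    p1 = matrix.take(half1, axis=other).mean(axis=other)
    p2 = matrix.take(half2, axis=other).mean(axis=other)
    return np.corrcoef(p1, p2)[0, 1]

word_robustness   = split_half_r(resp, axis=0)  # word tuning across talker halves
talker_robustness = split_half_r(resp, axis=1)  # talker tuning across word halves
print(word_robustness, talker_robustness)
```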
Decomposition of uncertainty into dispersion and strength during speech processing
Talk 3, 11:20 am – Pierre Guilleminot1, Benjamin Morillon; 1Université d'Aix-Marseille
Presenter: Pierre Guilleminot
Speech comprehension relies on contextual predictions to minimize error between expected and incoming auditory input. This mechanism hinges on linguistic uncertainty, typically quantified using Shannon entropy, which captures the overall unpredictability of a given word based on prior context. However, Shannon entropy is only one member of the broader Rényi entropy family, which enables a distinction between uncertainty due to the strength of dominant predictions and that due to the dispersion across alternatives. Here, we investigated which entropy-based measure of uncertainty best reflects neural processing during speech listening. Using intracortical recordings from subjects listening to an audiobook, we computed the mutual information between neural responses and different uncertainty measures derived from a large language model. Our results reveal that, rather than Shannon entropy, the brain separately processes the strength and dispersion of predictions in distinct neural populations. This suggests a multidimensional representation of uncertainty. More generally, these findings highlight the need for a more refined definition of uncertainty in cognitive neuroscience.
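For reference, a minimal sketch of the Rényi entropy family mentioned above, applied to a hypothetical next-word probability distribution such as one produced by a language model. The alpha -> 1 limit recovers Shannon entropy; large alpha (min-entropy) is governed by the strength of the single dominant prediction, while alpha -> 0 reflects the dispersion, i.e., how many alternatives carry any probability at all. The example distribution is illustrative only.

```python
import numpy as np

def renyi_entropy(p, alpha):
    """Rényi entropy (in bits) of a discrete distribution p for a given alpha."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    if np.isclose(alpha, 1.0):                   # Shannon limit
        return float(-np.sum(p * np.log2(p)))
    return float(np.log2(np.sum(p ** alpha)) / (1.0 - alpha))

# Example: one strong candidate word plus 50 weak alternatives.
p = np.array([0.5] + [0.01] * 50)

shannon    = renyi_entropy(p, 1.0)   # overall unpredictability
strength   = -np.log2(p.max())       # min-entropy (alpha -> infinity)
dispersion = renyi_entropy(p, 0.0)   # Hartley entropy (alpha -> 0): log2 of #alternatives
print(shannon, strength, dispersion)
```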
Pupil-linked arousal tracks belief updating in a dynamic auditory environment
Talk 4, 11:30 am – Lars Kopel1, Evi van Gastel, Peter Murphy2, Simon van Gaal, Mototaka Suzuki, Jan Willem de Gee1; 1University of Amsterdam, 2National University of Ireland, Maynooth
Presenter: Lars Kopel
Bayesian theory prescribes that highly surprising outcomes should prompt rapid belief updating. Here we developed a novel passive belief updating protocol, suitable for humans and potentially mice. We observed multiple pupil-based signatures of belief updating: (i) pupil response magnitude was enhanced for stimuli that were unexpected given the recent history, and (ii) the relationship between pupil response magnitude and Bayesian change-point probability (surprise), derived from a normative belief updating model, reflected the participant’s belief about the volatility of the environment. We conclude that phasic arousal supports belief updating when encountering unexpected incoming information.
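As a generic illustration of the change-point probability (surprise) quantity referenced above (not necessarily the authors' model), a reduced Bayesian observer for a Gaussian environment compares how likely an observation is under the current belief versus under a change of the generative mean. The hazard rate, noise levels, and outcome range below are assumed values.

```python
import numpy as np
from scipy.stats import norm

def change_point_probability(x, belief_mean, belief_sd, noise_sd,
                             hazard=0.1, outcome_range=(0.0, 1.0)):
    """Posterior probability that observation x followed a change point."""
    lo, hi = outcome_range
    p_if_change = hazard * (1.0 / (hi - lo))          # outcome uninformative after a jump
    p_if_stay = (1.0 - hazard) * norm.pdf(
        x, loc=belief_mean, scale=np.hypot(belief_sd, noise_sd))
    return p_if_change / (p_if_change + p_if_stay)

# Highly surprising outcomes yield a change-point probability near 1 and
# should drive large belief updates (and, per the abstract, larger pupil
# responses).
print(change_point_probability(0.95, belief_mean=0.3, belief_sd=0.05, noise_sd=0.05))
```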
Auditory stimuli extend the temporal window of visual integration by modulating alpha-band oscillations
Talk 5, 11:40 am – Mengting Xu1, Biao Han2, Qi Chen, Lu Shen; 1Universiteit Gent, 2South China Normal University
Presenter: Mengting Xu
In multisensory environments, how inputs from different sensory modalities interact to shape perception is not fully understood. In this study, we investigated how auditory stimuli influence the temporal dynamics of visual processing using electroencephalography (EEG). We found that the presence of auditory stimuli led to a reduction in poststimulus alpha frequency, which positively correlated with a prolonged temporal window of visual integration. This was accompanied by a diminished predictive role of prestimulus alpha frequency and an enhanced predictive role of prestimulus alpha phase in shaping perceptual outcomes. To probe the underlying mechanisms, we developed a computational model that successfully replicated the core findings and revealed that auditory input extends the temporal window of visual integration by resetting alpha oscillations in the visual cortex, leading to a reduction in alpha frequency and an altered perception of visual events.
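A toy sketch of the mechanism described above (assumed, not the authors' model): two visual events count as integrated when both fall within the same alpha cycle; a concurrent sound is modeled as resetting the oscillation's phase so that stimulus onset is locked near the start of a cycle and as lowering alpha frequency, which lengthens the cycle and hence the integration window. The frequencies, jitter, and stimulus onset asynchrony are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

def integration_probability(soa, alpha_freq, phase_reset=False, n_trials=100_000):
    """P(two events separated by `soa` seconds fall in the same alpha cycle)."""
    cycle = 1.0 / alpha_freq
    if phase_reset:
        # sound resets the oscillation: onsets locked near the cycle start
        onset = np.abs(rng.normal(0.0, 0.005, n_trials))
    else:
        # no sound: stimulus onset arrives at a random phase of the cycle
        onset = rng.uniform(0.0, cycle, n_trials)
    return float(np.mean(onset + soa < cycle))

soa = 0.06  # 60 ms between the two visual events
print("visual only, 10.5 Hz alpha:", integration_probability(soa, 10.5))
print("with sound, reset + slowed to 9.5 Hz:",
      integration_probability(soa, 9.5, phase_reset=True))
```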