
Poster Session B: Wednesday, August 13, 1:00 – 4:00 pm, de Brug & E‑Hall

Modelling Multimodal Integration in Human Concept Processing with Vision-Language Models

Anna Bavaresco, Marianne De Heer Kloots, Sandro Pezzelle, Raquel Fernández; University of Amsterdam

Presenter: Raquel Fernández

Text representations from language models have proven remarkably predictive of human neural activity involved in language processing. However, the word representations learnt by language-only models may be limited in that they lack sensory information from other modalities. Here, we leverage recent AI advancements in multimodal modelling to investigate whether current pre-trained vision-language models (VLMs) yield concept representations that are more aligned with human brain activity than those obtained from models trained with language-only input. Our results reveal that VLM representations correlate more strongly with activations in brain areas functionally related to language processing than representations from language-only models. Altogether, our study indicates that vision-language integration better captures the nature of human concepts.
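The abstract describes correlating model-derived concept representations with brain activity. As a rough illustration of this kind of model-brain comparison, the sketch below shows a generic representational similarity analysis (RSA) in Python; the array shapes, variable names, and random placeholder data are hypothetical and are not taken from the paper, whose actual models, brain data, and analysis pipeline are not detailed here.

```python
# Illustrative sketch only: a generic representational-similarity comparison
# between model-derived concept embeddings and brain responses. All shapes,
# names, and data below are hypothetical placeholders, not the authors' setup.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

n_concepts = 50
model_embeddings = rng.standard_normal((n_concepts, 768))   # e.g. VLM concept vectors
brain_responses = rng.standard_normal((n_concepts, 1000))   # e.g. voxel patterns per concept

# Build representational dissimilarity matrices (condensed form) for each space.
model_rdm = pdist(model_embeddings, metric="correlation")
brain_rdm = pdist(brain_responses, metric="correlation")

# Spearman correlation between the two RDMs quantifies representational alignment;
# a higher value indicates the model's concept geometry better matches the brain's.
rho, p_value = spearmanr(model_rdm, brain_rdm)
print(f"Model-brain RSA correlation: rho = {rho:.3f} (p = {p_value:.3g})")
```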

Topic Area: Language & Communication

Extended Abstract: Full Text PDF