
Poster Session C: Friday, August 15, 2:00 – 5:00 pm, de Brug & E‑Hall

Multimodal Human Perception of Object Dimensions: Evidence from Deep Neural Networks and Large Language Models

Florian Burger1, Genevieve Quek, Manuel Varlet, Tijl Grootswagers; 1University of Western Sydney

Presenter: Florian Burger

Human object recognition relies on both perceptual and semantic dimensions. Here, we examined how deep neural networks (DNNs) and large language models (LLMs) capture and integrate human-derived dimensions of object similarity. We extracted layer activations from CORnet-S and obtained BERT embeddings for 1853 images from the THINGS dataset. We used support vector regression (SVR) to quantify the variance explained in human-derived dimensions. Results showed that multimodal integration improved predictions in early visual processing but offered limited additional benefit at later stages, suggesting that deep perceptual processing already encodes meaningful object representations.
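The analysis described above (predicting human-derived dimensions from model features with SVR, comparing unimodal and multimodal feature sets) could be sketched as follows. This is a minimal illustration with synthetic stand-in data, not the authors' pipeline: the feature matrices, the target dimension, and all sizes here are hypothetical placeholders for the CORnet-S activations, BERT embeddings, and THINGS-derived dimensions used in the study.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_images = 200  # stand-in for the 1853 THINGS images

# Hypothetical feature matrices standing in for the real model features
visual = rng.standard_normal((n_images, 64))    # stand-in for CORnet-S layer activations
semantic = rng.standard_normal((n_images, 32))  # stand-in for BERT embeddings

# Synthetic "human-derived dimension" driven by both feature sets plus noise
dimension = visual[:, 0] + semantic[:, 0] + 0.1 * rng.standard_normal(n_images)

def explained_variance(features, target):
    """Cross-validated R^2 of a support vector regressor on the given features."""
    return cross_val_score(SVR(kernel="linear"), features, target,
                           cv=5, scoring="r2").mean()

# Unimodal vs. multimodal (concatenated features) comparison
r2_visual = explained_variance(visual, dimension)
r2_multimodal = explained_variance(np.hstack([visual, semantic]), dimension)
```

In this toy setup the multimodal score exceeds the visual-only score because the synthetic target was built from both feature sets; in the study, the size of that gap across CORnet-S layers is what distinguishes early from late visual processing.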

Topic Area: Object Recognition & Visual Attention

Extended Abstract: Full Text PDF