
Poster Session C: Friday, August 15, 2:00 – 5:00 pm, de Brug & E‑Hall

Multimodal Human Perception of Object Dimensions: Evidence from Deep Neural Networks and Large Language Models

Florian Burger1, Genevieve Quek, Manuel Varlet, Tijl Grootswagers; 1University of Western Sydney

Presenter: Florian Burger

Human object recognition relies on both perceptual and semantic dimensions. Here, we examined how deep neural networks (DNNs) and large language models (LLMs) capture and integrate human-derived dimensions of object similarity. We extracted layer activations from CORnet-S and obtained BERT embeddings for 1853 images from the THINGS dataset. We used support vector regression (SVR) to quantify the variance explained in human-derived dimensions. Results showed that multimodal integration improved predictions in early visual processing but offered limited additional benefit at later stages, suggesting that deep perceptual processing already encodes meaningful object representations.
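The analysis described above (predicting human-derived dimensions from model features with SVR, comparing unimodal and multimodal feature sets) could be sketched as follows. This is a minimal illustration with synthetic stand-in data, not the authors' pipeline: the feature matrices, the target dimension, and all sizes here are hypothetical placeholders for the CORnet-S activations, BERT embeddings, and THINGS-derived dimensions used in the study.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_images = 200  # stand-in for the 1853 THINGS images

# Hypothetical feature matrices standing in for the real model features
visual = rng.standard_normal((n_images, 64))    # stand-in for CORnet-S layer activations
semantic = rng.standard_normal((n_images, 32))  # stand-in for BERT embeddings

# Synthetic "human-derived dimension" driven by both feature sets plus noise
dimension = visual[:, 0] + semantic[:, 0] + 0.1 * rng.standard_normal(n_images)

def explained_variance(features, target):
    """Cross-validated R^2 of a support vector regressor on the given features."""
    return cross_val_score(SVR(kernel="linear"), features, target,
                           cv=5, scoring="r2").mean()

# Unimodal vs. multimodal (concatenated features) comparison
r2_visual = explained_variance(visual, dimension)
r2_multimodal = explained_variance(np.hstack([visual, semantic]), dimension)
```

In this toy setup the multimodal score exceeds the visual-only score because the synthetic target was built from both feature sets; in the study, the size of that gap across CORnet-S layers is what distinguishes early from late visual processing.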

Topic Area: Object Recognition & Visual Attention

Extended Abstract: Full Text PDF