Poster Session C: Friday, August 15, 2:00 – 5:00 pm, de Brug & E‑Hall
Multimodal Human Perception of Object Dimensions: Evidence from Deep Neural Networks and Large Language Models
Florian Burger¹, Genevieve Quek, Manuel Varlet, Tijl Grootswagers; ¹University of Western Sydney
Presenter: Florian Burger
Human object recognition relies on both perceptual and semantic dimensions. Here, we examined how deep neural networks (DNNs) and large language models (LLMs) capture and integrate human-derived dimensions of object similarity. We extracted layer activations from CORnet-S and obtained BERT embeddings for 1853 images from the THINGS dataset. We used support vector regression (SVR) to quantify the explained variance in human-derived dimensions. Results showed that multimodal integration improved predictions in early visual processing but offered limited additional benefit at later stages, suggesting that deep perceptual processing already encodes meaningful object representations.
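The analysis described above, regressing model features onto human-derived dimensions with SVR and comparing unimodal against multimodal feature sets, can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the feature matrices and dimension scores below are random stand-ins for the CORnet-S activations, BERT embeddings, and THINGS dimensions used in the study, and the cross-validation setup is an assumption.

```python
# Hedged sketch: predicting a human-derived object dimension from model
# features with support vector regression (SVR), as described in the
# abstract. All data here are random stand-ins, NOT THINGS / CORnet-S /
# BERT data; shapes and CV settings are illustrative assumptions.
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_images = 200  # the study used 1853 THINGS images

visual_feats = rng.normal(size=(n_images, 50))    # stand-in: DNN layer activations
semantic_feats = rng.normal(size=(n_images, 50))  # stand-in: LLM embeddings
human_dim = rng.normal(size=n_images)             # stand-in: one human-derived dimension

def explained_variance(features, target):
    """Cross-validated R^2 of an SVR predicting the target dimension."""
    model = make_pipeline(StandardScaler(), SVR(kernel="linear"))
    return cross_val_score(model, features, target, cv=5, scoring="r2").mean()

# Multimodal integration sketched as feature concatenation.
multimodal = np.hstack([visual_feats, semantic_feats])
for name, X in [("visual", visual_feats),
                ("semantic", semantic_feats),
                ("multimodal", multimodal)]:
    print(f"{name}: cross-validated R^2 = {explained_variance(X, human_dim):.3f}")
```

With real features, comparing the multimodal score against each unimodal score, layer by layer, would indicate where integration adds predictive power, which is the contrast the abstract reports.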
Topic Area: Object Recognition & Visual Attention
Extended Abstract: Full Text PDF