Poster Session A: Tuesday, August 12, 1:30 – 4:30 pm, de Brug & E‑Hall
Can Scene Graph Properties Explain the Neural Encoding Performance of Vision Transformers?
Helena Balabin, Rik Vandenberghe, Marie-Francine Moens; KU Leuven
Presenter: Helena Balabin
Neural encoding models allow for the exploration of hypotheses about cognitive processes by linking brain activations to representations derived from large language or image models. However, such representations often remain poorly understood, limiting the interpretability of neural encoding models. Therefore, we set out to examine the effect of scene graph properties on image model representations and on neural encoding performance with respect to functional magnetic resonance imaging (fMRI) data from the Natural Scenes Dataset (NSD). Specifically, we used the overlap between the NSD and the Visual Genome to characterize each image by the number of relationships, the number of objects, and the depth of the accompanying scene graph annotations. We found that the relationship and depth measures could be decoded more accurately than object counts, both from fMRI activations and from image embeddings, aligning with an affordance-based account of scene perception.
Topic Area: Object Recognition & Visual Attention
Extended Abstract: Full Text PDF