
Poster Session C: Friday, August 15, 2:00 – 5:00 pm, de Brug & E‑Hall

A multimodal encoding model for predicting human brain responses to complex naturalistic movies

Viacheslav Fokin1, Arefeh Sherafati2; 1Moscow State School 57, 2University of California, San Francisco

Presenter: Viacheslav Fokin

Accurately modeling human brain responses to complex, multimodal sensory inputs remains a core challenge in computational neuroscience. The Algonauts Project 2025 provides a large whole-brain fMRI dataset collected during naturalistic movie viewing and challenges participants to improve multimodal encoding models. Here, we propose a biologically informed encoding model that integrates visual, auditory, and language features, extracted using established deep learning and signal processing approaches, within predefined functional connectivity (FC) networks. By applying predictive modeling within an FC-based cortical mask, we achieve a 45.39% performance gain over the full-cortex baseline. Our findings demonstrate the value of incorporating functional brain organization into encoding models and lay a foundation for future biologically grounded AI systems that integrate sensory information across domains.
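To make the approach concrete, below is a minimal sketch of how an FC-masked multimodal encoding model of this kind might be implemented. It is not the authors' pipeline: the feature dimensions, the synthetic data, the placeholder FC mask, and the choice of ridge regression are all illustrative assumptions. The general pattern (concatenate per-modality stimulus features, fit a linear encoding model only on voxels inside selected FC networks, score with voxel-wise Pearson correlation) is what the abstract describes.

```python
# Sketch of an FC-masked multimodal encoding model (illustrative, not the
# authors' exact method). Assumes precomputed per-TR stimulus features and
# a binary voxel mask derived from a functional-connectivity (FC) atlas.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-ins for per-TR features from vision, audio, and language
# models (shapes are arbitrary assumptions).
n_trs, n_voxels = 600, 5000
visual = rng.standard_normal((n_trs, 256))
audio = rng.standard_normal((n_trs, 128))
language = rng.standard_normal((n_trs, 96))
X = np.hstack([visual, audio, language])       # joint multimodal design matrix

bold = rng.standard_normal((n_trs, n_voxels))  # whole-brain fMRI responses

# FC-based cortical mask: restrict modeling to voxels inside selected FC
# networks rather than the full cortex (random placeholder mask here).
fc_mask = rng.random(n_voxels) < 0.3
y = bold[:, fc_mask]

# Chronological split to respect the time-series structure of fMRI data.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, shuffle=False)

# One ridge fit yields independent per-voxel weight vectors.
model = Ridge(alpha=10.0).fit(X_tr, y_tr)
pred = model.predict(X_te)

# Voxel-wise Pearson correlation between predicted and measured responses,
# the standard encoding-model accuracy metric.
pred_z = (pred - pred.mean(0)) / pred.std(0)
y_z = (y_te - y_te.mean(0)) / y_te.std(0)
r = (pred_z * y_z).mean(0)
print(f"median voxel-wise r within FC mask: {np.median(r):.3f}")
```

In a real pipeline the reported gain would come from comparing this masked fit against the same model evaluated over the full cortex; with the synthetic data above, correlations hover near zero by construction.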

Topic Area: Visual Processing & Computational Vision

Extended Abstract: Full Text PDF