Poster Presentation

Poster C35 in Poster Session C: Friday, August 15, 2:00 – 5:00 pm, de Brug & E‑Hall

Encoding Brain Regions with Sentiment-Relevant Circuits in LLMs

Nursulu Sagimbayeva¹, Dota Tianai Dong²; ¹Universität des Saarlandes, ²Max Planck Institute for Psycholinguistics

Presenter: Nursulu Sagimbayeva

Large language models (LLMs) generate representations that effectively predict brain responses to natural language, yet the specific circuits within LLMs that drive this alignment remain largely unexplored. Here, we apply techniques from mechanistic interpretability (MI) to identify LLM circuits (i.e., attention heads) that are causally relevant to sentiment processing, and we assess their impact on LLM–brain alignment. Removing sentiment-related attention heads reduces alignment with language-processing brain regions more than removing random heads does, although this difference does not reach statistical significance. Ongoing work aims to improve LLM circuit identification in naturalistic settings, enabling more precise mapping of circuits to plausible brain mechanisms and ultimately providing deeper insight into LLM–brain alignment.
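
The comparison between targeted and random head removal can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names `alignment_drop` and `empirical_p_value`, the one-sided empirical test, and all numbers are assumptions made for the example. The abstract does not specify the alignment metric or the statistical test that was used.

```python
from statistics import mean


def alignment_drop(baseline_score, ablated_score):
    """Decrease in an alignment score after ablation (positive = worse alignment)."""
    return baseline_score - ablated_score


def empirical_p_value(targeted_drop, random_drops):
    """One-sided empirical p-value (hypothetical choice of test).

    Share of random-head ablations whose drop is at least as large as the
    targeted drop, with +1 smoothing so the estimate is never exactly zero.
    """
    at_least_as_large = sum(1 for d in random_drops if d >= targeted_drop)
    return (at_least_as_large + 1) / (len(random_drops) + 1)


# Toy, made-up scores purely for illustration (not results from the poster).
baseline = 0.30                                # alignment of the intact model
targeted = alignment_drop(baseline, 0.24)      # after removing sentiment heads
random_drops = [alignment_drop(baseline, s)    # after removing random heads
                for s in (0.27, 0.25, 0.23, 0.28)]

p = empirical_p_value(targeted, random_drops)
larger_than_average = targeted > mean(random_drops)
```

With these toy values the targeted drop exceeds the mean random drop, yet one of the four random ablations produces a larger drop, so the empirical p-value is 0.4. This mirrors the qualitative pattern described above: a larger decrease that is not statistically significant.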

Topic Area: Language & Communication

Extended Abstract: Full Text PDF