
Poster Session A: Tuesday, August 12, 1:30 – 4:30 pm, de Brug & E‑Hall

Human-like compositional visual inference through neural diffusion on syntax trees

Sylvia Blackmore¹, Sarah Feng, Steve Chang, Ilker Yildirim¹; ¹Yale University

Presenter: Sylvia Blackmore

Compositionality—the ability to decompose experiences into constituent parts and flexibly recombine them—is fundamental to human intelligence. Despite the vast combinatorial space created by even basic elements, humans efficiently navigate potential configurations during visual inference. We present a neuro-symbolic approach that frames compositional visual inference as inverse graphics through guided program synthesis, implemented as neural diffusion on syntax trees. Our model represents images as programs, using a conditional neural network and a value model to enable efficient beam search through program space. Validated against human behavioral data, the model achieved human-like performance across trial types. This framework provides a computational account of visual inference as search through a compositional state space.
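
To make the search procedure concrete, the following is a minimal sketch (not the authors' code) of value-guided beam search over toy syntax-tree programs. The grammar of shape primitives and combinators, the hand-written value function standing in for a learned value model, and all names below are illustrative assumptions.

```python
# Minimal sketch: value-guided beam search over toy syntax trees.
# The grammar, the leaf-matching objective, and the hand-written
# value() function below are illustrative assumptions, not the
# authors' model.
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Toy grammar: a program is either a primitive shape or a binary combination.
PRIMITIVES = ["circle", "square", "triangle"]
COMBINATORS = ["beside", "above"]

@dataclass(frozen=True)
class Node:
    op: str                       # primitive name or combinator
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def leaves(self) -> List[str]:
        if self.left is None:
            return [self.op]
        return self.left.leaves() + self.right.leaves()

def expand(tree: Optional[Node]) -> List[Node]:
    """Enumerate one-step extensions of a (possibly empty) partial program."""
    if tree is None:
        return [Node(p) for p in PRIMITIVES]
    return [Node(c, tree, Node(p)) for c in COMBINATORS for p in PRIMITIVES]

def value(tree: Node, target: List[str]) -> float:
    """Stand-in for a learned value model: score how well the program's
    leaves match a target sequence of primitives (prefix overlap,
    penalized for length mismatch)."""
    leaves = tree.leaves()
    overlap = sum(a == b for a, b in zip(leaves, target))
    return overlap - 0.1 * abs(len(leaves) - len(target))

def beam_search(target: List[str], beam_width: int = 3, steps: int = 4) -> Node:
    """Keep the top-scoring partial programs at each step, guided by value()."""
    beam: List[Tuple[float, Optional[Node]]] = [(0.0, None)]
    best: Tuple[float, Optional[Node]] = (float("-inf"), None)
    for _ in range(steps):
        candidates = []
        for _, tree in beam:
            for child in expand(tree):
                candidates.append((value(child, target), child))
        candidates.sort(key=lambda sc: sc[0], reverse=True)
        beam = candidates[:beam_width]
        if beam and beam[0][0] > best[0]:
            best = beam[0]
    return best[1]

if __name__ == "__main__":
    program = beam_search(target=["circle", "square", "triangle"])
    print(program.leaves())   # e.g. ['circle', 'square', 'triangle']
```

In the paper's setting, the value model would instead be a neural network conditioned on the target image, scoring partial programs so that the beam concentrates on configurations likely to explain the scene.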

Topic Area: Visual Processing & Computational Vision

Extended Abstract: Full Text PDF