Poster Session B: Wednesday, August 13, 1:00 – 4:00 pm, de Brug & E‑Hall
Evaluating view-invariant place recognition in humans and machines
Nathan Kong1, tyler bonnen2, Russell Epstein1; 1University of Pennsylvania, 2Electrical Engineering & Computer Science Department, University of California, Berkeley
Presenter: Nathan Kong
We are able to perceive spatial structure in the world around us. This ability supports a range of downstream behaviours---from navigation to memory retrieval---and is thought to rely on a network of 'scene-selective' cortical structures. Feedforward deep learning models are often thought to provide a suitable approximation of these perceptual abilities. This human-model correspondence, however, has largely been evaluated in classification tasks. Here we develop a novel behavioural assay which reveals a profound gap between these vision models and human abilities. We collect a corpus of naturalistic scenes (panorama captures from Google Maps) and format these environments into 'oddity' tasks: participants are presented with two different viewpoints from one location (A), alongside an image from a different location (B), and must identify the odd-one-out (B). Critically, we manipulate the angular difference between views and the perceptual similarity between environments. Through a series of experiments, we find that humans substantially outperform models on this benchmark, and that human reaction times scale linearly with task difficulty. These data highlight the temporal dynamics of place recognition, challenging common assumptions about the feedforward underpinnings of this foundational human ability.
Topic Area: Memory, Spatial Cognition & Skill Learning
Extended Abstract: Full Text PDF