Poster Session C: Friday, August 15, 2:00 – 5:00 pm, de Brug & E‑Hall

Quantifying infants’ everyday experiences with objects in a large corpus of egocentric videos

Jane Yang1, Tarun Sepuri1, Alvin Wei Ming Tan2, Michael Frank2, Bria Lorelle Long1; 1University of California, San Diego, 2Stanford University

Presenter: Jane Yang

While modern vision-language models are typically trained on millions of curated photographs, infants learn visual categories and the words that refer to them from very different training data. Here, we investigate which objects infants actually encounter in their everyday environments, and how often they encounter them. We use a large corpus of egocentric videos taken from the infant perspective (868 hours of video from N = 31 participants), applying and validating a recent object detection model (YOLOE) to detect a set of categories that are frequently named in children's early vocabulary. We find that infants' visual experience is dominated by a small set of objects, with differences in individual children's home environments driving variability. We also find that young children tend to learn words earlier for more frequently encountered categories. These results suggest that visual experience scaffolds young children's early category and language learning, and they highlight that ecologically valid computational models of category learning must be able to accommodate skewed input distributions.
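The core frequency-counting step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the per-frame detection lists and category labels are hypothetical placeholders standing in for real YOLOE output over video frames.

```python
from collections import Counter


def category_frequencies(frame_detections):
    """Aggregate per-frame detected category labels into a relative-frequency
    distribution over categories, ordered most frequent first."""
    counts = Counter()
    for labels in frame_detections:
        counts.update(labels)
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.most_common()}


# Toy detections for a handful of egocentric frames (hypothetical labels,
# standing in for detector output on sampled video frames).
frames = [
    ["hand", "bottle"],
    ["hand", "toy"],
    ["hand"],
    ["bottle", "hand"],
    ["book"],
]

freqs = category_frequencies(frames)
print(freqs)  # a skewed distribution dominated by a few categories
```

Even this toy input yields the kind of skewed distribution the abstract describes: a small set of categories ("hand" here) accounts for most detections, with a long tail of rarely seen objects.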

Topic Area: Object Recognition & Visual Attention

Extended Abstract: Full Text PDF