Poster Session C: Friday, August 15, 2:00 – 5:00 pm, de Brug & E‑Hall
Emergent Reciprocity Through Temporal Credit Assignment in Reinforcement Learning Agents
Le Thuy Duong Nguyen¹, Dane Malenfant², Blake Aaron Richards³; ¹Mila - Quebec Artificial Intelligence Institute, ²McGill University, ³Google
Presenter: Le Thuy Duong Nguyen
Humans and animals have complex social and cultural systems that can span large distances and long periods of time. In North America, Plains Indigenous nations practiced reciprocity through Manitokan, effigies placed at fixed locations, where leaving surplus goods was seen as a cooperative act of resource sharing or caching. In multi-agent reinforcement learning (MARL), reciprocity has largely been treated as an emergent property arising from tit-for-tat policies or from reputation scores that establish social norms. These perspectives fail to consider the temporal structure of rewards and the criticality of certain actions for success. We present a novel MARL environment for investigating the emergence of reciprocal prosocial behaviours in reinforcement learning agents. In baseline experiments, agents consistently converged to suboptimal policies favouring individual resource maximization, despite the potential for improved collective outcomes. These findings highlight a critical gap in existing MARL methods and suggest the need for new algorithms capable of supporting temporal credit assignment in artificial agents.
Topic Area: Reward, Value & Social Decision Making
Extended Abstract: Full Text PDF