Poster Session A: Tuesday, August 12, 1:30 – 4:30 pm, de Brug & E‑Hall
Metric-Learning Encoding Models Identify Processing Profiles of Linguistic Features in BERT’s Representations
Louis Jalouzot1, Robin Sobczyk2, Bastien Lhopitallier3, Jeanne Salle4, Nur Lan5, Emmanuel Chemla6, Yair Lakretz7; 1École Normale Supérieure de Lyon, 2École Normale Supérieure – PSL, 3École Normale Supérieure, 4MATS, 5École Normale Supérieure, 6Earth Species Project, 7École Normale Supérieure de Paris
Presenter: Robin Sobczyk
We introduce Metric-Learning Encoding Models (MLEMs), a new framework that learns a feature-based metric explaining the geometry of neural representations. Applying MLEMs to BERT, we track linguistic features (e.g., tense, subject number) and find that each has a distinct importance profile across layers. Within a given layer, the ranking of feature importances corresponds to a hierarchical geometry of the representations. A univariate variant of the model reveals striking spontaneous disentanglement: in all layers, distinct groups of neurons specialize in encoding single, specific linguistic features. MLEMs are also more robust than popular decoding methods, making them a powerful tool for analyzing representations in artificial and biological neural systems.
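To make the core idea concrete, here is a minimal sketch of a metric-learning encoding model, under the assumption that it can be reduced to fitting non-negative per-feature weights so that a weighted metric over feature mismatches approximates pairwise distances between representations. The synthetic data, feature set, and non-negative least-squares fit are illustrative assumptions, not the authors' implementation.

```python
# Minimal MLEM-style sketch (assumed simplification, not the authors' code):
# fit non-negative per-feature weights so a weighted metric over feature
# mismatches approximates pairwise distances between representations.
# Synthetic data below stands in for real BERT hidden states.
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)

# Stimuli described by binary linguistic features (e.g., tense, subject number).
n_stimuli, n_features = 60, 4
F = rng.integers(0, 2, size=(n_stimuli, n_features))

# Placeholder representations: each feature shifts activity along its own
# direction, with a feature-specific scale playing the role of ground truth.
true_scale = np.array([3.0, 1.5, 0.5, 0.1])
directions = rng.standard_normal((n_features, 128))
X = (F * true_scale) @ directions + 0.05 * rng.standard_normal((n_stimuli, 128))

# Targets: squared pairwise distances between representations.
i, j = np.triu_indices(n_stimuli, k=1)
sq_dist = np.sum((X[i] - X[j]) ** 2, axis=1)

# Predictors: one mismatch indicator per linguistic feature for each pair.
mismatch = np.abs(F[i] - F[j]).astype(float)

# Non-negative least squares yields one weight per feature; their ranking
# is the layer's feature-importance profile in this toy setting.
weights, _ = nnls(mismatch, sq_dist)
print("feature importances:", np.round(np.sqrt(weights), 2))
```

Fitting squared distances keeps the model linear in the mismatch indicators; with roughly orthogonal feature directions, the square roots of the recovered weights are proportional to each feature's contribution to the representational geometry.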
Topic Area: Language & Communication
Extended Abstract: Full Text PDF