Dynamic Value Spaces: Moving Toward Pluralistic Alignment for Embodied AI


Modern reinforcement learning (RL) systems are typically trained to myopically optimize a single, fixed objective, even though human goals and ethical judgments are pluralistic, dynamic, and context-dependent. This mismatch between the modeled objective and reality helps explain why even well-intentioned RL systems can behave in ways that appear ethically misaligned. In this talk, I introduce an optimal-transport-based RL modeling paradigm for dynamic value spaces, which represents a value system as a probability distribution over a pluralistic value simplex. I show how such a system can track an evolving “moral compass” that drives an agent’s decision process through interactions with a human in the loop, while balancing task efficacy and ethical alignment in its recommended actions. This value-aware RL framework provides transparent value representations that enable audits of an agent’s internal value system. I conclude by discussing the implications of this approach for educational and embodied AI systems.
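To make the core representation concrete, here is a minimal toy sketch (not the talk's actual implementation): a value system is a point on the probability simplex over a few hypothetical named value axes, drift between the agent's and a human's value distributions is measured with a 1-D Wasserstein (optimal-transport) distance under an assumed linear ordering of the axes, and human-in-the-loop feedback nudges the agent's distribution toward the observed one. All names and parameters here are illustrative assumptions.

```python
import numpy as np

# Hypothetical value axes; the real value simplex in the talk may differ.
VALUES = ["safety", "autonomy", "fairness", "efficiency"]

def normalize(w):
    """Project raw nonnegative weights onto the probability simplex."""
    w = np.asarray(w, dtype=float)
    return w / w.sum()

def wasserstein_1d(p, q):
    """1-D earth-mover distance between two simplex points,
    computed as the sum of absolute CDF gaps (assumes the value
    axes are linearly ordered, a simplification for this sketch)."""
    return float(np.abs(np.cumsum(p) - np.cumsum(q)).sum())

def update_values(current, observed, step=0.2):
    """Nudge the tracked value distribution toward the one inferred
    from human feedback; a simple mixture interpolation stands in
    for a full optimal-transport displacement update."""
    return normalize((1 - step) * current + step * observed)

agent = normalize([0.4, 0.3, 0.2, 0.1])   # agent's current value weights
human = normalize([0.1, 0.2, 0.3, 0.4])   # weights inferred from feedback
print("drift before update:", round(wasserstein_1d(agent, human), 3))  # 1.0
agent = update_values(agent, human)
print("drift after update:", round(wasserstein_1d(agent, human), 3))   # 0.8
```

Because the distribution itself is the value representation, an auditor can inspect `agent` directly at any point, which is the transparency property the abstract emphasizes.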