arXiv:2505.17024v1
Abstract: AI alignment is a field of research that aims to develop methods to ensure that agents always behave in a manner aligned with (i.e. consistently with) the goals and values of their human operators, no matter their level of capability. This paper proposes an affectivist approach to the alignment problem, re-framing the concepts of goals and values in terms of affective taxis, and explaining the emergence of affective valence by appealing to recent work in evolutionary-developmental and computational neuroscience. We review the state of the art and, building on this work, we propose a computational model of affect based on taxis navigation. We discuss evidence in a tractable model organism that our model reflects aspects of biological taxis navigation. We conclude with a discussion of the role of affective taxis in AI alignment.

Expert Commentary: The Affectivist Approach to AI Alignment

AI alignment has become a central topic in artificial intelligence (AI) research in recent years. As the abstract puts it, the goal is to ensure that agents always behave consistently with the goals and values of their human operators, whatever their level of capability. This matters for keeping AI systems under control and for using the technology safely and ethically.

This paper proposes what its authors call an affectivist approach to the AI alignment problem. It reframes the concepts of goals and values in terms of affective taxis. In biology, taxis is the directed movement of an organism toward or away from a stimulus, and affective valence is the positive or negative character of an experience. On this reading, an agent's goals and values show up in what it approaches and what it avoids.

Multidisciplinary Insights

What sets the affectivist approach apart is its interdisciplinary grounding. The authors explain the emergence of affective valence by appealing to recent work in evolutionary-developmental and computational neuroscience. Building on a review of the state of the art, they propose a computational model of affect based on taxis navigation.

The authors also discuss evidence from a tractable model organism that their model reflects aspects of biological taxis navigation. This biological grounding lends the computational model some support, and it points to further research at the intersection of neuroscience and artificial intelligence.
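
To make the idea of taxis navigation concrete, here is a minimal Python sketch. It is not the authors' model, which the abstract does not specify. It assumes a one-dimensional stimulus field, and the names `concentration` and `taxis_walk` are hypothetical. The agent treats the change in stimulus between steps as a crude valence signal: it keeps its heading while things improve and reverses when they get worse.

```python
def concentration(x, source=10.0):
    """Hypothetical stimulus field: strongest at `source`, falling off linearly."""
    return -abs(x - source)


def taxis_walk(start, steps=50, step_size=0.5, source=10.0):
    """Move along a line, reversing direction whenever the stimulus gets weaker.

    Returns the final position and the list of per-step valence values,
    where valence is the change in stimulus since the previous step.
    """
    x = start
    heading = -1.0  # assumption: the agent starts out facing away from the source
    previous = concentration(x, source)
    valences = []
    for _ in range(steps):
        x += heading * step_size
        current = concentration(x, source)
        valence = current - previous  # positive: improving, negative: worsening
        valences.append(valence)
        if valence < 0:
            heading = -heading  # negative valence triggers a turn
        previous = current
    return x, valences


if __name__ == "__main__":
    final_x, valences = taxis_walk(start=0.0)
    print(f"final position: {final_x}")
    print(f"first two valence values: {valences[:2]}")
```

Even this toy agent reaches the source and then hovers around it without ever representing a goal explicitly. The approach and avoidance behavior comes entirely from the sign of the valence signal.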

Implications for AI Alignment

The paper concludes with a discussion of the role of affective taxis in AI alignment. If goals and values can be described in terms of affective taxis, then understanding how affective valence arises and steers behavior becomes part of the work of aligning AI systems with human values.

Overall, the affectivist approach offers a novel framework for the AI alignment problem, one that draws on evolutionary-developmental and computational neuroscience to address the challenge of aligning AI with human intentions. As research in this area develops, work that crosses these disciplines is likely to remain relevant.
