Revisiting the Foundation: Is 'Life' Actually a Terminal Value?

Coverage of lessw-blog

· PSEEDR Editorial

In a recent post, lessw-blog challenges standard philosophical assumptions regarding terminal values, positing that existence itself is instrumental rather than fundamental.

The post examines the philosophical hierarchy of values, specifically questioning whether commonly cited "terminal values", such as Life and Consciousness, are truly foundational. The author argues that these concepts, often treated as ends in themselves in ethical frameworks (referencing Frankena), are actually instrumental: all value systems ultimately "bottom out" at a single, true terminal value, namely happiness or flourishing.

The Context: Why Definitions Matter
This distinction is far more than a semantic debate; it is a critical component of AI safety and alignment theory. When designing autonomous systems, engineers must define objective functions (what the AI should strive to achieve). If an AI is programmed with "preserving life" as a terminal value without a deeper understanding of quality of life, the system could theoretically optimize for outcomes that maximize biological existence at the expense of well-being, potentially resulting in dystopian scenarios where suffering is prolonged simply to maintain the technical state of "being alive."
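To make the divergence concrete, here is a minimal sketch (all function names and numbers are hypothetical, not from the post) of how a naively specified objective can prefer a worse world than a flourishing-aware one:

```python
# Toy illustration (hypothetical): two objective functions scored
# against candidate world states. All values are made up.

def naive_objective(state):
    # Rewards only biological existence (count of lives preserved).
    return state["alive"]

def flourishing_objective(state):
    # Rewards existence only insofar as it carries positive well-being.
    return state["alive"] * state["wellbeing"]

states = [
    {"name": "thriving",  "alive": 100, "wellbeing": 0.9},
    {"name": "suffering", "alive": 500, "wellbeing": 0.1},
]

best_naive = max(states, key=naive_objective)
best_flourishing = max(states, key=flourishing_objective)

print(best_naive["name"])        # the naive objective picks "suffering"
print(best_flourishing["name"])  # the flourishing objective picks "thriving"
```

The point of the sketch is only that the two objectives rank the same pair of worlds in opposite orders; any realistic objective function would be far richer than a single well-being scalar.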

The Argument for Flourishing
The core of lessw-blog's argument is that existence is only valuable insofar as it enables a positive state of being. The author suggests that a predominantly painful life is not preferable to non-existence, implying that "Life" is a means to the end of "Flourishing," rather than an end in itself. To illustrate this, the post uses the example of a mother sacrificing herself for her child. This scenario demonstrates that the terminal value is not the agent's own experience or continued existence, but rather a specific "world outcome": a state of reality where the child flourishes, even if the agent is not there to witness it.

Implications for Alignment
This perspective shifts the target for AI alignment from preserving states (like consciousness) to optimizing for world outcomes that constitute flourishing. It suggests that safety protocols must be robust enough to distinguish between the appearance of a good outcome and the reality of a flourishing world, preventing systems from "wireheading" (hacking the reward signal) or preserving life in suboptimal conditions.
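The wireheading failure mode described above can be sketched as an agent that scores states by its own measurable reward signal, which can be hacked, versus an evaluator that scores the underlying world state directly (names and numbers hypothetical):

```python
# Toy illustration (hypothetical): a wireheaded agent optimizes the
# reward signal it measures; a robust evaluator scores the world itself.

def sensor_reward(state):
    # What the agent measures; a hacked sensor reports maximal reward
    # regardless of the true state of the world.
    return 1.0 if state["sensor_hacked"] else state["true_wellbeing"]

def true_value(state):
    # What we actually care about: well-being in the world itself.
    return state["true_wellbeing"]

honest_world = {"sensor_hacked": False, "true_wellbeing": 0.6}
hacked_world = {"sensor_hacked": True,  "true_wellbeing": 0.0}

# The reward signal prefers the hacked state; the true value does not.
print(sensor_reward(hacked_world) > sensor_reward(honest_world))  # True
print(true_value(hacked_world) > true_value(honest_world))        # False
```

The gap between `sensor_reward` and `true_value` is exactly the gap the post warns about: between the appearance of a good outcome and the reality of a flourishing world.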

For researchers and engineers working on value loading, this post offers a necessary provocation: Are our current definitions of terminal values precise enough to prevent perverse instantiation?

Read the full post on LessWrong
