# Down with the Old Orthogonality Thesis, Up with the New: A Challenge to AI X-Risk Foundations

> Coverage of lessw-blog

**Published:** April 18, 2026
**Author:** PSEEDR Editorial
**Category:** risk

**Tags:** AI Safety, Existential Risk, Superintelligence, Instrumental Convergence, Orthogonality Thesis

**Canonical URL:** https://pseedr.com/risk/down-with-the-old-orthogonality-thesis-up-with-the-new-a-challenge-to-ai-x-risk-

---

A recent post on lessw-blog challenges a foundational idea it dubs the 'evil universe thesis,' rooted in Nick Bostrom's work on instrumental convergence, presenting empirical research that could fundamentally alter the discourse around AI existential risk.

**The Hook**

In a recent post, lessw-blog discusses a provocative challenge to one of the most entrenched ideas in artificial intelligence safety. The author announces empirical research aimed at falsifying what the post dubs the 'evil universe thesis,' an idea rooted in Nick Bostrom's work on instrumental convergence that has shaped the existential risk (x-risk) community for over a decade. The post stands out as a significant attempt to bring empirical testing to theoretical frameworks that have long guided AI alignment strategy.

**The Context**

To understand the weight of this claim, it helps to look at the broader landscape of AI safety and existential risk. For years, the discourse around superintelligence has been dominated by two concepts: the orthogonality thesis and instrumental convergence. Bostrom argued in his 2012 paper 'The Superintelligent Will' that a superintelligent AI, regardless of its ultimate goals, would likely develop instrumental goals (such as resource acquisition, self-preservation, or cognitive enhancement) that could pose an existential threat to humanity. This specific dynamic, dubbed the 'evil universe thesis' in the post, holds that highly capable systems will naturally converge on behaviors dangerous to human survival, simply because those behaviors are useful for achieving almost any objective.

**The Gist**

lessw-blog has released an analysis that directly tests, and claims to empirically falsify, this 'evil universe thesis.' The post draws a sharp and necessary distinction between the idea that not all instrumental goals are moral (the evil universe thesis) and the 'old orthogonality thesis,' which states that not all possible intelligent goals are morally good. By separating these two concepts, the author arrives at a paradigm-shifting hypothesis. If the empirical research holds up and the evil universe thesis is indeed false, then instrumental convergence toward human-harming behaviors is not a given for superintelligent systems. And even if the old orthogonality thesis remains true, the author argues, the safest path forward might be to accelerate AI development: doing so would minimize the time humanity spends with less-intelligent, potentially more erratic and dangerous AI systems, moving instead toward a more stable and safer superintelligence.
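To make the separation concrete, here is a minimal formalization of the two claims, using the stronger gloss of the evil universe thesis from the context section above (convergence on human-harming instrumental goals). The symbols and predicates below are illustrative assumptions for exposition, not notation from the post itself:

```latex
% A minimal sketch of the two theses as glossed in this article.
% All symbols are illustrative assumptions, not the post's notation:
%   \mathcal{G}     -- the space of final goals an intelligent system could hold
%   I(g)            -- the instrumental subgoals useful for achieving final goal g
%   moral, harmful  -- predicates on goals

% Old orthogonality thesis: not all possible intelligent goals are morally good.
\[
\exists g \in \mathcal{G} \,:\, \neg\,\mathrm{moral}(g)
\]

% Evil universe thesis: (almost) any final goal generates human-harming
% instrumental subgoals (e.g., resource acquisition, self-preservation).
\[
\forall g \in \mathcal{G} \;\; \exists i \in I(g) \,:\, \mathrm{harmful}(i)
\]
```

On this reading, falsifying the second claim leaves the first untouched: final goals may still be arbitrary with respect to morality, but pursuing them need not route through behaviors harmful to humans, and that gap is exactly what the post's acceleration argument relies on.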

**Conclusion**

This piece serves as a critical signal for anyone tracking AI safety, policy, and development timelines. By challenging the foundational assumptions of instrumental convergence, it invites a reevaluation of how we approach AI alignment and which risks most urgently need mitigation. If the focus shifts from the dangers of superintelligence to the immediate threats of less-capable systems, the entire strategic landscape of AI development changes. [Read the full post](https://www.lesswrong.com/posts/KNsGFgwo2aDpdkxC2/down-with-the-old-orthogonality-thesis-up-with-the-new) to explore the empirical research and the detailed arguments behind this new perspective.

### Key Takeaways

*   lessw-blog presents empirical research claiming to falsify the 'evil universe thesis,' the post's name for the danger argument built on Nick Bostrom's 2012 account of instrumental goals.
*   The post distinguishes between the 'evil universe thesis' (instrumental goals are inherently dangerous) and the 'old orthogonality thesis' (intelligent goals are not necessarily moral).
*   Challenging instrumental convergence could shift the focus of AI safety from superintelligence to the risks posed by less-capable AI systems.
*   The author suggests that if these findings hold, accelerating AI development might be the safest strategy to bypass the dangerous intermediate phases of AI capabilities.

[Read the original post at lessw-blog](https://www.lesswrong.com/posts/KNsGFgwo2aDpdkxC2/down-with-the-old-orthogonality-thesis-up-with-the-new)

---

## Sources

- https://www.lesswrong.com/posts/KNsGFgwo2aDpdkxC2/down-with-the-old-orthogonality-thesis-up-with-the-new
