A Critical Retrospective on OpenPhil's AI Timelines and Risk Models

A recent analysis on LessWrong challenges the historical consensus of Open Philanthropy regarding the proximity of Artificial General Intelligence and the probability of existential ruin, arguing these views may have skewed critical funding decisions.

The LessWrong post examines the historical influence of Open Philanthropy (OpenPhil) and the Effective Altruism (EA) community on the trajectory of AI safety research. Titled "Contradict my take on OpenPhil's past AI beliefs," it solicits counter-arguments to the author's critique that major philanthropic bodies relied on inaccurate forecasting models when allocating resources.

The Context: Timelines as Strategy
The strategic landscape of AI safety is heavily influenced by "timelines"—the estimated time until human-level AI is developed. For the past decade, these estimates have dictated whether capital flows toward immediate technical alignment (implying short timelines) or long-term governance and theoretical research (implying long timelines). As current AI capabilities accelerate, the community is revisiting the assumptions that guided funding during the formative years of the field.

The Core Argument
The author posits that OpenPhil and associated EA groups anchored their strategies on "bad" timelines and methodologically flawed risk assessments. Specifically, the post critiques the reliance on Ajeya Cotra's "Biological Timelines" (her biological anchors forecasting report), which, as late as 2020, placed median AGI arrival roughly 30 years in the future. The author also attacks the risk modeling in Joe Carlsmith's report, "Is Power-Seeking AI an Existential Risk?". The critique holds that Carlsmith's analysis commits the "Multiple Stage Fallacy": it splits the scenario into several stages, assigns each a conditional probability, and multiplies them, and because every additional factor below 1 shrinks the product, the result is a deceptively low (~5%) risk estimate.
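The arithmetic behind the Multiple Stage Fallacy critique can be sketched in a few lines. This is a minimal illustration under stated assumptions: the stage count and every probability below are hypothetical numbers chosen for the example, not figures taken from Carlsmith's report or from the LessWrong post.

```python
import math

def chain_probability(stage_probs):
    """Multiply conditional stage probabilities into one joint estimate."""
    return math.prod(stage_probs)

# Hypothetical inputs: six stages, each judged "more likely than not".
stages_as_estimated = [0.6] * 6

# Hypothetical alternative: the same six stages if each one had been
# underestimated and was really 0.75.
stages_if_underestimated = [0.75] * 6

joint_estimated = chain_probability(stages_as_estimated)
joint_adjusted = chain_probability(stages_if_underestimated)

print(f"joint estimate with 0.60 per stage: {joint_estimated:.3f}")
print(f"joint estimate with 0.75 per stage: {joint_adjusted:.3f}")
```

With these illustrative inputs, six individually plausible stages multiply to a joint estimate under 5%, and a moderate upward shift in each stage raises the product to nearly 18%. The sketch only shows how sensitive a multi-stage product is to per-stage error; it does not settle whether any particular report's stage estimates were too low.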

The central claim is that these specific intellectual positions were not merely academic exercises but acted as gatekeepers for funding. The author suggests that organizations like the Machine Intelligence Research Institute (MIRI), which held shorter timelines and higher risk probabilities, were marginalized based on these "official" yet potentially invalid beliefs. The post argues that while internal dissent existed, the public-facing and funding-determinative views favored longer timelines, which may have delayed necessary urgency in safety engineering.

This post is significant for observers of the AI safety ecosystem as it attempts to audit the epistemology of the field's largest funders. It raises questions about how consensus is formed in high-uncertainty domains and the consequences of those beliefs on real-world research progress.

Read the full post on LessWrong

Key Takeaways

Argues that OpenPhil's historical funding was skewed by 'bad' AI timelines that put AGI decades away.
Critiques the 'Multiple Stage Fallacy' in risk reports, specifically Joe Carlsmith's ~5% existential risk estimate.
Suggests that 'Biological Timelines' (median AGI ~2050) were used to justify withholding funding from short-timeline organizations like MIRI.
Highlights the tension between 'official' philanthropic views and internal dissent regarding AI risks.
Functions as an open request for counter-arguments, inviting the community to refute the claim that these beliefs controlled past resource allocation.
