PSEEDR

Quantifying Existential Risk: A Novel Framework for Cost-Effectiveness

Coverage of lessw-blog

· PSEEDR Editorial

A recent post on lessw-blog introduces a quantitative framework for evaluating the cost-effectiveness of existential risk interventions, proposing a universal future-improvement unit to bring moneyball rigor to philanthropic grantmaking.

The post argues that grantmakers evaluating interventions aimed at existential risks, such as artificial intelligence takeover, currently lack a shared quantitative yardstick, and sets out to supply one.

As philanthropic efforts increasingly target long-term, high-stakes domains like AI safety and biosecurity, measuring impact becomes substantially harder. Traditional grantmaking often struggles to quantify abstract, probabilistic outcomes, leaving a significant gap in how resources are prioritized. This matters because effective risk mitigation depends on allocating capital where it can achieve the highest marginal impact. lessw-blog's post explores these dynamics, suggesting that current grantmaking is only partially competent at moneyball, the practice of rigorously quantifying impact to find undervalued opportunities. The author argues that there is a substantial alpha opportunity in putting accurate numbers on complex, long-term outcomes to improve decision-making.

To address this measurement gap, the author proposes a universal unit of goodness called 1% future-improvement. This unit is anchored on a theoretical value scale where the expected value (EV) of the multiverse is set at 100, and the EV if the Sun were to go supernova immediately is set at 0. A 1% future-improvement represents a shift from 100 to 101 on this scale, providing a baseline to compare vastly different types of interventions.

Applying this framework to the realm of artificial intelligence, the author calculates that decreasing the probability of an AI takeover by a single percentage point (for example, reducing the risk from 40% to 39%) is worth 1.7% future-improvement. Furthermore, magically eliminating the risk of AI takeover entirely is valued at a massive 70% future-improvement. To make this actionable for funders, the proposed default unit for evaluating financial cost-effectiveness is 1% future-improvement per $5 billion. This provides a standardized metric to compare diverse donation opportunities, allowing grantmakers to convert various strategic desiderata into a single, comparable number.
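The arithmetic behind these figures can be made explicit. The following sketch is our own illustration, not code from the post; all names and the rounding step are assumptions layered on the numbers the author actually gives (a 40% takeover risk, a 70% future-improvement for eliminating it, and the $5 billion benchmark):

```python
# Illustrative sketch of the post's unit arithmetic (all names hypothetical).
# Scale: EV(multiverse) = 100, EV(immediate supernova) = 0; a shift of one
# point on this scale is a "1% future-improvement".

TAKEOVER_RISK = 40.0            # baseline AI-takeover probability, in points
ELIMINATION_VALUE = 70.0        # % future-improvement for removing the risk
DOLLARS_PER_UNIT = 5e9          # default benchmark: $5B per 1% future-improvement

# If eliminating all 40 points of risk is worth 70% future-improvement,
# then each percentage point of risk reduction is worth 70 / 40 = 1.75,
# consistent with the post's figure of roughly 1.7% per point.
value_per_point = ELIMINATION_VALUE / TAKEOVER_RISK

def benchmark_value_dollars(points_of_risk_reduced: float) -> float:
    """Dollar value at which an intervention exactly matches the
    default $5B-per-1%-future-improvement benchmark."""
    return points_of_risk_reduced * value_per_point * DOLLARS_PER_UNIT
```

Under these assumptions, reducing takeover risk from 40% to 39% is worth about 1.75% future-improvement, so an intervention achieving it would clear the benchmark if it cost less than roughly $8.75 billion.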

While the precise definitions of the multiverse's expected value and the specific assumptions underlying the $5 billion benchmark require further unpacking, the core thesis represents a significant step toward rigorous impact measurement. For funders, researchers, and strategists operating in the AI safety and existential risk space, this framework offers a provocative and potentially highly useful tool for resource allocation. By attempting to standardize the measurement of abstract future improvements, the author provides a fresh perspective on how we might prioritize global interventions.

We highly recommend reviewing the complete analysis to understand the mathematical models and assumptions driving these calculations. Read the full post to explore the mechanics of this proposed cost-effectiveness unit.

Key Takeaways

  • Current grantmaking in existential risk lacks rigorous moneyball quantification, presenting an opportunity for better impact measurement.
  • The author introduces 1% future-improvement as a universal unit of goodness, based on a scale tied to the expected value of the multiverse.
  • Decreasing the probability of an AI takeover by one percentage point is calculated to be worth 1.7% future-improvement.
  • The framework proposes a default cost-effectiveness benchmark of 1% future-improvement per $5 billion to standardize diverse donation opportunities.

Read the original post at lessw-blog
