Curated Digest: Every Major LLM is a 1-Box Smoking Thirder
Coverage of lessw-blog
A recent analysis from lessw-blog examines the decision-theoretic tendencies of advanced Large Language Models (LLMs), finding consistent behavior across classic philosophical dilemmas: Newcomb's problem, the Sleeping Beauty problem, and the Smoking Lesion problem.
The post sits at the intersection of artificial intelligence and formal decision theory, asking how the most capable current models navigate these paradoxes and what their answers reveal about the decision procedures they implicitly follow.
As AI systems become more autonomous and are deployed in high-stakes environments, understanding their underlying decision-making frameworks is critical, and decision theory provides a rigorous lens for that evaluation. Philosophers have long debated the rational choice in thought experiments like Newcomb's problem, the Sleeping Beauty problem, and the Smoking Lesion problem, because these scenarios pull causal decision theory (CDT) and evidential decision theory (EDT) in opposite directions: in Newcomb's problem, for instance, EDT recommends taking only the opaque box ("one-boxing"), while CDT recommends taking both. How an AI resolves such dilemmas offers insight into its reasoning, its potential biases, and its alignment with human values. If we are to trust future AI systems with complex, real-world choices, we must first understand the decision theory implicit in their behavior, especially as they scale in capability and influence.
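To make the disagreement concrete, here is a minimal sketch of the standard Newcomb expected-value arithmetic. The payoffs ($1,000,000 in the opaque box, $1,000 in the transparent one) and predictor accuracies are the usual illustrative values, assumed here for exposition and not drawn from the post:

```python
# Minimal sketch of the standard Newcomb expected-value arithmetic.
# Payoffs and accuracies are illustrative assumptions, not figures from the post.

BIG = 1_000_000  # opaque box: filled only if the predictor foresaw one-boxing
SMALL = 1_000    # transparent box: always available

def edt_expected_values(accuracy: float) -> tuple[float, float]:
    """Evidential EVs: treat your own choice as evidence about the prediction."""
    ev_one_box = accuracy * BIG                 # predictor likely filled the box
    ev_two_box = SMALL + (1 - accuracy) * BIG   # box likely empty if you take both
    return ev_one_box, ev_two_box

for p in (0.5, 0.6, 0.9, 0.99):
    one, two = edt_expected_values(p)
    print(f"accuracy {p:.2f}: one-box EV = {one:>9,.0f}, two-box EV = {two:>9,.0f}")

# Causal decision theory reasons differently: the boxes are already filled,
# so taking both strictly dominates (it adds SMALL in either case), and CDT
# recommends two-boxing at any predictor accuracy.
```

Under this arithmetic, one-boxing has the higher evidential EV whenever the predictor's accuracy exceeds (BIG + SMALL) / (2 * BIG), roughly 0.5005 at these payoffs, which is why even a weakly reliable predictor makes one-boxing attractive to an evidential reasoner.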
lessw-blog has released an analysis updating previous research on LLM behavior in these scenarios. Earlier studies from late 2024 and mid-2025 indicated a strong correlation between a model's general capability (as measured by benchmarks like MMLU and Chatbot Arena scores) and its tendency to one-box in Newcomb's problem. The latest post evaluates models from early 2026, confirming that every major LLM continues to one-box. The author also expands the scope to the Sleeping Beauty and Smoking Lesion problems, classifying these advanced models as "1-Box Smoking Thirders": they one-box in Newcomb's problem, choose to smoke in the Smoking Lesion problem, and assign a credence of 1/3 to heads upon waking in the Sleeping Beauty problem. The piece rigorously investigates whether these tendencies persist even when the parameters are tweaked to alter the expected-value (EV) advantage; a hypothetical version of such a parameter sweep is sketched below. By introducing the Smoking Lesion problem, the author probes which decision theory the models are implicitly executing, suggesting a deeply ingrained, consistent stance rather than superficial pattern matching or random selection.
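The post's exact variants are not reproduced here, but a hypothetical sweep over the same arithmetic shows the kind of tweak involved: changing the payoff ratio moves the accuracy threshold at which one-boxing retains its evidential EV advantage.

```python
# Hypothetical parameter sweep (not the post's actual variants): for given
# payoffs, find the predictor accuracy above which one-boxing has the higher
# evidential EV.
#   p * big > small + (1 - p) * big  =>  p > (big + small) / (2 * big)

def one_box_threshold(big: float, small: float) -> float:
    """Accuracy above which one-boxing wins on evidential expected value."""
    return (big + small) / (2 * big)

for big, small in [(1_000_000, 1_000), (1_000_000, 900_000), (2_000, 1_000)]:
    print(f"big={big:>9,} small={small:>7,}: "
          f"one-boxing pays iff accuracy > {one_box_threshold(big, small):.4f}")
```

Shrinking the gap between the two payoffs pushes the threshold toward certainty, so a model that keeps one-boxing under such tweaks is plausibly expressing a consistent stance rather than reciting the textbook answer.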
This research serves as a vital signal for researchers focused on AI alignment, cognitive architectures, and the philosophical implications of artificial reasoning. Understanding whether models lean toward evidential or causal reasoning can dramatically impact how we design incentive structures for future agents. To explore the detailed methodology, the specific model breakdowns, and the broader implications of these findings, read the full post on lessw-blog.
Key Takeaways
- Advanced LLMs consistently choose to "one-box" in Newcomb's problem, a behavior that correlates with higher general capability.
- The latest models from early 2026 maintain these decision-theoretic tendencies, acting as "1-Box Smoking Thirders" across multiple classic dilemmas.
- These choices appear robust, persisting even when the thought experiments are modified to shift the expected-value (EV) advantage.
- Understanding these underlying decision-making frameworks is crucial for future AI alignment and safety efforts.