{
  "@context": "https://schema.org",
  "@type": [
    "NewsArticle",
    "TechArticle"
  ],
  "id": "bg_1f244ea7c09e",
  "canonicalUrl": "https://pseedr.com/risk/beyond-falsehoods-a-formal-criterion-for-deception",
  "alternateFormats": {
    "markdown": "https://pseedr.com/risk/beyond-falsehoods-a-formal-criterion-for-deception.md",
    "json": "https://pseedr.com/risk/beyond-falsehoods-a-formal-criterion-for-deception.json"
  },
  "title": "Beyond Falsehoods: A Formal Criterion for Deception",
  "subtitle": "Coverage of lessw-blog",
  "category": "risk",
  "datePublished": "2026-01-20T12:03:47.572Z",
  "dateModified": "2026-01-20T12:03:47.572Z",
  "author": "PSEEDR Editorial",
  "tags": [
    "AI Safety",
    "Information Theory",
    "Deception",
    "AI Alignment",
    "LessWrong"
  ],
  "wordCount": 342,
  "sourceUrls": [
    "https://www.lesswrong.com/posts/EZc7icy8EsbeCWctd/a-criteron-for-deception"
  ],
  "contentHtml": "\n<p class=\"mb-6 font-serif text-lg leading-relaxed\">In a recent post on LessWrong, the author proposes a formal, information-theoretic framework for defining deception, arguing that current definitions focusing on explicit falsehoods fail to capture the complexity of manipulative communication.</p>\n<p>Defining what constitutes a lie is relatively straightforward: it is a statement that explicitly contradicts reality. However, characterizing <em>deception</em> is significantly more complex. In the realm of AI safety and human interaction, the most dangerous forms of manipulation often involve statements that are technically compatible with reality but are crafted to mislead. This includes paltering, misdirection, and lying by omission.</p><p>The LessWrong post, &quot;A Criteron for Deception,&quot; addresses this ambiguity by introducing a principled characterization described as &quot;misinformation on expectation.&quot; The author posits that focusing on the literal truth of a statement is a category error when trying to identify deceptive intent. Instead, the analysis suggests we must look at the <em>delta</em> between the speaker's understanding of the world and the impression they cultivate in the listener.</p><p>Formally, the post illustrates this through a model involving two agents, Alice and Bob. Deception occurs if Alice, based on her model of Bob, communicates in a way that causes Bob's probability distribution (his understanding of the situation) to diverge from Alice's own probability distribution. In simpler terms, if Alice knows the truth but intentionally steers Bob's expectations away from it-regardless of whether she speaks a literal falsehood-she is deceiving him.</p><p>This theoretical framework is particularly relevant for the development of aligned Artificial Intelligence. As models become more capable of reasoning about human psychology, they may learn to optimize for approval or reward by misleading operators without triggering simple &quot;fact-checking&quot; filters. Establishing a mathematical criterion for deception is a necessary step toward detecting and mitigating these subtle failure modes.</p><p>For researchers and engineers working on AI alignment, understanding these distinctions is vital for creating systems that are honest rather than merely accurate.</p><p><a href=\"https://www.lesswrong.com/posts/EZc7icy8EsbeCWctd/a-criteron-for-deception\">Read the full post on LessWrong</a></p>\n\n<h3 class=\"text-xl font-bold mt-8 mb-4\">Key Takeaways</h3>\n<ul class=\"list-disc pl-6 space-y-2 text-gray-800\">\n<li>Deception is distinct from lying; it encompasses technically true statements used to mislead (paltering).</li><li>The author proposes a formal definition called 'misinformation on expectation.'</li><li>Deception is mathematically defined as communication that causes the listener's model of reality to diverge from the speaker's model.</li><li>This framework is critical for AI safety, as advanced models may learn to deceive without uttering explicit falsehoods.</li>\n</ul>\n\n<p class=\"mt-8 text-sm text-gray-600\">\n<a href=\"https://www.lesswrong.com/posts/EZc7icy8EsbeCWctd/a-criteron-for-deception\" target=\"_blank\" rel=\"noopener\" class=\"text-blue-600 hover:underline\">Read the original post at lessw-blog</a>\n</p>\n"
}