Is Chain-of-Thought Just Fancy Retrieval? Mechanistic Analysis of Llama-2

Coverage of lessw-blog

PSEEDR Editorial

A recent LessWrong post challenges the assumption that Chain-of-Thought prompting induces logical reasoning, proposing instead that it functions as a contextual stabilization mechanism.

In a recent post, lessw-blog presents a mechanistic analysis of Llama-2-7b, arguing that Chain-of-Thought (CoT) prompting may not function as a reasoning engine, but rather as a mechanism for "Contextual Stabilization" and "Associative Retrieval."

The AI community generally accepts that asking an LLM to "think step-by-step" improves performance on complex tasks. The standard interpretation is that this mimics human cognitive processes, breaking down problems into logical intermediate steps. However, the internal mechanics of why this works, and whether it constitutes true reasoning, remain a subject of intense debate.

The author posits that Llama-2-7b operates primarily as a "Bag-of-Facts" retriever. The intermediate tokens generated during CoT do not necessarily represent a causal chain of logic. Instead, they serve to stabilize the model's attention on relevant semantic entities. By generating related text, the model reinforces the context, effectively narrowing down the probability distribution for the final answer through pattern matching rather than logical deduction.
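
To make this claim concrete, here is a minimal sketch of the kind of comparison the post's framing suggests: measure how much the next-token distribution narrows (drops in entropy) when the question is preceded by CoT-style related text. The model (a small public stand-in for Llama-2-7b), the prompts, and the entropy readout are illustrative assumptions, not the author's exact experiment.

```python
# Minimal sketch: does CoT-style related text narrow the next-token distribution?
# Assumptions: "gpt2" as a runnable stand-in for Llama-2-7b; toy prompts.
import torch
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")  # stand-in; the post analyzes Llama-2-7b

def next_token_entropy(prompt: str) -> float:
    """Entropy (nats) of the next-token distribution after `prompt`."""
    with torch.no_grad():
        logits = model(model.to_tokens(prompt))      # [batch, pos, d_vocab]
    probs = torch.softmax(logits[0, -1], dim=-1)
    return -(probs * probs.clamp_min(1e-12).log()).sum().item()

bare = "Q: What is the capital of France? A:"
cot = ("Q: What is the capital of France? "
       "Let's think step by step. France is a country in Europe. "
       "Its largest city and seat of government is Paris. A:")

print("bare prompt entropy:     ", round(next_token_entropy(bare), 3))
print("CoT-style prompt entropy:", round(next_token_entropy(cot), 3))
```

A lower entropy for the second prompt would be consistent with the "contextual stabilization" picture: the extra related tokens concentrate probability mass on the answer without any step actually deriving it.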

To support this, the post details experiments using the transformer_lens library, employing techniques like logit tracing, path patching, and entropy analysis. A critical observation highlighted is the "early knowledge" phenomenon, where the model's internal state reflects the correct answer long before the logical steps are completed. Furthermore, the analysis suggests the process is often permutation invariant; the specific order of the "reasoning" steps matters less than the presence of specific keywords. This behavior contradicts the sequential, dependency-based nature of formal logic, where Step B must follow Step A.
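
As an illustration of what an "early knowledge" probe can look like with transformer_lens, the sketch below applies a logit-lens-style readout to the residual stream after every layer and tracks the rank of a candidate answer token at the final position. The model, prompt, and answer token are illustrative assumptions; the post's actual probes and prompts are not reproduced here, and analyzing Llama-2-7b itself would require local weights.

```python
# Minimal "early knowledge" sketch: project the residual stream after each layer
# through the unembedding (logit lens) and see how early the answer token ranks highly.
# Assumptions: "gpt2" as a runnable stand-in; toy prompt and answer token.
import torch
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")  # stand-in; the post analyzes Llama-2-7b

prompt = "Q: The Eiffel Tower is located in which country? Let's think step by step."
# Take the first sub-token of the answer string (robust if it spans several tokens).
answer_id = model.to_tokens(" France", prepend_bos=False)[0, 0].item()

with torch.no_grad():
    _, cache = model.run_with_cache(model.to_tokens(prompt))

# Accumulated residual stream after each layer, with final layer norm applied.
resid_stack, labels = cache.accumulated_resid(apply_ln=True, return_labels=True)

for label, resid in zip(labels, resid_stack):
    layer_logits = model.unembed(resid[:, -1:, :])[0, -1]   # logits at final position
    rank = (layer_logits > layer_logits[answer_id]).sum().item()  # 0 = top-1
    print(f"{label:>12}: answer token rank = {rank}")
```

If the answer token already ranks near the top in middle layers, before any "reasoning" text has been generated, that is the signature the post describes: the final answer is latent in the representation, and the CoT tokens stabilize rather than derive it.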

This analysis is significant because it questions the utility of current benchmarks in measuring "intelligence." If models are merely stabilizing retrieval paths rather than reasoning, our understanding of progress toward robust AI may be inflated by sophisticated pattern matching.

Read the original post at lessw-blog
