Anthropic Solicits Critical Feedback as Opus 4.5 Struggles with 'Context Amnesia' and Coding Regressions

Head of Developer Relations Alex Albert confirms the company is investigating reports of memory loss and interface bugs following the November release.

· 3 min read · PSEEDR Editorial

Following the November 24, 2025 release of Claude Opus 4.5, Anthropic has entered an aggressive post-launch stabilization phase. Head of Developer Relations Alex Albert recently invited users to detail their "gripes," uncovering a pattern of reliability issues, ranging from "context amnesia" to date hallucinations, that threatens to undermine the utility of the company's most capable model to date.

Despite being positioned as Anthropic's state-of-the-art model for complex tasks and agents, Claude Opus 4.5 is facing significant scrutiny regarding its stability and integration into professional workflows. On December 11, 2025, Alex Albert, Anthropic's Head of Developer Relations, issued a public call for feedback on X (formerly Twitter), specifically asking for detailed accounts of user pain points. The resulting discourse highlights a critical tension in the current LLM landscape: the gap between raw benchmark performance and the practical reliability required for daily engineering work.

The 'Context Amnesia' Phenomenon

The most pervasive complaint centers on memory management. Users report that Opus 4.5 suffers from "context amnesia," frequently forgetting earlier parts of a conversation or losing the thread of complex instructions. This behavior represents a significant regression from earlier models such as Claude 3.5 Sonnet, which users often cite as more stable despite its lower theoretical reasoning ceiling.

Compounding this issue are hallucinations regarding the current date. Multiple reports confirm the model often insists the year is 2024, a fundamental error that disrupts time-sensitive queries and scheduling tasks. These errors suggest that while the model's reasoning engine has been upgraded, its grounding in the immediate temporal context remains brittle.

UI Failures and the 'Blank Screen' Bug

The feedback loop has also exposed severe friction in the user interface. Prominent tech figures, including Deedy, have highlighted a critical failure mode where hitting context limits results in a blank interface, causing the user's input to be irretrievably lost. This lack of graceful degradation forces users to constantly back up their prompts externally, a workflow friction that undermines the tool's utility as a reliable assistant.

Furthermore, users expressed frustration over the inability to switch between models (e.g., from Opus to Sonnet) within a single active session. This limitation prevents developers from using the expensive Opus model for complex reasoning and then downgrading to the faster, cheaper Sonnet for routine text generation, forcing a disjointed workflow across multiple chat windows.

Coding Intelligence: Smart but Unwise

While Opus 4.5 excels at generating code from scratch, its ability to integrate into existing software architectures has drawn criticism. Developers note that the model often ignores established abstractions within a codebase, effectively "reinventing the wheel" rather than utilizing existing helper functions or classes. This tendency leads to redundant work and code bloat, requiring significant human intervention to refactor the AI's output.

Additionally, the model's handling of merge conflicts is described as lacking strategic intelligence. Rather than understanding the intent behind conflicting code blocks, the model sometimes applies brute-force resolutions that break logic, reinforcing the user sentiment that Opus 4.5 occasionally acts as a "smart pseudo-expert": confident but operationally hazardous.

Security Guardrails and Feature Wishlists

The feedback session also touched on the delicate balance of safety alignment. Security researchers report that the model's refusal triggers are overly aggressive, often blocking the analysis of malware samples even in clearly defined, legitimate research contexts. This limits the model's adoption in cybersecurity operations, a key vertical for high-intelligence LLMs.

Looking forward, the community has coalesced around specific feature requests to mitigate these issues. There is strong demand for the automatic generation and maintenance of CLAUDE.md files, context documents that help the model retain project-specific knowledge. Users are also calling for "asynchronous progressive compression" to handle long contexts more efficiently without the current "uncontrollable" loss of detail.
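For readers unfamiliar with the convention: a CLAUDE.md file is a plain Markdown document, typically kept at a project's root, that Claude's coding tools read as persistent project context. A minimal hypothetical sketch (the project details below are illustrative, not drawn from any real codebase) shows how such a file can encode exactly the knowledge users say the model loses:

```markdown
# Project Context

## Architecture
- Backend: service code lives in `src/api/`; all database access goes
  through the helpers in `src/db/repository.py`.
- Do not call the ORM directly from route handlers; reuse the existing
  repository helpers rather than writing new data-access code.

## Conventions
- Tests live in `tests/` and run with `pytest`.
- Prefer extending existing helper functions over reinventing them.

## Known Pitfalls
- Timestamps are stored as UTC ISO 8601 strings; never assume local time.
```

Because the file is re-read at the start of each session, it acts as durable memory that survives the context-window resets behind many of the "amnesia" complaints.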

Conclusion

The release of Opus 4.5 illustrates the diminishing returns of raw model scaling without commensurate improvements in robustness. While Anthropic has confirmed awareness of the "confused" behavior and is actively working on patches, current sentiment suggests that, for many professionals, stability and context retention are valued more highly than marginal gains in reasoning capability. The immediate challenge for Anthropic is not just fixing bugs, but proving that their flagship model can be a trustworthy partner rather than an erratic savant.
