Automating Legacy Workflows: The Rise of LLM-Driven Browser Agents in Claims Processing

A recent architecture published on the AWS Machine Learning Blog details a hands-free First Notice of Loss (FNOL) intake system utilizing Strands Agents and the Amazon Bedrock AgentCore Browser Tool. This approach highlights a growing enterprise strategy: deploying LLM-driven browser agents as a pragmatic middleware layer to automate legacy UI workflows where traditional API integrations are either non-existent or prohibitively expensive to develop.

The Burden of Multimodal Intake

First Notice of Loss (FNOL) is frequently oversimplified as the mere administrative act of opening an insurance claim. In reality, it represents a complex data ingestion bottleneck where large volumes of unstructured, multimodal evidence must be interpreted, validated, and correlated before meaningful adjudication can occur. Adjusters are routinely tasked with processing field photos, walkaround videos, scanned documents, and dictated notes. Because legacy insurance portals were designed strictly for human interaction, this unstructured data must be manually interpreted and keyed into the system.

This manual processing consumes a significant portion of an adjuster's time. Navigating rigid portals, verifying the completeness of submitted evidence, and translating raw artifacts into structured data all require extensive screen work. During volume spikes, such as those triggered by catastrophic weather events or seasonal surges, these manual intake processes create severe backlogs. The resulting delays compound rapidly, extending claim cycle times and degrading the customer experience. The core issue is not a lack of data, but the friction involved in translating multimodal evidence into a system of record that lacks native APIs for such artifacts.

Browser Automation as Pragmatic Middleware

To address this friction, the AWS architecture introduces a system that combines the Strands Agents SDK with the Amazon Bedrock AgentCore Browser Tool. Strands Agents, an open-source SDK, provides the model-driven domain reasoning required to understand insurance contexts and determine the necessary steps for claim intake. The Amazon Bedrock AgentCore Browser Tool executes these steps by interacting directly with live portals, effectively mimicking human navigation and data entry.

This combination represents a tactical shift in enterprise architecture. Rather than embarking on multi-year modernization projects to build robust APIs for legacy systems, organizations can deploy generative AI agents at the presentation layer. The browser tool navigates the UI, clicks buttons, and inputs text based on the reasoning provided by the Strands agent. By automating the repetitive screen work, the system preserves human oversight for complex edge cases while eliminating the manual data entry that traditionally slows down the FNOL process.
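The division of labor described above, with one component deciding the steps and another carrying them out in the UI, can be sketched in a few lines of Python. This is an illustrative sketch only, not the AWS implementation: Action, FakePortal, plan_intake, and run_intake are hypothetical names, the portal is an in-memory stand-in for a real browser session, and the planner is deterministic where a real agent would consult a model.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """One UI step: 'navigate', 'fill', or 'click' (hypothetical vocabulary)."""
    kind: str
    target: str
    value: str = ""


@dataclass
class FakePortal:
    """In-memory stand-in for a browser session on a legacy claims portal."""
    url: str = ""
    fields: dict = field(default_factory=dict)
    submitted: bool = False

    def apply(self, action: Action) -> None:
        if action.kind == "navigate":
            self.url = action.target
        elif action.kind == "fill":
            self.fields[action.target] = action.value
        elif action.kind == "click" and action.target == "submit":
            self.submitted = True
        else:
            # Unknown steps fail loudly instead of being skipped.
            raise ValueError(f"unsupported action: {action}")


def plan_intake(claim: dict) -> list[Action]:
    """Stand-in for the reasoning layer: turn claim facts into UI steps.

    A real agent would ask a model to choose each step; this version is
    deterministic so the control flow is easy to follow.
    """
    steps = [Action("navigate", "/claims/new")]
    steps += [Action("fill", name, str(value)) for name, value in claim.items()]
    steps.append(Action("click", "submit"))
    return steps


def run_intake(claim: dict, portal: FakePortal) -> FakePortal:
    """Execute the planned steps against the portal, one at a time."""
    for action in plan_intake(claim):
        portal.apply(action)
    return portal


# Made-up claim data, for illustration only.
portal = run_intake(
    {"policy_number": "P-1001", "loss_type": "collision"}, FakePortal()
)
print(portal.url, portal.fields, portal.submitted)
```

Keeping the planner and the executor behind separate functions means the executor could be swapped for a real browser tool without touching the planning logic, which is the property that makes the presentation-layer approach attractive.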

Processing Unstructured Evidence at the Edge

A critical component of this architecture is its ability to handle multimodal inputs natively. Modern claims submissions rarely consist of simple text forms. Policyholders and field agents submit a chaotic mix of media. The integration of generative AI allows the system to analyze a walkaround video of a damaged vehicle, extract relevant audio notes, and cross-reference these inputs with scanned policy documents. This multimodal capability is particularly relevant for property and casualty (P&C) lines, where visual evidence is paramount. By pre-processing this data, the system can also flag potential inconsistencies early in the intake cycle, routing high-risk claims to specialized investigation teams before standard adjudication begins.
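The cross-referencing step can be illustrated with a small comparison routine. The sketch below assumes that an upstream model has already extracted simple key/value facts from each artifact; find_inconsistencies, route, the field names, and the routing rule are all hypothetical and the sample values are made up for illustration.

```python
from collections import defaultdict


def find_inconsistencies(
    facts_by_source: dict[str, dict[str, str]],
) -> dict[str, dict[str, str]]:
    """Return fields whose normalized values differ between sources.

    The result maps each disputed field to {source: normalized value}.
    """
    seen = defaultdict(dict)
    for source, facts in facts_by_source.items():
        for field_name, value in facts.items():
            # Normalize so that "Blue" and "blue " do not count as a conflict.
            seen[field_name][source] = value.strip().lower()
    return {
        name: by_source
        for name, by_source in seen.items()
        if len(set(by_source.values())) > 1
    }


def route(conflicts: dict) -> str:
    """Hypothetical routing rule: any conflict goes to investigation."""
    return "special_investigation" if conflicts else "standard_adjudication"


# Made-up extracted facts, keyed by the artifact they came from.
facts = {
    "video": {"vehicle_color": "Blue", "damage_area": "rear bumper"},
    "audio_note": {"damage_area": "front bumper"},
    "policy_document": {"vehicle_color": "blue"},
}
conflicts = find_inconsistencies(facts)
print(conflicts)         # only damage_area is disputed
print(route(conflicts))  # special_investigation
```

A production rule would likely weigh which fields matter and how strongly sources disagree; the all-or-nothing rule here is only meant to show where such a check sits in the intake flow.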

The AI agents transform these raw artifacts into tagged, decision-ready intake context. Instead of opening a claim file to find a folder of unlabelled JPEGs and an MP4 file, the human adjuster receives a structured summary of the damage, correlated with the specific policy coverage. This capability shifts the adjuster's role from data entry clerk to high-value decision-maker, allowing them to begin their work with full context rather than starting from scratch.
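One way to picture "tagged, decision-ready intake context" is as a small data structure that carries the labeled artifacts, the matched coverage, and a completeness check. The sketch below is an assumption about what such a structure might look like, not a schema from the AWS article: Artifact, IntakeContext, REQUIRED_KINDS, and the sample claim are hypothetical.

```python
from dataclasses import dataclass, field

# Assumed completeness rule: every claim needs at least one of each kind.
REQUIRED_KINDS = {"photo", "video", "document"}


@dataclass
class Artifact:
    filename: str
    kind: str        # e.g. "photo", "video", "document", "audio"
    tags: list[str]  # labels an upstream model might assign


@dataclass
class IntakeContext:
    claim_id: str
    coverage: str
    artifacts: list[Artifact] = field(default_factory=list)

    def missing_kinds(self) -> set[str]:
        """Evidence kinds that are required but not yet present."""
        return REQUIRED_KINDS - {a.kind for a in self.artifacts}

    def summary(self) -> str:
        """Render a short, adjuster-facing summary of the evidence."""
        lines = [f"Claim {self.claim_id} (coverage: {self.coverage})"]
        for a in self.artifacts:
            lines.append(f"- {a.kind}: {a.filename} [{', '.join(a.tags)}]")
        missing = sorted(self.missing_kinds())
        lines.append("Missing: " + (", ".join(missing) if missing else "none"))
        return "\n".join(lines)


# Made-up example claim.
context = IntakeContext(
    claim_id="C-2041",
    coverage="collision",
    artifacts=[
        Artifact("IMG_0012.jpg", "photo", ["rear bumper", "dent"]),
        Artifact("walkaround.mp4", "video", ["full exterior"]),
    ],
)
print(context.summary())
```

The summary replaces the folder of unlabelled files with a view the adjuster can act on, and the missing-evidence line makes the completeness check explicit before adjudication starts.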

Implications for Enterprise Architecture

The deployment of LLM-driven browser agents carries profound implications for industries reliant on legacy software. In the insurance sector, the ability to scale claims processing during high-volume events without a proportional increase in human headcount is a critical operational advantage. However, the broader implication extends beyond insurance. This architecture validates the concept of agentic middleware, where the user interface itself becomes the API.

For enterprises burdened by technical debt, UI-level automation powered by generative AI offers a rapid deployment path for workflow optimization. It bypasses the traditional IT bottlenecks associated with modifying core systems of record. As long as a human can perform the task via a web browser, an AI agent equipped with the right reasoning and browser tools can, in principle, automate it, which could substantially reduce the time-to-value for digital transformation initiatives.

Limitations and Open Questions

Despite the promise of this architecture, several critical unknowns remain. The AWS publication does not provide specific latency or accuracy benchmarks for the multimodal processing of complex artifacts, such as lengthy walkaround videos or low-resolution field photos. Processing these inputs through large language models is computationally intensive, and the time required to generate decision-ready context could impact real-time intake workflows.

Furthermore, the licensing, pricing, and resource overhead associated with running the Amazon Bedrock AgentCore Browser Tool at scale are not detailed. Operating browser-based agents requires significant infrastructure to manage headless browsers and handle concurrent sessions during volume spikes. Finally, while the architecture demonstrates generic portal interaction, it lacks specific integration patterns for dominant legacy insurance core systems like Guidewire or Duck Creek.

The fragility of UI-bound integrations cannot be overstated. Unlike REST or GraphQL APIs, which rely on strict contracts and versioning, web interfaces are subject to arbitrary changes by frontend development teams. A simple alteration in a CSS class name, a modified DOM structure, or the introduction of a new pop-up modal can cause a browser agent to fail silently or execute incorrect actions. Organizations adopting this pattern must invest heavily in monitoring and self-healing mechanisms to ensure the agents can recover from unexpected UI states.
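One common defensive pattern against this kind of breakage is to try several candidate selectors for the same field and to fail loudly, rather than silently, when none of them match. The sketch below illustrates the idea with a page modeled as a plain dictionary from selector to field value; resolve, fill_verified, ElementNotFound, and the selectors themselves are hypothetical, and nothing here reflects how the AgentCore Browser Tool locates elements.

```python
class ElementNotFound(Exception):
    """Raised when none of the candidate selectors match the page."""


def resolve(page: dict[str, str], candidates: list[str]) -> str:
    """Return the first candidate selector present on the page."""
    for selector in candidates:
        if selector in page:
            return selector
    raise ElementNotFound(f"no match for any of {candidates}")


def fill_verified(page: dict[str, str], candidates: list[str], value: str) -> str:
    """Fill a field, then read it back before reporting success.

    In this in-memory model the read-back always succeeds; against a real
    page it is the step that would catch a write that did not take effect.
    """
    selector = resolve(page, candidates)
    page[selector] = value
    if page[selector] != value:
        raise RuntimeError(f"write to {selector} did not stick")
    return selector


candidates = ["#policy-number", "input[name='policyNumber']"]
old_page = {"#policy-number": ""}
new_page = {"input[name='policyNumber']": ""}  # after a hypothetical redesign

print(fill_verified(old_page, candidates, "P-1001"))
print(fill_verified(new_page, candidates, "P-1001"))

try:
    fill_verified({"#unrelated": ""}, candidates, "P-1001")
except ElementNotFound as exc:
    # An unexpected UI state is surfaced instead of being ignored.
    print("escalate to human:", exc)
```

Fallback selectors only delay the problem, so the explicit exception matters more than the fallback list: it turns a silent failure into a monitored event that can be routed to a person.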

The convergence of multimodal reasoning and browser automation represents a pragmatic approach to bypassing legacy system constraints. By deploying AI agents at the presentation layer, organizations can rapidly automate tedious screen work and accelerate unstructured data processing. While this method introduces new challenges regarding UI fragility and compute overhead, it offers a compelling interim solution for enterprises seeking to modernize operations without overhauling their underlying core systems.

Key Takeaways

AWS introduced an architecture combining Strands Agents and Amazon Bedrock AgentCore Browser Tool to automate First Notice of Loss (FNOL) processing.
The system processes multimodal evidence, including video, audio, images, and documents, transforming raw artifacts into decision-ready context for adjusters.
LLM-driven browser agents serve as a pragmatic middleware layer, bypassing the need for complex and costly API integrations with legacy insurance systems.
By automating repetitive screen work, insurers can scale claims processing during catastrophic events without proportional increases in headcount.
Significant questions remain regarding processing latency, infrastructure costs, and the inherent fragility of UI-bound integrations compared to traditional APIs.