Claude Cowork’s local VM shows the real cost of safe desktop agents

On May 25, Anthropic published the most honest sentence in the current agent safety debate: Claude Cowork’s first desktop architecture ran inside “a full virtual machine” using Apple’s Virtualization framework on macOS and the Host Compute Service (HCS), the management layer for Hyper-V, on Windows, with its own Linux kernel, filesystem, and process table (Anthropic). Two weeks later, Hacker News was arguing about why Claude Desktop on Windows could spawn a 1.8 GB Hyper-V VM even when the user only wanted chat (Hacker News).

That is not a coincidence. It is the product bill arriving.

The comfortable story is that Anthropic shipped a bug, and yes, there is a bug report. The sharper story is that safe desktop agents are forcing model companies to become runtime vendors. Once an agent can read local files, run code, call tools, and keep state across sessions, “AI safety” stops being a model-card problem. It becomes a filesystem, kernel, hypervisor, network policy, identity, audit, and endpoint-monitoring problem.

[Figure: architecture sketch showing the Claude Desktop host process, native agent loop, mounted workspace folder, and dedicated Linux VM]

The VM Is Not Accidental

Anthropic’s public architecture is clear. Claude Cowork is not just a chat box with file upload. The Help Center says Cowork uses two execution environments: the agent loop runs natively on the device, while shell commands and generated code execute in a dedicated Linux VM isolated by Apple Virtualization.framework on macOS and Hyper-V on Windows (Claude Help Center).

That design is defensible. In fact, it is the part I trust most.

Anthropic’s engineering post explains why Claude Code can lean on a human approval loop while Cowork cannot. Claude Code users are developers. They can often inspect a bash command before approving it. Cowork targets broader knowledge work, where asking a user to judge a command like find . -name "*.tmp" -exec rm {} \; is theater. The right boundary for that user is not a scary prompt. It is an always-on sandbox.

Anthropic also provides useful numbers. Users approved roughly 93% of Claude Code permission prompts. Adding an OS-level sandbox reduced permission prompts by 84%. Claude Code auto mode catches roughly 83% of “overeager” behaviors, with Anthropic’s footnote saying about 17% still get through (Anthropic). The lesson is blunt: prompts are policy hints. Sandboxes are policy enforcement.

Here is the containment trade-off Anthropic is really making:

Product surface            Runtime boundary              Main safety cost
-------------------------  ----------------------------  ------------------------------------
claude.ai code execution   Server-side gVisor container  Less local capability
Claude Code                OS sandbox plus approvals     User must understand risk
Claude Cowork              Local Linux VM                Startup, memory, disk, IT visibility

The VM exists because Anthropic is taking the local blast radius seriously. The selected workspace and .claude folder are mounted. Credentials stay in the host keychain. Network egress is restricted. Anthropic even calls out symlink validation and mount modes: read-only, read-write, and read-write-no-delete.
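To make the mount modes and symlink validation concrete, here is a minimal sketch of how such a check could work. It is not Anthropic's implementation: the mode names come from the post, but the operation names, the ALLOWED_OPS mapping, and the is_allowed function are assumptions made up for illustration.

```python
from enum import Enum
from pathlib import Path


class MountMode(Enum):
    READ_ONLY = "read-only"
    READ_WRITE = "read-write"
    READ_WRITE_NO_DELETE = "read-write-no-delete"


# Hypothetical mapping from each mount mode to the operations it permits.
ALLOWED_OPS = {
    MountMode.READ_ONLY: {"read"},
    MountMode.READ_WRITE: {"read", "write", "delete"},
    MountMode.READ_WRITE_NO_DELETE: {"read", "write"},
}


def is_allowed(mount_root: Path, mode: MountMode, target: Path, op: str) -> bool:
    """Return True only if op is permitted and target stays inside the mount."""
    if op not in ALLOWED_OPS[mode]:
        return False
    root = mount_root.resolve()
    # resolve() follows symlinks and collapses "..", so a link or relative
    # path that escapes the mount ends up outside root and is rejected.
    resolved = (root / target).resolve()
    return resolved == root or root in resolved.parents
```

The design point is that the check runs on the resolved path, not the path the agent asked for. A symlink named link.txt inside the workspace that points at a file elsewhere on disk fails the containment test even though its name looks local.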

That is real engineering. It also has real costs.

The Community Is Mad About the Bill, Not the Boundary

The GitHub issue that hit HN was opened on February 26, 2026. The reporter describes Windows 11 Pro 25H2 on a 16 GB Razer Blade laptop, with VirtualMachinePlatform enabled and Hyper-V, WSL, Docker, and Windows Sandbox disabled. After Cowork or agent mode had been used once, Claude Desktop allegedly launched a Hyper-V VM on every start, showing Vmmem at roughly 1,796 to 1,846 MB (GitHub).

That matters because 1.8 GB is not an abstraction. On a 16 GB laptop it is more than 11% of RAM before the user has asked the agent to do agent work. The same issue says idle memory rose from about 50% to 62%, then 70–75% under normal app load. The reporter also found 2,689 stale session files under %APPDATA%\Claude\local-agent-mode-sessions\.
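The arithmetic behind that share is easy to check. The sketch below uses the two Vmmem readings from the issue and assumes binary units (1 GB = 1024 MB), which may not match how the reporter's tools count memory. The function name is made up for illustration.

```python
def ram_share_percent(vm_mb: float, total_gb: float) -> float:
    """Share of physical RAM used by the VM, as a percentage."""
    total_mb = total_gb * 1024  # assumption: binary units
    return 100 * vm_mb / total_mb


low = ram_share_percent(1796, 16)
high = ram_share_percent(1846, 16)
print(f"{low:.1f}% to {high:.1f}% of RAM")  # prints: 11.0% to 11.3% of RAM
```

Either way the reading lands at roughly 11% of a 16 GB machine, spent before any agent task has started.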

The workaround was not pretty:

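# Run from an elevated PowerShell prompt; the feature change takes effect after a reboot.
Disable-WindowsOptionalFeature -Online -FeatureName "VirtualMachinePlatform" -NoRestart
# Force-stop the VM worker and Host Compute Service processes that are still running.
Stop-Process -Name vmwp -Force
Stop-Process -Name vmcompute -Force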

That is not a consumer workaround. It is an IT ticket.

The HN thread did what HN does: some people blamed sloppiness, some defended the need for a sandbox, and several asked the product question Anthropic should answer directly: why is Cowork not simply opt-in? One commenter framed the useful middle ground: there is a spectrum between giving an agent everything and giving it nothing (Hacker News). That is the correct frame.

The Linux frustration comes from the same place. Anthropic’s official install page lists macOS 11+ and Windows 10+ only, with installation choices for “macOS” and “Windows” (Claude Help Center). Meanwhile, Reddit threads keep asking why there is no Linux desktop app, especially because Claude Desktop is already shipping a Linux VM for local agent execution (Reddit). The common complaint is not only “support my OS.” It is: if the safe runtime is Linux, why do Linux users get the least official desktop support?

The answer may be packaging, enterprise deployment, keychain integration, app sandboxing, support load, or the lack of a single Linux desktop security API. Those are serious reasons. But the product optics are bad. Developers can see the VM. They can see Hyper-V. They can see stale session folders. They can see unofficial Linux repackaging projects. They know when the abstraction leaks.

[Figure: compact bar chart comparing reported local overheads: 1.8 GB Hyper-V VM memory on the Windows issue, 2,689 stale session files, …]

Safety Has Become a Runtime Problem

The most important part of Anthropic’s post is not the VM. It is the failure mode.

Anthropic describes a third-party disclosure where Cowork’s egress allowlist did exactly what it was configured to do. It allowed traffic to api.anthropic.com, because the product needs that API. A malicious file in the mounted workspace included hidden instructions and an attacker-controlled API key. Claude read files and uploaded them through Anthropic’s Files API to the attacker’s account. Anthropic’s summary is devastating: the sandbox worked, and data still left (Anthropic).

That is the agent security problem in one incident. Domain allowlists are not enough because a domain is not a permission. It is a bundle of capabilities. Allowing api.anthropic.com means allowing every reachable operation behind that API unless the runtime understands provenance, token scope, headers, and intent.

Anthropic’s fix was a defensive man-in-the-middle proxy inside the VM that only passes requests carrying the VM’s own provisioned session token and rejects attacker-embedded keys. That is good. It is also a signpost for everyone building desktop agents: the sandbox needs to understand identity and capability, not just IP addresses and paths.
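The difference between the two checks fits in a few lines. This is a sketch of the idea, not Anthropic's proxy: the OutboundRequest type, the function names, the use of the x-api-key header to carry the session token, and the example URL path are all assumptions for illustration.

```python
import hmac
from dataclasses import dataclass, field
from urllib.parse import urlsplit

# The host is the one named in the post; the set itself is illustrative.
ALLOWED_HOSTS = {"api.anthropic.com"}


@dataclass
class OutboundRequest:
    url: str
    headers: dict = field(default_factory=dict)


def domain_allowlist_check(req: OutboundRequest) -> bool:
    """The original boundary: only the destination host is inspected."""
    return urlsplit(req.url).hostname in ALLOWED_HOSTS


def token_bound_check(req: OutboundRequest, session_token: str) -> bool:
    """Stricter boundary: right host AND the credential the VM was provisioned with."""
    if not domain_allowlist_check(req):
        return False
    # Header names are case-insensitive, so normalize before the lookup.
    headers = {name.lower(): value for name, value in req.headers.items()}
    presented = headers.get("x-api-key", "")  # hypothetical header choice
    # Constant-time comparison avoids leaking token contents through timing.
    return hmac.compare_digest(presented.encode(), session_token.encode())
```

A request to the allowed host carrying an attacker-supplied key passes domain_allowlist_check and fails token_bound_check. That gap is the incident: the destination was legitimate, and the identity behind the request was not.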

Traditional desktop apps did not need this much machinery because they did not autonomously decide to open files, synthesize commands, chain tools, and route around missing permissions. A browser sandbox isolates untrusted web pages. A container isolates a service. A desktop agent sandbox has to isolate a semi-autonomous operator that can read instructions from hostile documents and then use legitimate tools to do the wrong thing.

That is why this is becoming an OS/runtime layer.

The OS already owns most of the primitives: process isolation, filesystem permissions, secure credential storage, network filtering, audit logs, notarization, EDR hooks, device management. The model company owns the agent loop and policy intent. The missing product is the contract between them.

Enterprise IT Pays Twice

The VM gives Anthropic a hard containment boundary. It also hides activity from the same security tools enterprises rely on.

Anthropic says this directly. In the Cowork architecture FAQ, “Can endpoint detection (EDR) tools inspect activity inside the VM?” is answered with “No.” The VM is isolated from host-based security tools by design (Claude Help Center). The engineering post adds that the current mitigation is pull-based OTLP (OpenTelemetry Protocol) exports for administrators, which is not the same as live monitoring (Anthropic).

That means IT pays twice.

First, it pays the resource cost: disk bundles, VM startup, RAM pressure, virtualization conflicts, networking issues, and helpdesk scripts. Second, it pays the visibility cost: the thing that makes the agent safer from the host also makes it more opaque to host-based monitoring.

This is not a reason to reject VMs. It is a reason to stop pretending “runs in a VM” is the end of the security story. For an enterprise rollout, a sealed VM without first-class telemetry is a black box with a nice perimeter.

The audit gap is even sharper because the Help Center says Cowork activity is “not currently” captured in audit logs, the Compliance API, or data exports, and points admins to OpenTelemetry for monitoring guidance (Claude Help Center). If a human employee used a shell, copied files, or hit an API, companies would expect logs. If an agent does it inside a VM, “trust us, it is contained” will not survive procurement.

[Figure: before-and-after observability diagram. Left side: host EDR sees only an opaque hypervisor process. Right side: …]

What Developers Should Demand

The community debate is too focused on whether 1.8 GB is “too much.” That depends on the machine and task. The real demand should be control.

Desktop agent vendors should expose sandbox state as a product surface, not bury it in AppData and Task Manager. Developers and admins should ask for five things.

First: lazy startup. Chat should not boot a VM. Cowork should. Scheduled tasks might justify a warm runtime, but that should be visible and configurable.

Second: a sandbox dashboard. Show VM status, memory, disk use, mounted folders, active sessions, egress policy, and last cleanup. If Docker Desktop can show containers, Claude Desktop can show its agent runtime.

Third: explicit install choices. If Cowork needs a 10 GB-class bundle on some systems, say so before download. Let users choose location. Let them remove it without breaking chat.

Fourth: policy as code. Developers should be able to inspect and version the effective sandbox policy: mounts, network destinations, local MCP permissions, token scopes, and deletion rules. A vague “egress settings” panel is not enough for teams shipping real work.

Fifth: live observability hooks. OTLP export is a start. The bar should be per-tool-call logs, file access events, denied actions, network decisions, session identity, and admin-readable reason codes. EDR blindness cannot be hand-waved away as the cost of isolation.
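As a rough sketch of what the fourth and fifth demands could look like together: a policy object that serializes deterministically so it can be versioned and diffed, and a per-decision event that carries a reason code and the digest of the policy in force. Every name here (SandboxPolicy, its fields, policy_digest, audit_event, the reason-code strings) is hypothetical and not drawn from any vendor's schema.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass
class SandboxPolicy:
    mounts: dict = field(default_factory=dict)  # host path -> mount mode
    network_destinations: list = field(default_factory=list)
    allow_delete: bool = False


def policy_digest(policy: SandboxPolicy) -> str:
    """Stable SHA-256 over a canonical JSON form, suitable for version control."""
    canonical = json.dumps(asdict(policy), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def audit_event(policy: SandboxPolicy, session_id: str, tool: str,
                action: str, allowed: bool, reason_code: str) -> str:
    """One JSON line per tool decision, tied to the policy that produced it."""
    return json.dumps(
        {
            "session_id": session_id,
            "tool": tool,
            "action": action,
            "decision": "allow" if allowed else "deny",
            "reason_code": reason_code,
            "policy_digest": policy_digest(policy),
        },
        sort_keys=True,
    )
```

Tying each event to a policy digest lets an admin answer two questions from the log alone: what the agent tried to do, and which version of the rules allowed or denied it.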

The Linux ask also belongs here. A Linux desktop app is not just a community nicety. It is a chance to build on the platform where many sandboxing primitives, developer workflows, and container mental models are already native. If the hard part is desktop integration, say that. If the hard part is enterprise support across distros, say that. Silence leaves users to infer neglect.

The Right Takeaway

Anthropic deserves credit for publishing uncomfortable details. The post includes real numbers, real missed risks, and real trade-offs. Most vendors would have stopped at “secure sandbox.” Anthropic explained where the sandbox failed, where user approval failed, and where VM isolation made enterprise monitoring worse.

But the HN anger is also justified. Safety costs do not disappear because the architecture diagram is sound. They move onto the user’s laptop, the admin’s deployment plan, and the developer’s daily workflow.

Claude Cowork’s VM is the future arriving early: local agents will need hard runtime boundaries, scoped identity, network mediation, and auditable tool execution. The winners will not be the vendors who hide that machinery. The winners will make the machinery legible, tunable, observable, and boring.

A desktop agent that can operate your files and tools is no longer just an app. It is a small operating environment. Treat it like one.
