Stories from Simon Willison

Running Python code in a sandbox with MicroPython and WASM

Simon Willison has released micropython-wasm, a sandboxing runtime that executes Python code in WebAssembly with strong isolation guarantees. The tool addresses a critical infrastructure gap for AI agents and data systems that need to safely execute untrusted code, a need that grows as LLM-powered agents become more autonomous. Willison is already deploying it in Datasette Agent, a system designed to let language models interact with databases programmatically. Sandboxed execution of this kind narrows the gap between agent capability and operational safety, enabling broader adoption of code-generation workflows in production environments.

Simon Willison·8h ago

72

Products & Apps Policy & Regulation

OpenAI Help: Lockdown Mode

OpenAI has shipped Lockdown Mode, a security feature now live across free and paid tiers that constrains outbound network requests to block data exfiltration when a prompt injection attack occurs. The rollout signals growing industry focus on hardening the LLM attack surface as production deployments face real adversarial pressure. While the feature doesn't prevent injection attempts themselves, it represents a concrete defense layer that other providers will likely adopt, reshaping how AI platforms architect safety boundaries between model inference and external systems.

Simon Willison·12h ago

77

Opinion & Analysis

AI enthusiasts are in a race against time, AI skeptics are in a race against entropy

Charity Majors articulates a widening strategic divide in software development: teams aggressively integrating AI tooling are capturing discontinuous capability gains that create genuine competitive moats, while organizations adopting a wait-and-see posture risk obsolescence before the technology matures. This framing resets the adoption calculus from 'hype cycle patience' to 'capability race with real business consequences', suggesting the window for catching up may be narrower than traditional tech transitions allow. The insight matters because it challenges the assumption that skepticism is a defensible stance.

Simon Willison·1d ago

77

Policy & Regulation Opinion & Analysis

Quoting Emanuel Maiberg, 404 Media

Google's internal culture reveals fractures around AI quality and deployment strategy. After 404 Media published employee criticism, Google's PR team requested removal of language emphasizing human oversight from an official statement, signaling tension between safety rhetoric and product velocity. The incident exposes how large labs manage dissent and messaging around AI reliability, with implications for how public commitments to responsible AI deployment actually translate within organizations.

Simon Willison·1d ago

72

Business & Funding Tools & Code

Uber Caps Usage of AI Tools Like Claude Code to Manage Costs

Uber's decision to cap employee token spending at $1,500 monthly signals a critical inflection point in enterprise AI adoption. The company exhausted its entire 2026 coding-agent budget within four months, exposing a fundamental mismatch between traditional cost forecasting and the explosive demand for agentic LLM tools. This constraint reflects a broader tension facing large organizations: AI infrastructure costs are scaling faster than anticipated, forcing real trade-offs between developer productivity gains and operational budgets. The move suggests that token-burning coding agents have moved from experimental to mission-critical, yet remain economically unsustainable at current pricing and usage patterns.

Simon Willison·3d ago

77

Models & Releases Products & Apps

Microsoft's new MAI models

Microsoft is fragmenting its model strategy with two specialized releases: MAI-Thinking-1 targets reasoning workloads at 35B parameters for enterprise partners, while MAI-Code-1-Flash (5B) ships directly into GitHub Copilot's IDE integration. This dual-track approach signals Microsoft's pivot away from monolithic foundation models toward task-specific efficiency, mirroring OpenAI's o1/GPT-4o split. The Code variant's immediate rollout to individual developers matters more than the reasoning model's gated access, as it embeds inference cost reduction directly into the most-used AI development surface.

Simon Willison·3d ago

84

Tools & Code Research

datasette-agent-micropython 0.1a0

Datasette Agent now has a sandboxed Python execution layer built on MicroPython, intended to let LLMs generate and run code without escaping the sandbox. Simon Willison reports that GPT-5.5 has failed to break the sandbox in early testing, which addresses a critical blocker for agentic systems that need safe code generation. This matters because code execution is essential for data querying and automation workflows, but remains a major attack surface; a working sandbox unlocks broader deployment of agent-driven data tools without requiring human review of every generated script.

Simon Willison·3d ago

77

Tools & Code Products & Apps

Pasted File Editor

Simon Willison reverse-engineered Claude's file-attachment detection behavior, building a standalone prototype that automatically converts large text pastes into file uploads. The tool also supports direct file opening and drag-and-drop, with image preview thumbnails. This reflects a broader UX pattern emerging across LLM interfaces: treating bulk input as structured attachments rather than inline context, which affects how developers and power users architect prompts and workflows around token efficiency and context window management.

Simon Willison·4d ago

72

Tools & Code Research

micropython-wasm 0.1a0

Simon Willison has released micropython-wasm, an alpha package that compiles MicroPython to WebAssembly and wraps execution through wasmtime. This addresses a growing infrastructure need in AI development: isolated, reproducible Python sandboxes for safely running untrusted code. As LLM applications increasingly need to execute generated Python (from code interpreters to agent frameworks), lightweight WASM-based isolation offers an alternative to container overhead. The release signals momentum in the sandboxing-as-infrastructure space, particularly relevant for teams building agentic systems that must safely evaluate model outputs.

Simon Willison·4d ago

72

Products & Apps Policy & Regulation

Hackers Simply Asked Meta AI to Give Them Access to High-Profile Instagram Accounts. It Worked

Meta's integration of AI into customer support systems created a critical vulnerability: attackers exploited the chatbot's compliance-oriented design to request account takeovers by simply asking. The incident exposes a fundamental tension in deploying LLMs for high-stakes operations without robust authentication layers. This represents a broader infrastructure risk as companies rush to automate support workflows with language models trained to be helpful and accommodating, potentially bypassing human judgment on sensitive requests.

Simon Willison·4d ago

89

Opinion & Analysis

The solution might be cancelling my AI subscription

A prominent AI practitioner reflects on the productivity paradox of modern LLM tooling: frictionless access to Claude and similar systems enables rapid project spawning but systematically undermines focus and problem-solving. The observation surfaces a growing tension in the AI adoption curve: as models become more capable and cheaper to invoke, users report diminishing returns on intentionality and task completion. This challenges the implicit narrative that AI tooling universally accelerates work, suggesting instead that attention management and constraint design may matter more than raw capability for meaningful outcomes.

Simon Willison·5d ago

72

Business & Funding

Quoting Karen Kwok for Reuters Breakingviews

Anthropic's revenue accounting methodology reveals how frontier AI labs are navigating the gap between consumption-based and subscription models in a rapidly scaling market. The company's formula, multiplying 28-day consumption data by 13 and annualizing monthly subscriptions separately, exposes the tension between run-rate projections and actual recurring revenue streams. This accounting choice matters because it signals how AI vendors are managing investor expectations amid volatile customer acquisition and usage patterns, setting a precedent other labs may follow as they approach public markets or major funding rounds.
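As a rough illustration of the arithmetic described above, the sketch below multiplies a 28-day consumption figure by 13 (13 x 28 = 364 days) and annualizes a monthly subscription figure separately. Two points are assumptions rather than details from the summary: that "annualizing" the monthly figure means multiplying by 12, and that the two components are simply added together. The function name and the input figures are hypothetical.

```python
def annualized_run_rate(consumption_28d: float, monthly_subscriptions: float) -> float:
    """Sketch of a run-rate calculation with two separately annualized parts."""
    # From the summary: 28-day consumption revenue is multiplied by 13.
    consumption_annual = consumption_28d * 13
    # Assumption: monthly subscription revenue is annualized by multiplying by 12.
    subscription_annual = monthly_subscriptions * 12
    # Assumption: the run-rate is the sum of the two annualized components.
    return consumption_annual + subscription_annual


# Hypothetical inputs in arbitrary currency units, for illustration only.
print(annualized_run_rate(100.0, 50.0))
```

Because both components extrapolate a single recent period across a full year, one unusually strong or weak period moves the headline number by a factor of 12 or 13.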

Simon Willison·6d ago

72

Research Products & Apps

How we contain Claude across products

Anthropic published detailed technical documentation on how it isolates Claude across multiple deployment surfaces, including process sandboxes, virtual machines, filesystem restrictions, and network egress controls. The move addresses a critical gap in the AI industry: most sandbox implementations remain opaque, making it difficult for users and enterprises to assess genuine containment guarantees. By transparently explaining the layered constraints that prevent agents from exceeding their intended scope, Anthropic sets a precedent for security disclosure that could reshape how the field approaches agent safety and user trust in production systems.

Simon Willison·6d ago

84

Opinion & Analysis

I Am Retiring from Tech to Live Offline

Chad Whitacre, a prominent open-source developer, is abandoning tech entirely in response to AI's trajectory, citing it as the breaking point after years of industry strain. His typewritten, deliberately analog departure signals growing burnout among infrastructure builders who feel displaced by rapid AI commoditization and the erosion of craft-oriented work. The move reflects a deeper tension within technical communities: as AI automates and devalues certain skill sets, some veterans are choosing exit over adaptation, raising questions about retention and morale in open-source ecosystems that underpin AI infrastructure.

Simon Willison·6d ago

72

Opinion & Analysis

Quoting Daniel Jalkut

Daniel Jalkut articulates a centrist position on AI adoption that challenges the polarization dominating industry discourse. His framing suggests the productive path forward lies between techno-utopianism and blanket rejection, a stance gaining traction among pragmatist technologists tired of binary framings. This perspective matters because it reflects how informed builders are repositioning themselves as the hype cycle matures and real tradeoffs become visible. For insiders, it signals a potential shift in how the conversation moves from ideological positioning to nuanced capability assessment.

Simon Willison·6d ago

64

Business & Funding

Anthropic's run-rate revenue hits $47 billion

Anthropic's $47 billion annualized run-rate revenue, disclosed in its Series H funding round, signals explosive enterprise adoption momentum since February. The figure represents a critical inflection point for frontier AI commercialization: a single LLM provider now operates at revenue scales historically reserved for mature software giants. Simon Willison's analysis flags Anthropic's pattern of publishing run-rate metrics as a strategic communication choice, suggesting the company is signaling sustained demand velocity to investors and competitors alike. This matters because it establishes a new baseline for what venture-scale AI infrastructure can generate in near-term revenue, reshaping investor expectations across the sector.

Simon Willison·May 29

97

Models & Releases Opinion & Analysis

Claude Opus 4.8: "a modest but tangible improvement"

Anthropic released Claude Opus 4.8 with notably candid framing, positioning it as incremental rather than revolutionary. The lab's explicit acknowledgment that meaningful capability gains remain elusive, paired with a stated focus on cost reduction over raw performance, signals a maturation in how frontier labs communicate model releases. This transparency contrasts sharply with the industry norm and hints at shifting competitive dynamics where efficiency and honest positioning may matter as much as benchmark leaps.

Simon Willison·May 28

72

Tools & Code Models & Releases

llm-anthropic 0.25.1

The llm-anthropic plugin now supports Claude Opus 4.8, Anthropic's latest model, alongside a fast-mode option for qualifying organizations and smarter token defaults. The token-limit change is particularly significant for developers: instead of capping outputs at 8,192 tokens regardless of model capability, the tool now respects each model's native maximum, reducing friction for use cases requiring longer generations. This incremental but practical update reflects how tooling around frontier models evolves to match their capabilities.
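The change in token defaults can be pictured with a small sketch. This is not the plugin's code: the model names, the per-model maxima, and both function names are hypothetical, and only the 8,192 figure comes from the summary above.

```python
# The fixed default described in the summary.
LEGACY_OUTPUT_CAP = 8192

# Hypothetical per-model output maxima, invented for illustration.
HYPOTHETICAL_MODEL_MAX_OUTPUT = {
    "example-large": 64000,
    "example-medium": 16000,
}


def legacy_default_max_tokens(model_id: str) -> int:
    """Old behavior as described: the same cap regardless of model."""
    return LEGACY_OUTPUT_CAP


def native_default_max_tokens(model_id: str) -> int:
    """New behavior as described: default to the model's own maximum."""
    return HYPOTHETICAL_MODEL_MAX_OUTPUT[model_id]


for model_id in HYPOTHETICAL_MODEL_MAX_OUTPUT:
    print(model_id, legacy_default_max_tokens(model_id), native_default_max_tokens(model_id))
```

Under the old default, a long generation would stop at the fixed cap unless the caller raised the limit explicitly; under the new one, the limit tracks whatever the chosen model supports.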

Simon Willison·May 28

72

Tools & Code Policy & Regulation

sqlite AGENTS.md

SQLite published an AGENTS.md file to guide AI systems interacting with its codebase, signaling institutional recognition that LLM-powered code agents now warrant explicit governance. The document clarifies SQLite's contribution policy for automated systems, requiring proof-of-concept submissions rather than direct pull requests and mandating public domain licensing. This reflects a broader infrastructure shift: foundational open-source projects must now establish norms for agent-driven development workflows, creating friction points between autonomous coding systems and traditional maintainer control.

Simon Willison·May 27

72

Business & Funding Opinion & Analysis

I think Anthropic and OpenAI have found product-market fit

Anthropic's path to profitability and rising enterprise LLM costs signal that Claude and GPT have crossed a critical threshold: widespread adoption at scale. When companies begin discovering surprise API bills from routine staff usage, it indicates these tools have moved beyond experimental pilots into embedded workflows. This shift matters because it validates the core business model for frontier labs and suggests the market has matured enough to sustain both players through genuine demand rather than hype cycles. For investors and builders, it signals the era of LLM commoditization is underway.

Simon Willison·May 27

84

Opinion & Analysis Tools & Code

The pressure

The curl maintainer reports a four- to five-fold surge in AI-generated security vulnerability reports since 2024, now averaging over one credible submission daily. The shift reflects a structural change in how LLMs are being deployed for automated security auditing: higher-quality, more detailed findings are flooding open-source projects with finite review capacity. This exposes a critical tension in the AI-assisted security landscape. While LLM-powered vulnerability discovery accelerates threat detection, it simultaneously strains the human gatekeepers who validate and triage findings, raising questions about sustainable incident response at scale.

Simon Willison·May 26

77

Products & Apps Policy & Regulation

Microsoft Copilot Cowork Exfiltrates Files

Microsoft's Copilot Cowork agent system contained a critical vulnerability allowing unapproved email dispatch that could leak sensitive data through rendered message images. The flaw exposes a core tension in agentic AI design: sandboxing agent actions without restricting legitimate workflows. This incident underscores why autonomous systems remain high-risk in enterprise settings and validates concerns about agent-based architectures outpacing security controls.

Simon Willison·May 26

89

Opinion & Analysis

Quoting Paul Graham

Paul Graham's observation that AI-written founder emails are now recognizable and off-putting signals a shift in how generative tools are perceived by influential gatekeepers. Graham frames AI-assisted communication as deceptive rather than augmentative, suggesting that reliance on LLMs for high-stakes outreach may backfire with experienced investors who view it as a proxy for weak writing ability. This touches a nerve in startup culture: if founders can't pitch authentically without AI, what does that say about their judgment? The dynamic reveals tension between AI adoption and credibility in contexts where personal voice and directness carry outsized weight.

Simon Willison·May 26

77

Opinion & Analysis Policy & Regulation

Quoting Corey Quinn

Anthropic co-founder Christopher Olah's involvement in shaping a papal encyclical on AI ethics has drawn sharp commentary from industry observers. Corey Quinn's quip highlights a strategic inflection point: when foundational model builders gain influence over religious and moral frameworks around AI limitations, they effectively legitimize technical constraints as ethical doctrine rather than engineering tradeoffs. This blurs the line between genuine safety advocacy and sophisticated reputation management, raising questions about whose values get encoded into the emerging governance layer around AI systems.

Simon Willison·May 26

77

Policy & Regulation Opinion & Analysis

Notes on Pope Leo XIV's encyclical on AI

The Vatican released Magnifica Humanitas, a papal encyclical addressing AI ethics and human dignity in the age of artificial intelligence. Simon Willison flags it as notably coherent institutional thinking on AI integration into society, positioning the Church as a substantive voice in the policy conversation alongside governments and tech firms. The document echoes Pope Leo XIII's 1891 labor encyclical, framing AI governance as a continuation of Catholic social doctrine on protecting workers and human agency in systems of production.

Simon Willison·May 25

77

Tools & Code Products & Apps

datasette-agent 0.1a4

Datasette-agent, an AI chat interface for querying databases, now integrates directly into Datasette's navigation layer via a new JavaScript plugin hook. The 0.1a4 release leverages Datasette 1.0a30's makeJumpSections() API to surface agent chat as a keyboard-accessible command (slash menu), embedding agentic AI workflows into developer tooling rather than requiring separate interfaces. This reflects a broader shift toward embedding LLM agents into existing infrastructure and developer workflows, reducing friction for data exploration tasks.

Simon Willison·May 24

67

Opinion & Analysis Research

Quoting Armin Ronacher

Armin Ronacher, maintainer of Pocoo projects, identifies a critical failure mode in open-source issue reporting: LLM-generated submissions that obscure rather than clarify problems. These AI-reworded reports trade accuracy for false confidence, producing speculative root causes, unreproducible test cases, and misaligned code analogies. The pattern signals a growing friction point where LLM intermediation degrades signal quality in collaborative software development, forcing maintainers to spend cycles filtering noise rather than solving genuine bugs.

Simon Willison·May 24

77

Hardware & Infra Business & Funding

The memory shortage is causing a repricing of consumer electronics

Memory chip capacity constraints are reshaping AI infrastructure economics. With only three major manufacturers controlling global supply, HBM (high-bandwidth memory) demand from GPU makers is crowding out DDR and LPDDR allocation, forcing a fundamental repricing across consumer and enterprise hardware. This supply bottleneck directly throttles AI deployment at scale, making memory allocation a critical competitive lever for cloud providers and chip designers over the next several years.

Simon Willison·May 22

84

Policy & Regulation Business & Funding

FTC to Require Cox Media Group, Two Other Firms to Pay Nearly $1 Million to Settle Charges They Deceived Customers About “Active Listening” AI-Powered Marketing Service

The FTC's settlement with Cox Media Group and two unnamed firms over deceptive 'active listening' AI marketing claims shows that regulators will enforce against misleading claims about voice-data collection. The 2024 pitch deck promised real-time intent capture from smart devices, a claim the agency found unsubstantiated. This enforcement action matters because it establishes that vendors cannot market speculative AI capabilities as proven features to advertisers, setting precedent for how regulators will police the gap between AI marketing hype and actual technical delivery in the adtech ecosystem.

Simon Willison·May 22

77

Products & Apps Tools & Code

Datasette Agent

Simon Willison's Datasette Agent merges three years of LLM library development with Datasette's data exploration platform, creating a conversational interface for querying structured data. The release marks a convergence of two mature open-source projects into a unified AI assistant that can answer natural-language questions and generate visualizations. This represents a practical application layer where LLMs become operational tools for data teams, rather than standalone chat interfaces, and signals how domain-specific AI assistants are moving beyond chatbots into embedded workflows.

Simon Willison·May 21

72