Poolside Launches Open-Weight Laguna XS.2 for Local Agentic Coding
San Francisco-based AI lab Poolside has released two new foundation models focused on agentic coding, with the smaller Laguna XS.2 available as open weights under an Apache 2.0 license. The announcement marks the company's first public model release after years of serving government and public sector clients in high-security environments.
According to the official blog post, Laguna XS.2 contains 33 billion total parameters, of which 3 billion are activated, and it runs on a single GPU. This makes it accessible for local deployment without requiring cloud infrastructure or internet connectivity. The proprietary Laguna M.1, by contrast, is a 225B parameter Mixture of Experts model with 23B active parameters designed for enterprise and government workloads.
Both models are temporarily free to use via Poolside's API and through OpenRouter. The XS.2 weights are downloadable from Hugging Face starting today. This open-weight release represents a strategic shift for Poolside, which has historically operated behind closed doors with classified and public sector contracts.
The practical details of running XS.2 locally matter. Developers can run the model through Ollama with native MLX support on Mac systems, or deploy it on Linux machines with compatible GPU hardware. There's no waiting on a remote API, no network latency, and no data leaving your machine. The terminal just responds when you type commands (which is how it should work, honestly).
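As a rough illustration of what local use through Ollama might look like, the sketch below builds a request for Ollama's local REST endpoint using only the Python standard library. The model tag `laguna-xs.2` is a hypothetical placeholder, not a confirmed name, and the endpoint assumes Ollama's default port. The sampling options mirror the values the article reports for Poolside's benchmark runs.

```python
import json
import urllib.request

# Assumption: Ollama's default local endpoint; adjust if your server listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"

# Hypothetical model tag: check `ollama list` for the name actually installed.
MODEL_TAG = "laguna-xs.2"


def build_generate_request(prompt: str, model: str = MODEL_TAG) -> urllib.request.Request:
    """Build (but do not send) a non-streaming generate request for a local Ollama server."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        # Sampling values reported for Poolside's benchmark runs.
        "options": {"temperature": 0.7, "top_k": 20},
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str) -> str:
    """Send the request to the local server and return the generated text."""
    with urllib.request.urlopen(build_generate_request(prompt)) as resp:
        return json.loads(resp.read())["response"]
```

With an Ollama server running and the model pulled, calling `generate("Write a shell one-liner that counts lines in all .py files.")` would return the completion as a string. Nothing in the request leaves localhost.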
Poolside's technical documentation says the models were trained from scratch on 30 trillion tokens using the company's internal "Model Factory" infrastructure. Training for M.1 used 6,144 interconnected NVIDIA Hopper GPUs, and XS.2 incorporates architectural improvements learned during that process. The Muon optimizer used in training reportedly accelerates learning by approximately 15% compared to standard methods.
Data curation employed AutoMixer, a system that uses sixty proxy models to test different combinations of code, mathematics, and web data. About 13% of the training corpus consists of synthetic data: artificially generated practice material designed to teach skills that are difficult to find in public datasets. Training from scratch sets Poolside apart from competitors that fine-tune existing base models from Chinese labs, such as Alibaba's Qwen series.
Benchmark performance shows XS.2 scoring 44.5% on SWE-bench Pro and 30.1% on Terminal-Bench 2.0. These metrics place it competitively against models several times its size. M.1 achieves 46.9% on SWE-bench Pro and 40.7% on Terminal-Bench 2.0, though it requires substantially more compute resources to operate.
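To put "several times its size" in numbers, the short sketch below compares the two Laguna models using only the figures quoted in this article. The dictionary layout and function name are illustrative, and nothing here is an independent measurement.

```python
# Figures quoted in the article; none of them are independently measured here.
MODELS = {
    "Laguna XS.2": {"total_b": 33, "active_b": 3,
                    "swe_bench_pro": 44.5, "terminal_bench_2": 30.1},
    "Laguna M.1": {"total_b": 225, "active_b": 23,
                   "swe_bench_pro": 46.9, "terminal_bench_2": 40.7},
}


def compare(small: str, large: str) -> dict:
    """Return size ratios and benchmark gaps (percentage points) between two models."""
    s, l = MODELS[small], MODELS[large]
    return {
        "total_param_ratio": round(l["total_b"] / s["total_b"], 1),
        "active_param_ratio": round(l["active_b"] / s["active_b"], 1),
        "swe_bench_pro_gap": round(l["swe_bench_pro"] - s["swe_bench_pro"], 1),
        "terminal_bench_2_gap": round(l["terminal_bench_2"] - s["terminal_bench_2"], 1),
    }


summary = compare("Laguna XS.2", "Laguna M.1")
```

On these numbers, M.1 is roughly 6.8 times larger in total parameters and leads by 2.4 points on SWE-bench Pro, while the Terminal-Bench 2.0 gap is wider at 10.6 points.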
Independent reporting from VentureBeat confirms the launch timeline and licensing terms. The outlet notes Poolside's positioning against proprietary models from Anthropic, OpenAI, and Google, emphasizing the open-weight advantage for developers who need to inspect, modify, or deploy models in isolated environments.
Two accompanying products entered preview alongside the models: "pool," a terminal-based coding agent, and "Shimmer," a cloud development environment for iterating on web applications and APIs. The agent harness uses an Agent Client Protocol server that Poolside also employs for reinforcement learning training and evaluation.
Post-training engineer George Grigorev explained on X that government agencies value Poolside's ability to ship weights into fully isolated, on-premises environments. This capability addresses security requirements that cloud-based proprietary models cannot satisfy. The trade-off is clear: you get full control and offline operation, but you also manage your own infrastructure.
The Apache 2.0 license grants commercial use rights without requiring that modifications be shared, although redistributors must keep the license text and attribution notices intact. This differs from copyleft licenses, which mandate that derivative works remain open. Poolside explicitly stated they want to see what the community builds with XS.2, suggesting they expect forks, quantizations, and specialized fine-tunes.
Poolside used the same sampling parameters, temperature=0.7 and top_k=20, across all benchmark runs. The Laude Institute's Harbor Framework evaluated performance using sandboxed execution with 8 GB RAM and 2 CPUs for most tests. Terminal-Bench 2.0 required 48 GB RAM and 32 CPUs, reflecting the heavier computational demands of terminal-based agent tasks.
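For readers unfamiliar with what those two sampling parameters do, here is a minimal, generic sketch of temperature scaling combined with top-k filtering. It illustrates the general technique only. It is not Poolside's inference code, and the function name and list-of-floats input are assumptions chosen for readability.

```python
import math
import random


def sample_top_k(logits, temperature=0.7, top_k=20, rng=random):
    """Sample a token index using temperature scaling and top-k filtering."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Keep only the indices of the top_k highest-scoring tokens.
    k = min(top_k, len(logits))
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # A temperature below 1 sharpens the distribution toward the best candidates.
    scaled = [logits[i] / temperature for i in top]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]
    # random.choices normalizes the weights internally.
    return rng.choices(top, weights=weights, k=1)[0]
```

With top_k=20, tokens outside the twenty most likely candidates can never be chosen, and a temperature of 0.7 tilts the remaining probability mass further toward the leaders.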
Poolside's Applied Research organization, approximately 60 people across infrastructure, architecture, data, pre-training, and reinforcement learning, built these models. The team emphasized that creating software represents the core skill through which other agent capabilities express themselves. An agent that can write and execute code can compose actions, parallelize work, and build ad-hoc systems.
Current limitations include the temporary nature of free API access and the substantial hardware requirements for M.1. XS.2's single-GPU operation makes it practical for individual developers, but the 33B parameter count still demands a modern GPU with sufficient VRAM. Running it on older hardware means either quantization, which can cost some accuracy, or cloud deployment, which undercuts the privacy advantage.
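A back-of-the-envelope estimate shows why VRAM is the constraint. The sketch below computes memory for the weights alone at a few common precisions. The bytes-per-parameter values are general rules of thumb rather than Poolside figures, and real deployments also need room for the KV cache, activations, and quantization overhead.

```python
PARAMS = 33e9  # total parameters reported for Laguna XS.2

# Nominal bytes per parameter for common weight formats (rule of thumb, not Poolside data).
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return params * bytes_per_param / 1e9


estimates = {fmt: weight_memory_gb(PARAMS, b) for fmt, b in BYTES_PER_PARAM.items()}
# estimates == {'bf16': 66.0, 'int8': 33.0, 'int4': 16.5}
```

All 33B parameters must be resident even though only 3B are activated at a time, so the total count, not the active count, drives these figures.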
Whether the open-weight release attracts meaningful community contribution remains uncertain. Many open models languish after initial downloads, with few developers actually building production systems on top of them. The Apache 2.0 license helps, but ecosystem momentum requires more than permissive licensing alone.
Poolside's government background raises questions about future model capabilities. If M.1 represents their current frontier, what improvements might come from continued private-sector work? The company indicated plans to iterate on the Laguna family, suggesting XS.2 is just the beginning of their public-facing model lineup.
For developers evaluating the release, the decision tree is straightforward. If you need maximum capability and have enterprise resources, M.1 via API makes sense. If you need local deployment, privacy, or the ability to modify weights, XS.2 is the choice. If you're building an agent system that requires long-horizon planning, both models offer capabilities beyond simple code generation.
The market response will determine whether Poolside's open-weight strategy gains traction. Competitors like Meta with Llama and Google with Gemma have established open ecosystems. Poolside enters this space with a narrower focus on agentic coding, which could be either a strength or a limitation depending on developer needs.
Whether users actually pay for the API once the free period ends remains the real question.
Artūras Malašauskas is an AI Systems Integrator with 20+ years of production-grade web engineering experience. He has designed, shipped, and scaled enterprise Python/PHP systems for logistics, SaaS, and public-sector clients. For the past year, he has focused exclusively on AI integrations: deploying open-source LLMs, building generative media pipelines (image, audio, video), and engineering multi-agent workflows for real production environments. His standard: reproducibility, security, cost-efficient inference; no vaporware. He documents and evaluates emerging AI tooling, separating verified capabilities from marketing noise. Technical editor at: muza-ai.eu, ai-verslas.lt, ai-naujinos.lt