Sarvam AI Open-Sources 30B and 105B Reasoning Models
Sarvam AI has officially open-sourced its Sarvam 30B and Sarvam 105B models, representing a major milestone in India's push for sovereign artificial intelligence infrastructure. Both models were trained from scratch on high-quality datasets curated in-house across all training stages—pre-training, supervised fine-tuning, and reinforcement learning—using compute resources provided under India's IndiaAI mission.
The release, announced on March 6, 2026, positions Sarvam as the first Indian AI company to develop frontier-class reasoning models entirely within the country. The Sarvam 30B powers Samvaad, the company's conversational agent platform, while the Sarvam 105B drives Indus, an AI assistant designed for complex reasoning and agentic workflows. According to Sarvam, both models achieve state-of-the-art results on Indian language benchmarks and outperform significantly larger models.
Architecturally, both models use a Mixture-of-Experts (MoE) Transformer backbone with sparse expert routing: each token is processed by only a small subset of the experts, so total parameter count can grow without a proportional increase in compute per token. The Sarvam 30B uses Grouped Query Attention (GQA), in which several query heads share each key/value head, to reduce KV-cache memory while maintaining performance. The Sarvam 105B adopts Multi-head Latent Attention (MLA), a different scheme that compresses keys and values into a compact latent representation, for optimized long-context inference. The 105B model features a 128K token context window, enabling it to process extensive documents and multi-step reasoning chains without truncation.
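To make sparse expert routing concrete, here is a minimal sketch of top-k gating in plain Python. It is illustrative only: the function names (`route_top_k`, `moe_forward`), the toy scalar "experts", and the choice of k are assumptions made for this example, not details of Sarvam's router, which the announcement does not describe.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(xs: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of router logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits: Sequence[float], k: int) -> List[Tuple[int, float]]:
    """Select the k highest-scoring experts and renormalise their gate weights.

    Returns (expert_index, gate_weight) pairs; the weights sum to 1.
    """
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(
    x: float,
    router_logits: Sequence[float],
    experts: Sequence[Callable[[float], float]],
    k: int,
) -> float:
    """Run only the selected experts and mix their outputs by gate weight.

    Experts that are not selected are never called, which is where the
    per-token compute saving comes from.
    """
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))


if __name__ == "__main__":
    # Hypothetical toy experts: each one just scales its input.
    toy_experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
    # Hypothetical router scores for one token; in a real model these come
    # from a learned linear layer applied to the token's hidden state.
    logits = [0.1, 2.0, -1.0, 2.0]
    print(route_top_k(logits, k=2))
    print(moe_forward(1.0, logits, toy_experts, k=2))
```

With four experts and k=2, only half of the expert parameters are exercised for a given token, even though all four contribute to the model's total parameter count.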
Training data spans 16 trillion tokens for the 30B model and 12 trillion tokens for the 105B, covering code, general web content, specialized knowledge corpora, mathematics, and multilingual content. Crucially, Sarvam developed all components of the training pipeline in-house, including data curation, synthetic data generation, and reinforcement learning infrastructure, ensuring full control over data quality and training dynamics.
Availability is a key focus of the release. Both models are accessible via Sarvam's API dashboard, with weights available for download on AI Kosh and Hugging Face. The models are released under the Apache 2.0 license, which permits commercial use, fine-tuning, and redistribution, subject to the license's attribution and notice conditions. Sarvam emphasizes that these models are "globally competitive for their class," with the 105B performing strongly on reasoning, programming, and agentic tasks across standard benchmarks.
India's AI landscape has long been dominated by Western and Chinese models, but Sarvam's approach represents a deliberate shift toward indigenous development. As noted in a Sarvam blog post, the company rejected the common practice of fine-tuning Western models for Indian languages, instead integrating Indian language data directly into the pretraining corpus. This strategy has yielded models that excel in multilingual contexts, particularly for India's 22+ languages.
The release aligns with India's broader AI strategy, including the IndiaAI mission that provided infrastructure support. Sarvam collaborated with data center operator Yotta and received technical support from Nvidia, though the training was conducted entirely on domestic GPU clusters. This full-stack development—from data to deployment—marks a significant capability leap for Indian AI, moving beyond incremental improvements to foundational model building.
For developers, the models offer practical deployment advantages. Sarvam optimized tokenization, model architecture, execution kernels, and inference systems to enable efficient operation across diverse hardware, from high-end GPUs to personal laptops. The company has already integrated the 30B model into production for real-time conversational applications, while the 105B targets complex reasoning tasks requiring extended context.
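As a rough guide to what "from high-end GPUs to personal laptops" implies, the sketch below estimates inference memory from first principles: weight storage at a given precision, plus KV-cache size for a GQA-style attention layout. Only the nominal 30B and 105B parameter counts come from the announcement. The layer count, head counts, head dimension, and sequence length are hypothetical placeholders, not Sarvam's published configuration, and the KV-cache formula covers standard multi-head and grouped-query attention only, not MLA.

```python
def weight_bytes(n_params: float, bits_per_param: int) -> float:
    """Bytes needed to hold the weights alone at a given precision."""
    return n_params * bits_per_param / 8


def kv_cache_bytes(
    n_layers: int,
    n_kv_heads: int,
    head_dim: int,
    seq_len: int,
    bytes_per_value: int = 2,  # 2 bytes = fp16/bf16 cache entries
    batch: int = 1,
) -> int:
    """KV-cache size for multi-head or grouped-query attention.

    The leading factor of 2 accounts for storing both keys and values.
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value * batch


def to_gib(n_bytes: float) -> float:
    """Convert bytes to GiB for readability."""
    return n_bytes / 2**30


if __name__ == "__main__":
    # Nominal total parameter counts taken from the model names.
    for name, n_params in (("Sarvam 30B", 30e9), ("Sarvam 105B", 105e9)):
        for bits in (16, 8, 4):
            gib = to_gib(weight_bytes(n_params, bits))
            print(f"{name} weights at {bits}-bit: {gib:.1f} GiB")

    # Hypothetical attention configuration, for illustration only.
    layers, q_heads, kv_heads, dim, tokens = 48, 32, 4, 128, 32_768
    mha = kv_cache_bytes(layers, q_heads, dim, tokens)   # one KV head per query head
    gqa = kv_cache_bytes(layers, kv_heads, dim, tokens)  # query heads share KV heads
    print(f"KV cache, MHA layout: {to_gib(mha):.1f} GiB")
    print(f"KV cache, GQA layout: {to_gib(gqa):.1f} GiB")
```

One caveat on the design: in an MoE model, sparse routing reduces compute per token, but all expert weights typically still need to be resident in memory, so the weight estimate uses the total parameter count, not the active one.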
Sarvam's approach contrasts with global competitors focusing on raw parameter count. Co-founder Pratyush Kumar emphasized their measured scaling philosophy: "We want to be mindful in how we do the scaling. We don't want to do the scaling mindlessly. We want to understand the tasks which really matter at scale and go and build for them." This strategy positions Sarvam to develop specialized models for coding, agentic workflows, and multimodal conversational tasks, building on the foundation established by these open-source releases.
Artūras Malašauskas is an AI Systems Integrator with 20+ years of production-grade web engineering experience. He has designed, shipped, and scaled enterprise Python/PHP systems for logistics, SaaS, and public-sector clients. For the past year, he has focused exclusively on AI integrations: deploying open-source LLMs, building generative media pipelines (image, audio, video), and engineering multi-agent workflows for real production environments. His standard: reproducibility, security, and cost-efficient inference, with no vaporware. He documents and evaluates emerging AI tooling, separating verified capabilities from marketing noise. Technical editor at: muza-ai.eu, ai-verslas.lt, ai-naujinos.lt