DeepSeek V4 Targets Huawei Chips in Open-Source Push

By Artūras Malašauskas, Apr 25, 2026
DeepSeek's new V4 model expands context to 1 million tokens while adding native support for Huawei Ascend processors, challenging Nvidia's AI chip dominance.

Chinese AI startup DeepSeek released the preview version of its latest model, DeepSeek-V4, on April 24, 2026. The company simultaneously open-sourced the model, marking 15 months since the launch of its previous-generation DeepSeek-V3. This release continues DeepSeek's reputation for strong cost-performance while adding support for domestically developed chips, including Huawei's Ascend processors.

The move is being watched closely as a strategic play to strengthen China's independent AI computing ecosystem. It also directly challenges Nvidia's dominance in the AI chip sector. According to reporting from CGTN, Huawei later announced that its full Ascend supernode lineup now supports the DeepSeek V4 series. Ascend A2, A3 and 950 products are compatible with both DeepSeek V4-Flash and DeepSeek V4-Pro models.

DeepSeek released two versions on Friday. The first is a smaller 284 billion parameter Flash mixture-of-experts model with 13 billion active parameters. The larger variant, V4-Pro, is a 1.6 trillion parameter model with 49 billion parameters active for any given token. This architecture choice matters for developers who need to balance performance against infrastructure costs: per-token compute scales with the active parameters, while memory requirements scale with the total.
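To put those parameter counts in perspective, here is a minimal Python sketch that works out what share of each model's weights is active per token. The figures are the ones quoted above; the function and dictionary names are illustrative only.

```python
# Parameter counts as reported in the article, in billions.
MODELS = {
    "V4-Flash": {"total_b": 284, "active_b": 13},
    "V4-Pro": {"total_b": 1600, "active_b": 49},
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of a mixture-of-experts model's parameters used per token."""
    return active_b / total_b


for name, params in MODELS.items():
    frac = active_fraction(params["total_b"], params["active_b"])
    print(f"{name}: {frac:.1%} of parameters active per token")
```

Under these numbers, both models touch less than 5% of their weights for each token, which is where the cost advantage of the mixture-of-experts design comes from.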

The V4 series expands context length from 128K tokens to 1 million tokens. That's roughly an eightfold increase in context capacity. It enables more advanced long-context tasks that were previously impractical for open-source models. (A million tokens is roughly equivalent to 750,000 words, which is a lot of text to hold in memory.)
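The arithmetic behind the context jump is easy to check. This sketch assumes "128K" means 128,000 tokens (it could also mean 131,072) and uses the 0.75 words-per-token rule of thumb implied by the 750,000-word figure.

```python
OLD_CONTEXT_TOKENS = 128_000    # assumption: "128K" read as 128,000
NEW_CONTEXT_TOKENS = 1_000_000
WORDS_PER_TOKEN = 0.75          # rule of thumb, not an exact conversion

# How many times larger the new context window is.
growth = NEW_CONTEXT_TOKENS / OLD_CONTEXT_TOKENS

# Approximate English word count that fits in the new window.
approx_words = NEW_CONTEXT_TOKENS * WORDS_PER_TOKEN

print(f"Context growth: {growth:.1f}x")
print(f"Approximate capacity: {approx_words:,.0f} words")
```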

Under the hood, DeepSeek V4 introduces several novel architectural changes. The company's researchers describe a hybrid attention mechanism that combines Compressed Sparse Attention and Heavy Compressed Attention. This reduces the compute required during inference and the memory required by KV caches used to track model state.

These caches can be quite large in practice. Inference providers tend to offload them to system memory or flash storage to avoid cold start penalties. More heavily compressed KV caches mean less memory and storage are required for large-scale inference deployments. The result: the model supports a million-token context window while using 9.5x to 13.7x less memory than DeepSeek V3.2.
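To get a feel for what a 9.5x to 13.7x reduction means, the sketch below applies that range to a KV cache of a given size. The 100 GB baseline is purely hypothetical, chosen for round numbers; it is not a published figure for DeepSeek V3.2.

```python
# Reduction factors quoted in the article.
MIN_REDUCTION = 9.5
MAX_REDUCTION = 13.7


def compressed_cache_range_gb(baseline_gb: float) -> tuple[float, float]:
    """Return (best case, worst case) cache size in GB after compression."""
    best = baseline_gb / MAX_REDUCTION
    worst = baseline_gb / MIN_REDUCTION
    return best, worst


# Hypothetical baseline: a 100 GB KV cache on the older model.
best, worst = compressed_cache_range_gb(100.0)
print(f"100 GB baseline shrinks to {best:.1f}-{worst:.1f} GB")
```

Under that assumed baseline, the cache drops to somewhere between about 7 GB and 11 GB, small enough to change where a provider can afford to keep it.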

DeepSeek is continuing its tradition of using lower precision datatypes. Both V4 models use a mixture of FP8 and FP4 precision. The model developers used quantization-aware training for the MoE expert weights. FP4 effectively halves the memory required to store model weights compared to FP8. That's a significant saving, if you can stomach the loss of precision.
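The halving is straightforward to verify. This rough sketch estimates weight storage for both models at a uniform 8 or 4 bits per parameter. That is a simplification: the article says V4 mixes FP8 and FP4, and the estimate ignores quantization scale factors and other overhead, so treat the results as upper and lower bounds rather than real checkpoint sizes.

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), ignoring overhead."""
    total_bytes = params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


# Total parameter counts from the article, in billions.
for name, params_b in [("V4-Flash", 284), ("V4-Pro", 1600)]:
    fp8 = weight_memory_gb(params_b, 8)
    fp4 = weight_memory_gb(params_b, 4)
    print(f"{name}: about {fp8:,.0f} GB at FP8, about {fp4:,.0f} GB at FP4")
```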

According to The Register, DeepSeek adopted an optimizer called Muon for V4. Muon is designed to speed up convergence and improve training stability. The company claims V4-Pro was trained on 33 trillion tokens.

Performance claims should be taken with a grain of salt. DeepSeek has had a strong track record with its V3 and R1 families of models. But just because a model performs well in canned benchmarks doesn't mean it'll hold up in real-world applications.

DeepSeek's tech report says V4 "falls marginally short of GPT-5.4 and Gemini 3.1 Pro, suggesting a developmental trajectory that trails state-of-the-art frontier models by approximately three to six months." The startup claims its model beats all other open-source models in agentic coding and reasoning.

Pricing is where DeepSeek gets interesting. The V4-Pro model will cost $3.48 for 1 million tokens of output. By comparison, OpenAI and Anthropic charge $30 and $25 respectively for the same amount of work. Even Kimi, from fellow Chinese AI startup Moonshot AI, costs $4 for a million tokens of output.

DeepSeek's V4-Flash costs even less, at just $0.28 for a million tokens. That pricing puts DeepSeek at odds with a trend across the wider AI sector. Both OpenAI and Anthropic have hiked prices and imposed rate limits to manage surging demand. Chinese developers have followed suit, also increasing prices and removing subscriptions that offered unlimited usage.
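For a sense of scale, this sketch compares what a fixed volume of output tokens would cost at the prices quoted above. Two assumptions to flag: the V4-Flash figure is treated as an output price like the others, and the 50 million token workload is an arbitrary example. Input-token charges are not modeled.

```python
# Dollars per 1 million output tokens, as quoted in the article.
PRICE_PER_MILLION = {
    "DeepSeek V4-Flash": 0.28,  # assumed to be an output price
    "DeepSeek V4-Pro": 3.48,
    "Kimi (Moonshot AI)": 4.00,
    "Anthropic": 25.00,
    "OpenAI": 30.00,
}


def output_cost(provider: str, tokens: int) -> float:
    """Cost in dollars to generate the given number of output tokens."""
    return PRICE_PER_MILLION[provider] * tokens / 1_000_000


WORKLOAD_TOKENS = 50_000_000  # hypothetical monthly output volume
for provider in PRICE_PER_MILLION:
    cost = output_cost(provider, WORKLOAD_TOKENS)
    print(f"{provider}: ${cost:,.2f}")
```

At these list prices, the same workload costs roughly 8.6 times more on OpenAI than on V4-Pro.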

DeepSeek's prices could get even cheaper. The company expects to lower V4-Pro prices later in the year as Huawei scales up production of its new Ascend 950 AI processors. This creates a feedback loop: more chip production drives down costs, which drives more adoption, which justifies more production.

Market reactions were immediate. Shares in Semiconductor Manufacturing International Corp. jumped 10% in Hong Kong trading. That Chinese chipmaker makes Huawei's Ascend AI processors, which DeepSeek said it used to train its new model. Meanwhile, shares in MiniMax and Knowledge Atlas, two of DeepSeek's competitors, sank by more than 9%.

DeepSeek first captured global attention in December 2024, when it released its V3 large language model. The startup claimed it trained V3 on just $5.6 million worth of processors. AI researcher Andrej Karpathy called it a "joke of a budget." DeepSeek later released R1, a reasoning model that matched the equivalent offering from OpenAI.

That sparked a massive selloff in U.S. tech stocks. Investors repriced how much money was needed to train and run AI models. At one point, tech stocks lost $1 trillion in value. While markets eventually recovered, DeepSeek's decision to release its model on an open-source basis ended up being more significant.

The startup's release built on momentum started by Alibaba's Qwen. It inspired several other Chinese labs to release their own open models. This created a competitive dynamic that's reshaping the global AI landscape.

For developers, the practical upshot of DeepSeek V4 is faster load times and less infrastructure overhead. The compressed attention mechanism shrinks the memory footprint of long-context inference, which reduces the friction of deploying large models at scale. That lowers the hardware bar for running it, although models of this size still need substantial memory for their weights.

Whether users actually pay for it remains the real question. The model is available for download from popular model repositories such as Hugging Face, and through the company's API and web service. Open-source distribution means anyone can audit the code, but it also means anyone can copy it.

DeepSeek's strategy hinges on volume. Lower prices attract more users, which generates more data, which improves the model, which attracts more users. It's a flywheel that only works if the model is actually good enough to justify switching from established providers.

The Ascend chip support is the wildcard here. If Huawei can scale production reliably, DeepSeek gains a competitive moat against U.S. models that depend on Nvidia hardware. If production stumbles, the whole value proposition weakens. Time will tell if the hardware can keep up with the software promises.

Artūras Malašauskas is an AI Systems Integrator with 20+ years of production-grade web engineering experience. He has designed, shipped, and scaled enterprise Python/PHP systems for logistics, SaaS, and public-sector clients. For the past year, he has focused exclusively on AI integrations: deploying open-source LLMs, building generative media pipelines (image, audio, video), and engineering multi-agent workflows for real production environments. His standard: reproducibility, security, and cost-efficient inference, with no vaporware. He documents and evaluates emerging AI tooling, separating verified capabilities from marketing noise. Technical editor at: muza-ai.eu, ai-verslas.lt, ai-naujinos.lt