Microsoft evaluates DeepSeek V3 for Copilot to cut agent costs, will offer cheaper tier within weeks
Microsoft is evaluating a self-hosted version of DeepSeek V3 to power Copilot Cowork as agent costs spiral. The company plans to launch a lower-cost tier within weeks while moving to usage-based pricing, charging enterprises for actual compute consumed rather than flat fees.
Microsoft is exploring a self-hosted, fine-tuned version of DeepSeek V3, or another open-source model, to power Copilot Cowork, the agentic assistant in its Microsoft 365 suite, the company told Axios. It expects to offer a lower-cost model tier within weeks.
At the same time, Microsoft is shifting Copilot Cowork to usage-based pricing, charging companies for the compute they actually consume rather than a flat subscription fee.
Agent economics force the shift
Agentic AI tools like Copilot Cowork call language models repeatedly as they work through tasks, creating costs that scale rapidly with usage.
"We have users who do hundreds of tasks a week, which is great, they're way productive, but the consequence is the costs can go very high," said Charles Lamanna, Microsoft's executive vice president for Copilot, agents and platform.
Copilot Cowork currently runs on Anthropic and OpenAI models. Both providers have raised prices and moved away from unlimited usage plans. Microsoft previously metered GitHub Copilot for similar cost reasons.
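The scaling described in this section can be sketched as simple arithmetic. The example below is a rough illustration only, not Microsoft's billing method: it assumes cost is metered per token, and every name and number in it (the `ModelPricing` class, the prices, the call and token counts) is hypothetical and chosen purely to show how repeated model calls multiply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPricing:
    """Hypothetical prices in US dollars per million tokens."""
    input_per_million: float
    output_per_million: float


def task_cost(pricing: ModelPricing, calls_per_task: int,
              input_tokens_per_call: int, output_tokens_per_call: int) -> float:
    """Cost of one agent task that calls the model several times."""
    per_call = (
        input_tokens_per_call * pricing.input_per_million
        + output_tokens_per_call * pricing.output_per_million
    ) / 1_000_000
    return calls_per_task * per_call


def weekly_cost(pricing: ModelPricing, tasks_per_week: int, calls_per_task: int,
                input_tokens_per_call: int, output_tokens_per_call: int) -> float:
    """Cost of one user's tasks for a week; grows linearly with task count."""
    return tasks_per_week * task_cost(
        pricing, calls_per_task, input_tokens_per_call, output_tokens_per_call
    )


if __name__ == "__main__":
    # Made-up price points: a pricier model and one that is 10x cheaper.
    pricier = ModelPricing(input_per_million=2.0, output_per_million=10.0)
    cheaper = ModelPricing(input_per_million=0.2, output_per_million=1.0)

    # Made-up workload: 200 tasks a week, 20 model calls per task.
    for name, pricing in (("pricier", pricier), ("cheaper", cheaper)):
        cost = weekly_cost(pricing, tasks_per_week=200, calls_per_task=20,
                           input_tokens_per_call=5_000, output_tokens_per_call=500)
        print(f"{name}: ${cost:.2f} per user per week")
```

Under a flat subscription, a heavy user's weekly cost in this sketch can exceed whatever the seat fee brings in. Metering by usage and switching to a cheaper model are the two levers the article describes.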
DeepSeek offers inference cost advantage
DeepSeek V3, released in December 2024, costs significantly less to run at inference than frontier models while remaining competitive on performance. The model is open-source and popular with developers looking for cost-effective options.
Microsoft says any DeepSeek deployment would be optional for customers and fully hosted on Azure, keeping data inside Microsoft's cloud infrastructure and under its security and compliance controls. The company says it has fine-tuned the model and added safeguards, including bias-reduction measures.
Political complications
The timing presents political challenges. Washington has discussed banning DeepSeek, sanctioned Chinese AI firms, and recently forced Anthropic to restrict its top models for non-US users in a dispute that required Commerce Department intervention.
Multi-model strategy emerges
The evaluation signals Microsoft's broader shift toward a multi-model approach, reducing dependence on any single AI lab. This marks a strategic change from its exclusive, often tense relationship with OpenAI.
Microsoft emphasized this remains an evaluation, not a final decision. The company will confirm its model selection when the cheaper tier launches.
What this means
The economics of agentic AI are forcing even Microsoft to reconsider its infrastructure choices. When a company with deep pockets and close ties to both OpenAI and Anthropic considers a Chinese open-source model for cost reasons, it shows how unsustainable current agent pricing has become at scale. Microsoft's willingness to publicly name DeepSeek as a candidate, despite the political environment, underscores the financial pressure. The shift to usage-based pricing and cheaper model options will likely accelerate across the industry as agents move from demos to production workloads.