Microsoft AI has released three new foundational models - MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2 - available through its Foundry platform. The models handle speech transcription, voice generation, and image creation respectively, and are priced to undercut both OpenAI and Google.
Three Models, One Platform
Microsoft AI announced the release of three foundational models on its Foundry platform this week: MAI-Transcribe-1 for speech-to-text transcription, MAI-Voice-1 for audio generation, and MAI-Image-2 for image creation. The models were developed by the MAI Superintelligence team, a research group formed in November 2025 under the leadership of Mustafa Suleyman, CEO of Microsoft AI and co-founder of DeepMind.
MAI-Transcribe-1 handles speech transcription across 25 languages and is 2.5 times faster than Microsoft's previous Azure Fast transcription service. MAI-Voice-1 generates audio - including custom voices - and can produce 60 seconds of speech in a single second. MAI-Image-2, originally released on MAI Playground in March, creates images from text prompts and is now also available through Foundry.
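To put the speed figures in concrete terms, here is a rough back-of-the-envelope sketch in Python. It assumes the quoted ratios (60 seconds of speech per second of generation, and a 2.5x speedup over the previous service) hold as constant factors at any job size, which the announcement does not confirm. The function names and the sample job sizes are illustrative only.

```python
# Ratios quoted in the article; treating them as constant is an assumption.
VOICE_AUDIO_SECONDS_PER_SECOND = 60   # MAI-Voice-1: 60 s of speech per second
TRANSCRIBE_SPEEDUP = 2.5              # MAI-Transcribe-1 vs. the previous Azure Fast service


def voice_generation_seconds(audio_seconds: float) -> float:
    """Estimated wall-clock time to generate `audio_seconds` of speech."""
    return audio_seconds / VOICE_AUDIO_SECONDS_PER_SECOND


def transcription_seconds(previous_service_seconds: float) -> float:
    """Estimated time for a job that took `previous_service_seconds` on the older service."""
    return previous_service_seconds / TRANSCRIBE_SPEEDUP


# Hypothetical jobs: one hour of generated speech, and a transcription job
# that previously took ten minutes.
print(voice_generation_seconds(3600))  # 60.0
print(transcription_seconds(600))      # 240.0
```

Under those assumptions, an hour of narration would take about a minute to generate, and a ten-minute transcription job would drop to about four minutes.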
Priced to Undercut OpenAI and Google
Microsoft is explicitly positioning these models as cheaper alternatives to comparable offerings from OpenAI and Google. MAI-Transcribe-1 starts at $0.36 per hour. MAI-Voice-1 starts at $22 per million characters. MAI-Image-2 costs $5 per million tokens for text input and $33 per million tokens for image output.
For developers building applications that rely on transcription, voice synthesis, or image generation, the pricing difference matters. A startup processing thousands of hours of meeting transcriptions per month, or an AI-powered small business tool generating product images at scale, could see meaningful cost savings by switching from OpenAI or Google APIs to Microsoft Foundry.
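As a rough illustration of what those rates mean at scale, the Python sketch below estimates a monthly bill from the published starting prices. The workload figures (hours, characters, token counts) are hypothetical, and because the prices are quoted as "starts at", the result is best read as a lower bound rather than a quote.

```python
from decimal import Decimal

# Starting prices quoted in the article (USD).
TRANSCRIBE_USD_PER_HOUR = Decimal("0.36")
VOICE_USD_PER_MILLION_CHARS = Decimal("22")
IMAGE_USD_PER_MILLION_INPUT_TOKENS = Decimal("5")
IMAGE_USD_PER_MILLION_OUTPUT_TOKENS = Decimal("33")
MILLION = Decimal(1_000_000)


def transcription_cost(hours: int) -> Decimal:
    """Cost of transcribing `hours` of audio with MAI-Transcribe-1."""
    return TRANSCRIBE_USD_PER_HOUR * Decimal(hours)


def voice_cost(characters: int) -> Decimal:
    """Cost of synthesizing `characters` of text with MAI-Voice-1."""
    return VOICE_USD_PER_MILLION_CHARS * Decimal(characters) / MILLION


def image_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Cost of MAI-Image-2 usage, split into text input and image output tokens."""
    return (
        IMAGE_USD_PER_MILLION_INPUT_TOKENS * Decimal(input_tokens)
        + IMAGE_USD_PER_MILLION_OUTPUT_TOKENS * Decimal(output_tokens)
    ) / MILLION


# Hypothetical monthly workload, chosen only for illustration.
total = (
    transcription_cost(5_000)            # 5,000 hours of meetings
    + voice_cost(2_000_000)              # 2 million characters of speech
    + image_cost(1_000_000, 10_000_000)  # 1M input tokens, 10M output tokens
)
print(f"Estimated monthly cost: ${total:.2f}")  # Estimated monthly cost: $2179.00
```

Running the same workload through a competitor's published rates gives a like-for-like comparison. The announcement does not say how many output tokens a single image consumes, so per-image cost would need to be measured.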
Users evaluating AI platforms on price may also want to compare how free AI tools stack up against paid software - particularly for tasks where open-source or freemium alternatives already exist.
The OpenAI Relationship Gets More Complicated
The release adds another layer to Microsoft's already complex relationship with OpenAI. Microsoft has invested more than $13 billion into the AI research lab and hosts OpenAI's models across its product lineup, including Copilot, Azure, and Bing. But a recent renegotiation of the partnership gave Microsoft more latitude to develop and ship competing models.
In a VentureBeat interview, Suleyman reaffirmed the OpenAI partnership while making clear that Microsoft intends to build its own frontier AI capabilities. The company takes the same dual-track approach with hardware - it designs its own AI chips while continuing to purchase from NVIDIA and AMD.
The dynamic mirrors what is happening across the AI industry. As models become commoditized for many standard tasks - transcription, basic image generation, voice synthesis - the strategic value of exclusive partnerships declines. Microsoft appears to be hedging accordingly.
What It Means for Developers and Users
For developers already working within the Microsoft ecosystem, the new models provide native alternatives to third-party APIs. Foundry integration means these models plug directly into Azure infrastructure, which simplifies authentication, billing, and compliance for enterprise customers.
For everyday users, the impact will be indirect but potentially significant. Microsoft has indicated that MAI models will eventually appear in its consumer products. Voice features in Teams, transcription in Word, and image generation in Designer could all benefit from faster and cheaper underlying models.
Developers exploring no-code AI app building may also find new capabilities as Foundry's model catalog expands - particularly for applications that combine voice, text, and image generation in a single workflow.
The Bigger Picture
Microsoft's move reflects a broader shift in the AI industry: the major cloud providers are all building full-stack AI capabilities rather than relying on a single model partner. Google has Gemini. Amazon has its partnership with Anthropic plus its own Nova models. Microsoft now has both OpenAI and its own growing MAI family.
The competitive landscape for AI productivity tools is evolving rapidly. As model costs drop and performance converges, the differentiator increasingly becomes distribution - which platform can put the right model in front of the right user at the right time. With 400 million monthly active Microsoft 365 users, Microsoft's distribution advantage remains formidable.
For a broader view of how the leading AI platforms compare on features and pricing, see our Claude vs ChatGPT comparison for 2026.
Source: TechCrunch · Microsoft AI Blog
Frequently Asked Questions
What are the three new Microsoft AI models?
The three models are MAI-Transcribe-1 for speech-to-text transcription across 25 languages, MAI-Voice-1 for generating custom audio and voices, and MAI-Image-2 for creating images from text prompts. All three are available through Microsoft Foundry, and the transcription and voice models are also accessible via MAI Playground.
How much do the new Microsoft AI models cost?
MAI-Transcribe-1 starts at $0.36 per hour. MAI-Voice-1 starts at $22 per million characters. MAI-Image-2 starts at $5 per million tokens for text input and $33 per million tokens for image output. Microsoft has positioned these prices below comparable offerings from OpenAI and Google.
Who leads Microsoft AI and the MAI Superintelligence team?
Mustafa Suleyman, co-founder of DeepMind and CEO of Microsoft AI, leads the division. The MAI Superintelligence team was formed in November 2025 specifically to develop frontier AI models. Suleyman has described the team's approach as "Humanist AI" - putting practical human communication at the center of model design.
Does this affect Microsoft's partnership with OpenAI?
Microsoft says it remains committed to its OpenAI partnership, which has included more than $13 billion in investment. However, a recent renegotiation of that partnership gave Microsoft more freedom to develop and deploy its own competing models. The company takes a similar dual approach with AI chips - building its own while also purchasing from NVIDIA and AMD.
How does MAI-Image-2 compare to free AI image generators?
MAI-Image-2 is an API-first model aimed at developers and enterprises, not a consumer tool. It is priced per token rather than offered as a free creative tool. Users looking for free options can explore standalone alternatives to Midjourney or experiment with free AI video generation tools that also include image capabilities.
The Bottom Line
Microsoft's release of its own foundational models, while it maintains an OpenAI partnership backed by more than $13 billion in investment, signals how the AI market is fragmenting at the platform level. The company is betting that developers want cheaper, faster alternatives for commodity tasks like transcription and voice generation - and that Microsoft can deliver those without breaking its OpenAI relationship. For everyday users, the direct impact is limited for now. But as these models filter into Microsoft 365, Copilot, and Azure services, the performance and pricing benefits should reach a much wider audience.