Jul 15, 2025 | Read time 4 min

Speechmatics launches world’s first bilingual voice AI models for Southeast Asia

Voice AI systems built for the region’s real world of code-switching and multilingual communication.
Yahia Abaza, Senior Product Manager

Here's the thing about conversations in Southeast Asia: they rarely stay in one language. Take Singapore, for example - someone calling into a contact center might start in Malay and pivot to English for a technical explanation, all the while mixing in words and phrases from Mandarin.

Or consider an emergency dispatch center, which needs to handle callers naturally switching between Tamil and English in high-pressure situations.

This linguistic fluidity is how millions of people in the region communicate. And until now, voice AI has been spectacularly bad at keeping up with this reality.

The bilingual breakthrough

The industry has only supported these languages individually, and without the region-specific subtleties that are needed for real-world Southeast Asian conversations.

Speechmatics set out to solve this with the world's first bilingual models specifically tailored for the region. Our three new models (Mandarin-English, Malay-English, and Tamil-English) deliver breakthrough performance in real-world scenarios.

The results: more than 60% improvement for Singaporean English and 15% improvement in code-switching scenarios compared to the nearest competitor.
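For readers who want to sanity-check figures like these on their own audio, here is a minimal sketch. It assumes the improvements are measured as relative reductions in word error rate (WER), which this post does not state explicitly. The function names and sample numbers are illustrative and are not part of any Speechmatics tooling.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Classic Levenshtein dynamic programme, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer is lower than baseline_wer."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical numbers chosen only to illustrate the arithmetic.
    print(f"{relative_improvement(0.20, 0.08):.0%}")
```

Under this reading, a model that drops WER from 20% to 8% on the same test set would count as a 60% improvement. Note that whitespace splitting only suits space-delimited languages; Mandarin transcripts are usually scored per character instead.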

Why general-purpose models trade accuracy for scale

The problem with general-purpose multilingual models? They may support a multitude of languages but they treat each language as a separate entity, struggling when speakers blend them naturally.

We took a different approach. Working with local partners, we trained entirely new models on region-specific datasets that capture how people actually speak across the region.

The breakthrough lies in understanding code-switching as natural communication rather than an error to be corrected.

When someone says that “other models 纸面上看起来不错, but in the real world 他们跟不上” (roughly, “other models look good on paper, but in the real world they can't keep up”), our AI follows the conversation seamlessly.

The performance results speak for themselves:

  • 60%+ improvement for Singaporean English

  • 15% better accuracy in code-switching scenarios compared to the nearest competitor

  • Enhanced baseline performance for Malay and Tamil

  • Regional context awareness for Southeast Asian English variants

The result is voice AI that maintains high accuracy precisely because it understands how these language pairs work together.

Our specialized approach also delivers real-time transcription capabilities that general-purpose models, designed for batch processing, simply can't match.

There's no free lunch in machine learning. By focusing on specific language pairs and regional patterns, we achieve accuracy levels that broader models sacrifice for coverage.

Industries ready for change

We've been testing these models with select preview partners across emergency services, call centers, and law enforcement.

Early results show faster resolution times, fewer transcription errors, and improved customer satisfaction. A detailed case study is coming soon.

Building on our proven Spanish-English bilingual model, the Mandarin-English, Malay-English, and Tamil-English models are ready for deployment across key industries:

  • Emergency Services: Accurate transcription regardless of language switches

  • Contact Centers: Natural agent communication without quality loss

  • Media & Broadcasting: Real-time multilingual content production

  • Government: Inclusive citizen engagement across languages

For enterprise deployment, we offer on-premises options for strict data governance, HIPAA compliance for healthcare applications, and zero data retention for sensitive environments.

Available now

The Southeast Asia bilingual models are live on Speechmatics' platform today, with full API documentation and enterprise deployment support.

When voice AI starts appreciating local context, users feel much better understood, and agent workflows become much more efficient.

Ready to see what that looks like? Speak to our team today.
