Jan 2, 2025

Speechmatics Collaborates With Ambarella to Bring AI-Powered Natural Language Interactions to Edge Applications

Demonstration During CES to Feature Speechmatics’ Flow Conversational-AI Engine Running on Ambarella’s Power-Efficient Edge AI SoCs
Speechmatics Editorial Team

Speechmatics, a world leader in AI-powered speech technology, today announced a partnership with Ambarella (NASDAQ: AMBA), an edge AI semiconductor company.

Speechmatics' technology running on Ambarella’s robust, low-power portfolio of CVflow® AI system-on-chips (SoCs) provides machines with groundbreaking capabilities to process complex speech and visual inputs on the fly. The companies will jointly demonstrate this technology during CES next week, running locally on Ambarella’s AI SoCs, without an internet connection.

By combining Ambarella’s edge AI SoCs, which provide industry-leading AI performance per watt, with Speechmatics’ foundational speech technology, which excels at understanding diverse accents, languages and contexts, users can now experience seamless, natural device interactions, even in environments without internet connectivity.

This collaboration has significant implications for multiple applications, including advanced robotics, autonomous driving, automotive in-cabin systems, smart cities, security and customer service.

For instance, autonomous warehouse robots could combine visual object recognition with natural voice commands, allowing for more efficient and dynamic workflows. Similarly, in customer-facing scenarios, kiosks and smart assistants could respond to both verbal and visual cues to provide a more personalized and engaging experience. Other applications include voice-activated assistants in remote locations, adaptive smart cameras that respond to voice and visual commands, as well as in-vehicle voice commands and verbal feedback.

“Ambarella is at the forefront of edge AI computing innovation,” said Amit Badlani, Director of Generative AI and Robotics at Ambarella. “Our partnership with Speechmatics opens a new world of possibilities for natural language understanding at the edge.”

“Speechmatics’ conversational AI product, Flow, supports a wide range of speech-to-speech deployments, from on-camera to robotics and larger on-premise deployments in smart city use cases,” said Katy Wigdahl, CEO of Speechmatics. “This means users can benefit from the low latency and privacy intrinsic to edge computing, whilst still gaining the huge value of natural language interactions. It also gives users tight control over costs, which can be unpredictable with cloud deployments. This collaboration will redefine what’s possible in the fields of autonomous machines, smart cities and customer service.”

Speechmatics’ technology is renowned for its ability to accurately understand speech in over 50 languages, regardless of accents or dialects. With the recent launch of Flow, the company has now moved into the world of voice-powered AI interactions.

Flow perfectly complements Ambarella’s powerful AI processors, creating seamless interactions between machines and their environments. Together, these technologies enable applications such as voice-commanded industrial robots, automated customer-engagement kiosks, and intelligent monitoring systems.

Wigdahl continued, “This partnership marks an exciting step forward for human-machine interaction. Speechmatics is supported on Ambarella’s entire portfolio of CVflow AI SoCs, which enables a huge range of devices with voice interactivity. We’re thrilled to work together to drive innovation in the edge AI space.”

“This is just the beginning,” added Badlani. “Ambarella is committed to advancing edge AI technologies, and we see this partnership as a launchpad for creating smarter, more adaptive solutions across robotics, industrial automation and smart cities.”

Ambarella and Speechmatics will be jointly demonstrating this technology at Ambarella’s invitation-only exhibition during CES in Las Vegas next week. Contact your Ambarella or Speechmatics representative to schedule a meeting at this exclusive event.

About Speechmatics

Speechmatics is a leading provider of automatic speech recognition technology, enabling organizations to unlock the power of voice. With best-in-class accuracy and language coverage, Speechmatics powers speech-enabled solutions worldwide.
