Channel: Raspberry Pi Forums

Graphics, sound and multimedia • New speech to text models and library for the Pi

I'm a long-time Raspberry Pi fanboy (some of you might remember the Pico port of TFLite Micro), so I'm excited to share a new way to add voice interfaces to your Pi. I've just launched Moonshine (https://github.com/moonshine-ai/moonshine), a family of speech-to-text models together with a library to run them. The models support streaming, so a lot of the work happens while the user is still talking. I've created a quickstart guide specifically for the Pi (https://github.com/moonshine-ai/moonshi ... spberry-pi) that uses our pre-built Python pip package, which has been optimized for the system, along with a getting-started video (https://www.youtube.com/watch?v=NNcqx1wFxl0).

I'm pretty excited to see what people build with this, since we've been dogfooding it in-house on some fun projects that I hope to share soon. On a Pi 5 I've been able to get realtime performance from our largest model, which has 244 million parameters and achieves a 6.44% word error rate (WER). For comparison, Whisper Large v3 has 1.5 billion parameters and a 7.44% WER. I do recommend active cooling with the largest model, since it is pushing the limits of the SoC. We also have a 123-million-parameter model that achieves a 7.84% WER and runs with room to spare on a stock Pi 5.
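
For anyone new to the metric, here is a minimal sketch of how a word error rate is typically computed: the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and the recognizer's output, divided by the number of reference words. The function name and the lowercase-and-split normalization are my own assumptions for illustration; published benchmark figures usually apply more careful text normalization, and this is not necessarily how the numbers above were produced.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative helper (hypothetical name). Normalization here is only
    lowercasing and whitespace splitting, which is an assumption.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of five reference words.
    print(word_error_rate("turn on the kitchen light", "turn on a kitchen light"))
```

A WER of 6.44% therefore means roughly six or seven word-level mistakes per hundred reference words, averaged over the evaluation set.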

Statistics: Posted by petewarden — Fri Feb 13, 2026 4:03 pm — Replies 0 — Views 42


