Artificial intelligence systems aren’t born smart; they have to be taught and trained. If you’ve got an iPhone, you might remember training Siri by repeating a few phrases, like “Hey Siri, it’s me.” That’s how it learns to recognize your voice accurately. I believe Amazon’s Alexa has a similar feature.
But these guys have a head start: a huge set of training data. For the most part, your system already knows how to hear and understand you. That’s because big companies like Apple and Amazon have the resources to run a zillion samples through their speech recognition system and teach it.
Now consider an independent AI assistant, one that doesn’t spy on you or use your data for marketing purposes. That would be Mycroft, which I’ve written about before. I don’t have one of their standalone devices, but I do have the KDE desktop widget up and running. It’s not perfect, but it’s free, open source, and it does indeed work.
Here’s a clip of it in action. You can’t hear me, but you’ll see it processing and hear it answering.
It’s not perfect by any means, but considering it’s built by volunteers with donated money, I think it’s pretty impressive.
They’ve got a new project underway right now: training a better voice recognition system. A screenshot of it is at the top of this post.
They’ve asked the community to submit a collection of training data: people from around the world waking up Mycroft by saying “Hey, Mycroft.” The rest of us get to listen to the samples and grade the clarity of each recording. If you clearly hear “Hey, Mycroft,” you give that sample a thumbs-up. If it’s murky or unclear, you tag it as a maybe. And if it’s just background chatter or noise, you flag it as a negative.
All this (anonymous) data is used to teach the AI what “Hey, Mycroft” sounds like. The community provided the samples, now the community is helping to teach the AI.
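To give a feel for how crowd grades might turn into training labels, here’s a minimal sketch. The label names, the majority-vote rule, and the `aggregate` function are my own assumptions for illustration; Mycroft’s actual pipeline is certainly more sophisticated than this.

```python
from collections import Counter

# Hypothetical grades a reviewer can assign to a sample:
#   "yes"   - clearly "Hey, Mycroft"
#   "maybe" - murky or unclear
#   "no"    - background chatter or noise

def aggregate(votes):
    """Collapse one sample's crowd votes into a training label.

    A sample becomes a positive wake-word example only when a clear
    majority of reviewers heard "Hey, Mycroft"; a contested sample
    is set aside rather than fed to the model.
    """
    counts = Counter(votes)
    total = len(votes)
    if counts["yes"] / total > 0.5:
        return "positive"   # usable as a wake-word example
    if counts["no"] / total > 0.5:
        return "negative"   # usable as a non-wake-word example
    return "discard"        # too ambiguous to train on

print(aggregate(["yes", "yes", "maybe"]))  # → positive
print(aggregate(["no", "no", "yes"]))      # → negative
print(aggregate(["maybe", "yes", "no"]))   # → discard
```

The point of a rule like this is that the model only ever sees samples the crowd agreed on, so one careless thumbs-up can’t poison the training set.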
It’s really pretty cool. I’ll admit my bias and fondness for the KDE ecosystem, but Mycroft still has catching up to do. It’s not nearly as responsive as Alexa or Siri; then again, it simply can’t match Amazon, Apple, or Google in development resources. And so I pitch in. I’ll take a half hour to listen to samples and grade them. I’ll fiddle about with the desktop widget and report bugs or issues.
Because it’s open source, I might even dream up and program a skill for it.
After I do everything else on my list, naturally.