If you don’t know about it yet, the HTML5 Web Speech API
specification is now in working condition in Google Chrome and partially in Apple Safari (see the browser support status here:
http://caniuse.com/web-speech).
That means you can now develop voice-driven web applications. We can
hope that other browsers will start supporting it very soon as well.
In this tutorial, I will try to explain how we can start developing
applications that use it, and also refer you to a small wrapper library
with an easy-to-use abstraction that I wrote recently.
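Since support differs from browser to browser, it can help to check for the API before using it. Here is a minimal sketch of such a check. The helper name `detectSpeechSupport` is my own invention for illustration and not part of the API; the sketch assumes recognition is exposed as either `SpeechRecognition` or the prefixed `webkitSpeechRecognition`.

```javascript
// Hypothetical helper (not part of the Web Speech API): reports which
// parts of the API the given global object exposes.
function detectSpeechSupport(globalObject) {
  var recognition = globalObject.SpeechRecognition ||
                    globalObject.webkitSpeechRecognition || null;
  return {
    // Voice to text: the recognizer constructor must exist.
    recognition: typeof recognition === "function",
    // Text to voice: needs both the synthesizer and the utterance class.
    synthesis: typeof globalObject.speechSynthesis === "object" &&
               globalObject.speechSynthesis !== null &&
               typeof globalObject.SpeechSynthesisUtterance === "function"
  };
}

// In a browser, pass `window`; anywhere else this reports no support.
var support = detectSpeechSupport(typeof window !== "undefined" ? window : {});
console.log("Speech recognition available:", support.recognition);
console.log("Speech synthesis available:", support.synthesis);
```

Passing the global object in as a parameter keeps the check easy to try out with a fake object.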
Voice To Text API:
We can also call it the ‘Speech Recognition API’.
It captures the user’s voice through the input system (the microphone) and
converts it to text, so it depends on voice recognition
technology. This feature is currently supported only in the Google
Chrome browser. By default, it uses Google’s own voice recognition
service. Here is a code example to implement it:
// Chrome exposes the recognizer under a vendor prefix.
var recognizer = new webkitSpeechRecognition();
recognizer.lang = "en";
recognizer.onresult = function (event) {
    if (event.results.length > 0) {
        // Look at the most recent result only.
        var result = event.results[event.results.length - 1];
        if (result.isFinal) {
            // Index 0 is the first (top) alternative for this result.
            console.log(result[0].transcript);
        }
    }
};
recognizer.start();
The above code snippet will ask the user for permission to
access the microphone, capture what
you say, send it to an external service for recognition, and deliver the
result to the ‘onresult’ event handler. Thus, we will be able to
see the output in the browser’s console window.
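The flow described above can also fail, for example when the user denies microphone access, so it may be worth handling errors next to results. The following is only a sketch: `finalTranscript` and `attachHandlers` are hypothetical helper names of my own, while `onresult`, `onerror`, `isFinal` and `transcript` come from the API itself.

```javascript
// Hypothetical helper: joins the transcripts of all final results so far.
// `results` is expected to look like event.results: an array-like list of
// results, each with `isFinal` and its top alternative at index 0.
function finalTranscript(results) {
  var parts = [];
  for (var i = 0; i < results.length; i++) {
    if (results[i].isFinal) {
      parts.push(results[i][0].transcript.trim());
    }
  }
  return parts.join(" ");
}

// Hypothetical helper: wires result and error callbacks onto a recognizer.
function attachHandlers(recognizer, onText, onError) {
  recognizer.onresult = function (event) {
    var text = finalTranscript(event.results);
    if (text) { onText(text); }
  };
  recognizer.onerror = function (event) {
    // event.error is a short code such as "not-allowed" or "no-speech".
    onError(event.error);
  };
  return recognizer;
}

// Browser-only usage; skipped where the API is missing.
var Recognition = typeof window !== "undefined" &&
  (window.SpeechRecognition || window.webkitSpeechRecognition);
if (Recognition) {
  var recognizer = attachHandlers(new Recognition(), console.log, console.error);
  recognizer.lang = "en";
  recognizer.start();
}
```

Keeping the transcript logic in a plain function means it can be checked with ordinary arrays, without a microphone.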
The specification draft also describes an optional ‘serviceURI’ property,
meant to let you choose the service URL used for voice
recognition; check that your target browser honors it before relying on it.
Text To Voice API:
Text-to-voice conversion simply plays a given text in a synthesized, somewhat robotic voice. Here is a simple code snippet for this:
var su = new SpeechSynthesisUtterance();
su.lang = "en";
su.text = "Hello World";
speechSynthesis.speak(su);
As you can see, it’s pretty straightforward: we just need
to set the desired language and text, and it’s ready to play.
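Besides language and text, an utterance also accepts a few tuning properties such as ‘rate’ and ‘pitch’. Here is a small sketch of how they could be set. `buildUtterance` is a hypothetical helper of my own, and the defaults it picks are assumptions for illustration.

```javascript
// Hypothetical helper: creates an utterance with optional tuning.
// `UtteranceCtor` is normally window.SpeechSynthesisUtterance; it is passed
// in so the function can also be exercised outside a browser.
function buildUtterance(UtteranceCtor, text, options) {
  options = options || {};
  var utterance = new UtteranceCtor();
  utterance.text = text;
  utterance.lang = options.lang || "en";
  // Speaking speed; 1 is the normal rate.
  utterance.rate = options.rate !== undefined ? options.rate : 1;
  // Voice pitch from 0 to 2; 1 is the normal pitch.
  utterance.pitch = options.pitch !== undefined ? options.pitch : 1;
  return utterance;
}

// Browser-only usage; skipped where speech synthesis is missing.
if (typeof window !== "undefined" && window.speechSynthesis) {
  var su = buildUtterance(window.SpeechSynthesisUtterance, "Hello World", { rate: 0.9 });
  su.onend = function () { console.log("Finished speaking"); };
  window.speechSynthesis.speak(su);
}
```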
Check Out The HTML5 Web Speech Demo!
The Wrapper Library And Usage Example:
In my experience, some of these steps are repetitive and can be
simplified with a thin wrapper. So, to make your life easier, I
started a small JavaScript library to ease the use of this API. Here is
the GitHub link:
https://github.com/ranacseruet/webspeech
It’s also registered as a bower package! So, if you are using
bower for front-end package management, you can install it with bower’s install command
and you should be just fine!
Here is a very simple usage example:
<input id="text">
<button onclick="talk()">Talk It!</button>
<button onclick="listen()">Voice</button>
<script src="../bower_components/platform/platform.js"></script>
<script src="../src/webspeech.js"></script>
<script>
    var speaker = new RobotSpeaker();
    var listener = new AudioListener();
    function talk() {
        speaker.speak("en", document.getElementById("text").value);
    }
    function listen() {
        listener.listen("en", function(text) {
            document.getElementById("text").value = text;
        });
    }
</script>
Final Words:
Even in Chrome on Windows there is another issue: it
doesn’t keep capturing voice continuously. Instead, you have
to allow microphone access in the browser every time you want to say something. However,
there is a workaround for this annoying permission prompt,
which is to host your application over SSL (HTTPS). Then granting access
once will work for all later times.
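Once the page is served over SSL, one possible way to keep listening is to set the API’s ‘continuous’ property and start the recognizer again whenever it ends. This is only a sketch under that assumption: `keepListening` is a hypothetical helper name of my own, and how long a browser keeps a session open may vary.

```javascript
// Hypothetical helper: keeps a recognizer listening until stop() is called.
function keepListening(recognizer) {
  var active = true;
  // Ask the browser to keep returning results across pauses in speech.
  recognizer.continuous = true;
  recognizer.onend = function () {
    // Recognition can still end on its own; restart while still wanted.
    if (active) { recognizer.start(); }
  };
  recognizer.start();
  return {
    stop: function () {
      active = false;
      recognizer.stop();
    }
  };
}

// Browser-only usage; skipped where the API is missing.
var ContinuousRecognition = typeof window !== "undefined" &&
  (window.SpeechRecognition || window.webkitSpeechRecognition);
if (ContinuousRecognition) {
  var session = keepListening(new ContinuousRecognition());
  // Call session.stop() when the user is done talking.
}
```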
I hope this simple tutorial helps you get started with the HTML5 Web
Speech API without much difficulty. If you run into any
issues, please let me know in the comments. Happy coding!