CakePHP Voice Search, Select, and Synthesis | CakePHP blog

5 min read · Nov 11, 2020

Have you ever wondered how to add voice input to your forms and fields? If you have, you probably need a new hobby away from screens, but lucky for you, I also need a new hobby that isn’t rotting my retinas! In this week’s edition of Tuesday Tutorials, I’ll walk you through how to add voice input. I’ll also have free template elements you can use in any project you want. Enough with the introduction already; let’s get into the code.

Search With Voice

Searching with voice is the most common use of voice input. We’ll see some other uses later on in this article, but I figured why not start with the most popular!

Currently, speech input is not supported in Internet Explorer or legacy (pre-Chromium) Edge, so we need to check for it first, lest we riddle our poor user’s console with errors. We will start our voice search in JavaScript.

if (!('webkitSpeechRecognition' in window) && !('SpeechRecognition' in window)) {
  alert("Consider upgrading your browser");
  //Hide mic button
  //...
} else {
  //Use the standard name if it exists, otherwise the webkit-prefixed one
  var SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
  var recognition = new SpeechRecognition();
  recognition.continuous = false;
  recognition.interimResults = false;

  recognition.onstart = function() { /* ... */ };
  recognition.onresult = function(event) { /* ... */ };
  recognition.onerror = function(event) { /* ... */ };
  recognition.onend = function() { /* ... */ };
}

SpeechRecognition is the standard, unprefixed name of the speech recognition interface (the one Firefox uses), and webkitSpeechRecognition is the prefixed name that Chrome exposes.
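One way to keep that check in a single place is a small helper that returns whichever constructor exists. This is only a sketch: getSpeechRecognition is a name I made up, not part of the Web Speech API, and it takes the window object as a parameter so it can be tried outside a browser too.

```javascript
// Hypothetical helper (not part of the Web Speech API): returns the
// speech recognition constructor the browser exposes, or null.
function getSpeechRecognition(win) {
  if (!win) {
    return null;
  }
  // Prefer the standard name, then fall back to the prefixed one.
  return win.SpeechRecognition || win.webkitSpeechRecognition || null;
}

// In a page you would pass the real window object:
// var Recognition = getSpeechRecognition(window);
// if (Recognition) { var recognition = new Recognition(); }
```

Reading both names off the object you pass in avoids a ReferenceError in browsers where one of them is not defined at all.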

The continuous option controls whether speech recognition stops once the user has stopped talking. With false, it ends after a single result. If you set it to true, you keep getting user speech until it is stopped manually.

The interimResults option indicates whether the recognition engine returns its early guesses (interim results) before it settles on a final result.
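If you do turn interimResults on, a single onresult event can carry both finished and still-changing text. Here is a rough sketch of how you might split the two. splitTranscripts is a made-up name, and it only relies on the isFinal flag and the transcript property that each result exposes.

```javascript
// Hypothetical helper: splits the results of a recognition event into
// text the engine has finalized and text it may still revise.
function splitTranscripts(results) {
  var finalText = '';
  var interimText = '';
  for (var i = 0; i < results.length; i++) {
    // results[i][0] is the top alternative for that result.
    var transcript = results[i][0].transcript;
    if (results[i].isFinal) {
      finalText += transcript;
    } else {
      interimText += transcript;
    }
  }
  return { finalText: finalText, interimText: interimText };
}

// Possible use inside a page:
// recognition.interimResults = true;
// recognition.onresult = function(event) {
//   var text = splitTranscripts(event.results);
//   document.getElementById('transcript').value = text.finalText + text.interimText;
// };
```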

onstart, onresult, onerror, and onend are all event handlers you can use. You can find Google's official demo here.

Let’s add a search bar, and see this thing in action! By the way, if you’re interested in making your website more interactive why not check out how to add some AJAX notifications.

<form id="search" method="post" action="/do/something">
  <div class="speech">
    <input type="text" name="q" id="transcript" placeholder="Speak" />
    <img onclick="voiceSearch()" src="//i.imgur.com/cHidSVu.gif" />
  </div>
</form>

<script>
  function voiceSearch() {
    if (!('webkitSpeechRecognition' in window) && !('SpeechRecognition' in window)) {
      alert("Consider upgrading your browser");
    } else {
      //Read from window: a bare "var SpeechRecognition" inside a function would shadow the global
      var SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
      var recognition = new SpeechRecognition();

      recognition.continuous = false;
      recognition.interimResults = false;

      //Once user stops talking
      recognition.onresult = function(e) {
        document.getElementById('transcript').value = e.results[0][0].transcript;
        recognition.stop();
        document.getElementById('search').submit();
      };

      //If there is an error
      recognition.onerror = function(e) {
        recognition.stop();
      };

      //Start listening
      recognition.start();
    }
  }
</script>

This will give you a search box, with a clickable microphone that will accept voice input!

This is what the speech button looks like.

What if we want to select an option from a dropdown with our voice? Turns out that’s not too difficult either!

Option Select With Voice

The main difference when selecting an option is listing the finite choices the user may select from. You can do that using the JSpeech Grammar Format. For example, if we wanted to select fruit from a list it would look like this:

var fruit = ['Apple', 'Orange', 'Grape' /* , ... */];
var grammar = '#JSGF V1.0; grammar fruits; public <fruit> = ' + fruit.join(' | ') + ' ;';

Our options are defined in the array and appended to the JSpeech header.
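If you have more than one list to offer, the same string can come from a small function. This is just a sketch; buildGrammar and its parameter names are my own invention, and the output follows the same header layout as the line above.

```javascript
// Hypothetical helper: builds a JSGF grammar string from a list of options.
function buildGrammar(grammarName, ruleName, options) {
  return '#JSGF V1.0; grammar ' + grammarName + '; public <' + ruleName +
    '> = ' + options.join(' | ') + ' ;';
}

var fruitGrammar = buildGrammar('fruits', 'fruit', ['Apple', 'Orange', 'Grape']);
console.log(fruitGrammar);
// #JSGF V1.0; grammar fruits; public <fruit> = Apple | Orange | Grape ;
```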

Now we can assign our list to the speech recognizer.

//Get classes for Firefox || Chrome
var SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
var SpeechGrammarList = window.SpeechGrammarList || window.webkitSpeechGrammarList;
var SpeechRecognitionEvent = window.SpeechRecognitionEvent || window.webkitSpeechRecognitionEvent;

var recognition = new SpeechRecognition();
var speechRecognitionList = new SpeechGrammarList();

//Assign grammar list to speech recognition list
speechRecognitionList.addFromString(grammar, 1);

//List of options
recognition.grammars = speechRecognitionList;
recognition.continuous = false;
recognition.lang = 'en-US';
recognition.interimResults = false;
recognition.maxAlternatives = 1;

The continuous and interimResults options are the same as last time.

grammars is the list of values we created earlier.

lang is the language the voice recognizer listens for. If you don't set it, it defaults to the HTML lang attribute value.

maxAlternatives sets the maximum number of alternative matches returned for each piece of user speech.
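If you raise maxAlternatives above 1, each result holds several guesses, and you can check every one of them against your list instead of trusting only the first. A minimal sketch, assuming a made-up helper called pickAllowed and a case-insensitive comparison:

```javascript
// Hypothetical helper: walks the alternatives of one result and returns
// the first allowed option that matches, or null if none does.
function pickAllowed(result, allowed) {
  for (var i = 0; i < result.length; i++) {
    var heard = result[i].transcript.trim().toLowerCase();
    for (var j = 0; j < allowed.length; j++) {
      if (allowed[j].toLowerCase() === heard) {
        return allowed[j];
      }
    }
  }
  return null;
}

// Possible use inside a page:
// recognition.maxAlternatives = 3;
// recognition.onresult = function(event) {
//   var choice = pickAllowed(event.results[0], ['Apple', 'Orange', 'Grape']);
// };
```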

Now that we understand the basics we can start the voice input!

<h1>Speech fruit selector</h1>
<p>Click here, then say a fruit!</p>
<div>
  <p class="output"><em>-- No Fruit Selected --</em></p>
</div>

<script type="text/javascript">
  //Grammar from the previous step
  var fruit = ['Apple', 'Orange', 'Grape' /* , ... */];
  var grammar = '#JSGF V1.0; grammar fruits; public <fruit> = ' + fruit.join(' | ') + ' ;';

  //Voice recognizer setup
  //Get classes for Firefox || Chrome
  var SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
  var SpeechGrammarList = window.SpeechGrammarList || window.webkitSpeechGrammarList;
  var SpeechRecognitionEvent = window.SpeechRecognitionEvent || window.webkitSpeechRecognitionEvent;

  var recognition = new SpeechRecognition();
  var speechRecognitionList = new SpeechGrammarList();

  //Assign grammar list to speech recognition list
  speechRecognitionList.addFromString(grammar, 1);

  //List of options
  recognition.grammars = speechRecognitionList;
  recognition.continuous = false;
  recognition.lang = 'en-US';
  recognition.interimResults = false;
  recognition.maxAlternatives = 1;

  //Element that shows the result and any error message
  var output = document.querySelector('.output');

  //Start listening on click
  document.body.onclick = function() {
    recognition.start();
  };

  //When user is finished print fruit
  recognition.onresult = function(event) {
    var spokenFruit = event.results[0][0].transcript;
    console.log('Result received: ' + spokenFruit + '.');
    output.textContent = spokenFruit;
    console.log('Confidence: ' + event.results[0][0].confidence);
  };

  //Stop listening
  recognition.onspeechend = function() {
    recognition.stop();
  };

  //Error handling: No match, or other error
  recognition.onnomatch = function(event) {
    output.textContent = "I've never heard of that fruit!";
  };

  recognition.onerror = function(event) {
    output.textContent = 'Error occurred in recognition: ' + event.error;
  };
</script>
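The demo above only prints what it heard. To actually pick an entry in a dropdown, you could compare the transcript with the text of each option. This is a sketch under assumptions: selectByVoice is a hypothetical name, and it expects an object shaped like a select element (an options list whose entries have text, plus a selectedIndex property).

```javascript
// Hypothetical helper: selects the option whose text matches the transcript.
// Returns true when a match was found, false otherwise.
function selectByVoice(select, transcript) {
  // Drop surrounding whitespace and any trailing punctuation.
  var heard = transcript.trim().replace(/[.!?]+$/, '').toLowerCase();
  for (var i = 0; i < select.options.length; i++) {
    if (select.options[i].text.trim().toLowerCase() === heard) {
      select.selectedIndex = i;
      return true;
    }
  }
  return false;
}

// Possible use inside a page with <select id="fruit">:
// recognition.onresult = function(event) {
//   selectByVoice(document.getElementById('fruit'), event.results[0][0].transcript);
// };
```

When nothing matches, the function leaves the current selection alone, so you can show your own "never heard of that fruit" message instead.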

What if we don’t want user input, but rather some more interesting output? Well, I have you covered there as well! Let’s see how to make our websites talk!

Speech Synthesis

Getting our website to talk to us is actually quite simple!

var msg = new SpeechSynthesisUtterance('Hello World');
window.speechSynthesis.speak(msg);

Of course, there are more options. You can alter the pitch, speed, language, and even the voice!
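Here is roughly what setting those options could look like. Treat it as a sketch: applyVoiceSettings and the settings object are hypothetical, while pitch, rate, lang, and voice are the properties of a SpeechSynthesisUtterance. The helper works on any utterance-like object, so it can be tried outside a browser.

```javascript
// Hypothetical helper: copies settings onto an utterance-like object.
// The ranges follow the Web Speech API: pitch 0 to 2, rate 0.1 to 10.
function applyVoiceSettings(utterance, settings, voices) {
  function clamp(value, min, max) {
    return Math.min(max, Math.max(min, value));
  }
  if (typeof settings.pitch === 'number') {
    utterance.pitch = clamp(settings.pitch, 0, 2);
  }
  if (typeof settings.rate === 'number') {
    utterance.rate = clamp(settings.rate, 0.1, 10);
  }
  if (settings.lang) {
    utterance.lang = settings.lang;
  }
  if (settings.voiceName) {
    // Pick the voice by name from the list the browser reports.
    for (var i = 0; i < voices.length; i++) {
      if (voices[i].name === settings.voiceName) {
        utterance.voice = voices[i];
        break;
      }
    }
  }
  return utterance;
}

// Possible use inside a page:
// var msg = new SpeechSynthesisUtterance('Hello World');
// applyVoiceSettings(msg, { pitch: 1.5, rate: 0.8, lang: 'en-GB' },
//   window.speechSynthesis.getVoices());
// window.speechSynthesis.speak(msg);
```

Keep in mind that getVoices() may return an empty list until the browser has loaded its voices, so a voice name might not match on the very first call.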

Conclusion

Would you be interested in some free CakePHP elements like voice search bars with browser detection? Why not comment, or subscribe and let me know!

You can subscribe and comment below. I look forward to hearing your thoughts. Does anybody else feel like the bot from Her is much more possible now? Well, if you decide to go public with your bot, why not host it cheaply?

Originally published at https://cakephp.blog on November 11, 2020.

Written by Christian Gonzalez | cakephp.blog