
The Rise of Speech Recognition Technology in Gaming

Jun 17, 2019 | Paula Daniels


Speech recognition has come a long way since Bell Labs’ Audrey, a system capable of recognizing spoken digits, debuted in 1952. IBM’s Shoebox followed a decade later and could understand 16 words. In 1976 came Carnegie Mellon University’s Harpy, which could comprehend 1,011 words. In 2008, Google brought speech recognition to mobile devices through its voice search app. Three years later, Apple introduced Siri, a voice-enabled digital assistant. Since then, Microsoft’s Cortana and Amazon’s Alexa have come along, signaling a brighter future for speech technology.

This same technology has made its way into the world of gaming. audEERING’s Christopher Oates discussed speech in gaming on Voice Tech’s 20th episode. He gives an example of audEERING using voice as a controller, and raises the possibility of an online game requiring players to “give a rousing speech” before sending their troops into “battle.” These examples show how the technology has advanced by leaps and bounds, with improvements being made every year. A future where speech recognition becomes an inextricable part of gaming isn’t out of the question. With that in mind, let’s take a closer look at the rise of speech recognition technology in gaming.

A forgettable beginning

Speech Tech Mag reports in ‘The Game Changer in Gaming: Voice Recognition Technology’ that in the early 1980s, RDI Video Systems developed the first voice recognition-enabled game console, the Halcyon. Only 12 units were made, though, and just two games were released (Thayer’s Quest and NFL Football LA Raiders vs SD). The Halcyon was unable to follow through on its promise. To be fair, voice recognition was in its infancy back then; reliable voice processing and the handling of speech variations weren’t yet possible.

Making headway

As voice recognition technology developed (Dragon NaturallySpeaking was released in 1997), so did its use in gaming. In 2000, Nintendo released Hey You, Pikachu! for the Nintendo 64. Players had to “communicate” with Pikachu, the main character, using a voice device. This communication was made possible by the game’s 256-word database. In 2006, Nintendo released Odama for the GameCube, a pinball game that mixed tactical wargaming with voice control. Players used a microphone to issue 10 or so commands, including retreating and flanking.

Another notable release was Ubisoft Shanghai’s Tom Clancy’s EndWar in 2008. It was a real-time tactics game playable via voice commands: specific phrases structured in a conversation tree. The common denominator of these three games was that they all used scripted commands. Six years later, North Side Inc. released Bot Colony, where players “navigated” a world full of robots. Unlike Hey You, Pikachu!, Odama, and Tom Clancy’s EndWar, Bot Colony relied on natural language rather than pre-programmed commands.
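To make the difference concrete, here is a minimal sketch of what “scripted commands structured in a conversation tree” can look like in code. It is not how any of the games above were actually built; the vocabulary, the `COMMAND_TREE` layout, and the `parse_command` helper are all invented for illustration, and the input is assumed to be text that a speech recognizer has already produced.

```python
# Hypothetical command tree: each level lists the words accepted next.
# The vocabulary is invented for illustration, not taken from any game.
COMMAND_TREE = {
    "unit": {
        "one": {"attack": {"alpha": {}, "bravo": {}}, "retreat": {}},
        "two": {"attack": {"alpha": {}, "bravo": {}}, "retreat": {}},
    },
}


def parse_command(transcript, tree=COMMAND_TREE):
    """Return the recognized words as a list, or None if the phrase leaves the tree.

    `transcript` is assumed to be text already produced by a speech recognizer.
    """
    node = tree
    path = []
    for word in transcript.lower().split():
        if word not in node:
            return None  # this word is not allowed at this point in the tree
        path.append(word)
        node = node[word]
    # A phrase is complete only when it ends on a leaf (no further words expected).
    return path if path and not node else None
```

With this tree, a phrase such as “unit one attack bravo” is accepted, while “please send unit one forward” is rejected at the first word. That rigidity is the trade-off of scripted commands: they are easy to recognize reliably, but players have to learn the exact phrasing, which is the limitation a natural-language approach tries to remove.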

Moving forward

Voice recognition is now being used in more complex games, like the VR-enabled Starship Commander, which allows players to converse with non-player characters. The technology even features in popular eSports games such as Overwatch and Call of Duty, where players use speech recognition to communicate with one another. It will be an even bigger achievement when the technology starts being used in the actual gameplay. eSports is one of the fastest-growing forms of entertainment around the world, with its players, fans, and revenue increasing year-on-year. As of 2017, eSports had over 9,500 registered professional players, around 299 million viewers, and some 3,765 tournaments held all around the world. Once speech recognition is widely adopted by other eSports games, that adoption should help the technology spread across the gaming industry as a whole.

The development of modern open-world games is a challenge in itself, since developers have to anticipate a multitude of scenarios and sequences for plot and character dialogue. Adding speech recognition to the equation presents an even greater challenge: accounting for the world’s languages, inflection, and voice variety, to name but a few. As technology advances, however, especially through the use of more complex AI, these challenges are likely to be overcome.

Author

Paula Daniels
Blogger - Techie Doodlers
Musing about Tech News, Gadgets, Graphic & Web Design, Social Media, Tech Tips and Tutorials.
http://techiedoodlers.com/

