Pablo Arias is a final-year PhD student in perception and cognitive science at the audio research lab, IRCAM, in Paris. We discuss Pablo’s work on how smiling changes the voice, and how people perceive smiling and non-smiling voices.
First Pablo explains what cognitive science, neuroscience and perception are, and why research into these areas is so important. He then takes us through the aims, methods, and results of his latest research paper into smiling in the voice, and we discuss the academic and technological implications of his work.
Benjamin Etienne is a data scientist at Rogervoice, a mobile app that allows deaf and hard-of-hearing people to use the telephone. Ben shares his inspirational story about how he taught himself data science and machine learning in the evenings, so he could work in a more technical role. He tells us why he’s not keen on Kaggle competitions, and why getting a job in data science is the best way to master it.
Greg Beller is the Head of the Interfaces Research and Creation team at the leading audio research laboratory IRCAM in France. He is also the founder of SYNEKINE, a live entertainment company which mixes art and science in the spirit of research.
We explore the relationship between sound and physical space, and the link between our voices and our gestures. Greg explains what prosody is and its importance in speech and communication.
In this episode I talk with Charles Cadbury, owner of the London-based technology consultancy, Champers Advisory, about his experience building voice applications, and the fascinating future of voice technology. He was great fun to talk to, and had plenty of surprising facts and interesting stories to share. You’re going to really enjoy listening to this episode!
Charles shares his extensive knowledge and experience on a wide range of topics related to voice. We cover the challenges when working with client data, how payment transactions can and will be handled over voice, how voice assistants will change the landscape of consumer sales and marketing, and much more.
This episode covers 8 of the most interesting voice startups that I found at the Vivatech technology conference in Paris, France.
Included in this episode are a voice transcription and synthesis mobile app for the hard of hearing, a voice-enabled smart alarm clock that can monitor your sleep quality, a robot behaviour system that delivers CMS content in person, and a comprehensive voice assistant platform that can handle multiple requests in a single query.
John Fitzpatrick is the VP of Product & Engineering at Voysis, a leading voice technology company that builds custom Voice AI solutions for businesses. They are currently focused on the ecommerce vertical, helping to voice-enable mobile apps and websites to augment the shopping experience.
In our conversation we cover a range of topics including the major components of the Voysis system, the technologies and tools John’s team used to build it, and the challenges they faced. We also discuss how Voysis protects a user’s privacy and the implications of the GDPR, which comes into force soon.
Eric Bolo is the CTO of Batvoice Technologies, a speech analytics startup based in Paris, France. Eric talks about building a custom speech-to-text system for their flagship product, Call Watch.
He introduces us to speech analytics and audio-mining, and describes some typical applications. We go into detail about speech-to-text (STT) technologies, and discuss the pros and cons of using cloud STT services such as Google speech versus building a custom STT system yourself.