The Art of Sound in Motion – Greg Beller, IRCAM – Voice Tech Podcast ep.005

Greg Beller

Episode description

Greg Beller is the Head of the Interfaces Research and Creation team at the leading audio research laboratory IRCAM in France. He is also the founder of SYNEKINE, a live entertainment company which mixes art and science in the spirit of research.

We explore the relationship between sound and physical space, and the link between our voices and our gestures. Greg explains what prosody is and its importance in speech and communication. He then demonstrates a number of technological art installations that can modulate the prosody of our speech to augment our capabilities. In the future, this technology will help us to build greater rapport and emotional connection with other humans and voice assistants.

Greg is a fascinating guest, who shows us how explorations into new forms of artistic expression can lead to new technologies for use in our daily lives.

Links from the show

Episode transcript


Powered by Google Cloud Speech-to-Text

Greg there the new smartphones will be the location into a confined spaces, that would be the new revolution, and that will be also the case for other questions. Welcome back to the Voice Tech Podcast. The most innovative discoveries are often found at the intersection of two fields, where two hitherto distinct concepts or phenomena merge to form something new and exciting. Today we’ll situate ourselves at the fault line between art and science and explore the relationship between our voices and the physical space that we inhabit. This is episode 5, which means we’re already halfway to double figures. A little bit of housekeeping first: the number of website visits, Twitter followers

and podcast downloads are steadily growing, so thank you very much for that, and thank you again for all the kind messages on Twitter as well. In my last episode I asked for listeners to submit articles and product recommendations to feature on the show. One person did just that: it’s Gavin from Neatebox. Gavin makes solutions that empower disabled people in their daily lives, including a mobile app called Welcome, which is designed to improve the interactions between customer service teams and disabled people. It does this by making the staff in the shop or retail establishment aware of the specific needs of their visitors in advance of their arrival. So we had a bit of a conversation on Twitter, and Gavin asked me to imagine telling my AI device that I want to visit a shop, and then I’m met at the door by a staff member who already knows my name and has been given an overview of my disability

together with tips on how to interact or what to prepare for my arrival. I think this is a great application of voice technology that’s just emerging. It immediately made me think of Google Duplex, which Google is developing now. Given that demo, soon we’ll be able to just put these appointments in our calendar, and the AI will check the calendar, call ahead to the shop and magically give them all the information about our disability without us even needing to tell them. I think the Welcome app is a great first step towards that. You can check out Neatebox and the Welcome app at neatebox.com, that’s N-E-A-T-E-B-O-X dot com. And remember, if you’ve got a blog article or product recommendation that’s related to voice in some way, send it to me by Twitter or at voicetechpodcast.com, and of course if it fits the bill I’ll share it with the 7000 people who are now following me on Twitter. Thank you also to everyone who left me five-star reviews

on iTunes. I put the call out and you responded. I really appreciate it, it really, really helps. There’s a couple of nice reviews that I wanted to read to you. The first one’s from di Ferro, that I mentioned last time; it’s on the US store. He says: “I’m the creator of say stream, social network for voice. We’ve needed a podcast like this one for a long time, and thank you so much for working on these high-quality episodes which keep me in the loop of what’s going on in the voice-based industry.” Thank you very much, di Ferro, for that. Also, I said that I would give a shout-out to the first review on the UK, Canada and Australia stores. We got one on the UK store from Alibaba 6. He says: “I’ve been searching for a decent podcast that focuses on voice to you for a long time and it’s finally here. Very interesting content and clearly presented. I look forward to the next episode. Thanks, Carl.” All right, thank you too. Please keep the reviews coming in; it’s the most important thing you can do to support the podcast, because it makes the show more discoverable within iTunes, which is the primary way people consume podcasts these days, though

maybe that’s set to change, as Google are doing a big push these days for the Google Play Store, so hopefully we’re going to see more downloads through that on Android. Also, we’re now on Spotify, so if you’re a Spotify subscriber you can just log in, or even if you’re a free Spotify user, you can search for the Voice Tech Podcast in the podcast section. Just search for it and it should pop up. And also, of course, on Stitcher and TuneIn Radio and all the other places you can get a podcast. Today’s guest is Greg Beller, head of the Interfaces Research and Creation team at the IRCAM audio laboratory in Paris. Greg is also the founder of SYNEKINE, a live entertainment company which mixes art and science in the spirit of research. It hosts artists from diverse backgrounds such as comedians, dancers, musicians, circus artists, storytellers and visual artists, as well as scientific researchers and developers of new technologies. Together with this

wide range of collaborators, and by applying the latest sensors in Stamford auction technology, Greg produces artistic performances that lie somewhere in between music, dance and theatre, and creates art installations that cause us to question our perception of the world. We’re treated to a number of demonstrations of Greg’s work that blend voice and movement. For example, he shows us how we can modify the prosody in our voices using our hands. We explore the relationship between sound and physical space, and the inherent biological link between our voices and our gestures. Then we discuss the potential commercial applications for this technology, such as building greater rapport and emotional connection between voice assistants and humans. Greg is a fascinating guest who shows us how explorations into new forms of artistic expression can lead to new technologies for use in our daily lives. So get ready to open your eyes and expand your mind as I bring you Greg Beller.

OK, so I’m here with Greg Beller. He’s the head of the Interfaces Research and Creation team at IRCAM, which is the Institute for Research and Coordination in Acoustics and Music, here in Paris, France. Greg is a specialist in the art and science of sound, including voice, speech and music. He has a PhD in computer science and has conducted research into areas such as expressivity, emotion and mood, gesture and movement, as well as dance and space. He’s also the artistic director of the Synekine project, which he’s going to tell us more about in a moment. Greg, hello. Hello. Thank you very much for taking the time to join us today. So Greg, could you tell us a little bit about your background, where you’re from, and what led you to where you are today?

world of Friends so where I did my music are learning so proceeding piano and the service charge at the age of five and then I quickly moved to make music with a more pollution party like I like in the middle of improvisation and and torture and finally at the age of 13 14 I plugged a computer to my 13 over so I started to composer it’s ok Swan music and and there was also singing and rapping met up with some friends so I was there I was

MC, the local MC of all my friends. So I was not so good in rapping, I didn’t have anything to say, so I’d rather be making the music and the mixtape, that’s the word I was looking for, bringing the mixtape to the party. So you made the mixtape and they rapped over the top of it, is that right? near the temple increase as long as making it was going on so I went to house music and techno and try and send them or electronic music studio. So you had a good setup. So you’ve been involved in music production from a young age and you basically stuck with it through your whole life, is that right?

are always close to where I music production never used my activities it’s a and so as though I kept a few months later developments sharing a few groups, acoustic, blues, piano, guitar, bass or singing, and also I was really falling into the electronic where are there in the computer to a radio program music and composers. And finally, quite logically, I went to IRCAM, which is the temple for mixed music, so music in between, using both of these means, acoustic and electronic, to compose music. I just had a look on the

page and it actually describes exactly that: the French institute for science about music and sound and avant-garde electro-acoustical art music. Yeah, yeah. The Prestige part is really important that you can BBC series of the decor of the day. It questions a lot of, I mean, it addresses questions that are really musical: how the computer can share a musical experience with a performer and with the composer, how the composer can compose with the computer as a tool, and, as far as we go now with artificial intelligence, how the computer can be really in position inside the inside of an

Ronaldo the computer can I give a clever solution through to the nowadays that question addressed by cutting-edge computers. Absolutely. I’m aware that there’s many, many side projects at IRCAM, and I noticed you did a PhD in voice emotion, is that right? What made you change from music over to the voice? There is no difference, because I was already interested in the music of the speech, actually. And before making my PhD I did a master’s thesis on the musicality of the speech, actually. so I need me to today where are the prosody of the way we speak and Doherty surgery define

when are universal and those kind of topics. For those who don’t know, could you describe what prosody is? So when I say the music of speech and the musicality of speech, I’m talking about prosody. Prosody is everything from the voice that gives you any information that is not in the words. So it is really striking that when we speak to each other, for instance to wear to your other related, actually, if you write down what you say, you can compress a lot what you’ve been saying. Right now, when we speak to others, we repeat, we go from one way then to another direction, we have also a lot of inflections, accents, intonation

and this is because the speech is a gesture, and when we speak we point out some of the words we are saying, so that we can really give the feeling of touching them or bringing them into advice and and us customers were to look at speech as a gesture. When I say that, well, you don’t see anything now, but you can almost feel where is it outside right now I’m looking for your hands in their way of looking. So what is pragmatic aspects, and also the expression, the emotions, ordered

the speaker identity features like my gender, my age, my social origins, my 11 letter on, my health, my state of fails if I have a decision out at. All this information is brought by the sound of my voice. Of course it is physiologically related, but also to the sound of the sentences. And well, you can hear that I’m French, for instance, and it is so because I don’t have to do the right there isn’t the right way to earn every year old. All this information is actually inside the prosody that is accompanying the semantic message you are speaking. maximum and Recreation

If we see the speech as a signal, we can define the prosody as a five-dimensional space, with the intonation, which is related to the pitch of the voice and pitch curves and inflections. You have also the intensity of the speech, and then the speech rate, duh duh duh duh duh duh, the speed of the speech, the speaking rate. And also the degree of articulation, whether I’m not articulating or if I articulate very well; this is giving you information on the way I’m seeing things going past. And also the voice quality, and the voice quality find present ideas or if I’m speaking smooth with my closest, then you can feel what’s the difference
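As a rough illustration of the five dimensions just listed (intonation, intensity, speech rate, degree of articulation, voice quality), a prosodic state can be pictured as a point in a five-dimensional space. The Python sketch below is only an assumption-laden picture of that idea, not Greg’s actual model: the class name `Prosody`, the `blend` helper and the normalised values are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Prosody:
    """One point in the five-dimensional prosodic space (hypothetical, normalised units)."""
    intonation: float     # pitch-related dimension
    intensity: float      # loudness of the speech
    speech_rate: float    # speaking rate
    articulation: float   # degree of articulation
    voice_quality: float  # e.g. from smooth to tense


def blend(a: Prosody, b: Prosody, weight: float) -> Prosody:
    """Move linearly from point a (weight 0) towards point b (weight 1)."""
    return Prosody(*[
        (1 - weight) * getattr(a, f.name) + weight * getattr(b, f.name)
        for f in fields(Prosody)
    ])


neutral = Prosody(0.0, 0.0, 0.0, 0.0, 0.0)
expressive = Prosody(1.0, 0.5, 0.5, 0.25, 0.75)
halfway = blend(neutral, expressive, 0.5)
```

Treating prosody as a point makes "modulating the prosody" a matter of moving that point, which is the kind of control the hand gestures later in the conversation are described as providing.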

any tips it is interesting to add that structurally some languages are more marked, or sensitive, or play more with one of these dimensions. I think of Chinese, Mandarin Chinese and Cantonese, wedge boots in the same. You have two types of pitch inflection: you have the semantic-level pitch, which is attached to the syllable, that fixes the meaning of the word, related to the intonation patterns in Chinese, in tonal languages, but there is also inflection that is part of the accentuation and of the expression. But we’re talking about Chinese; we can talk about Japan. In Japan, for instance, the voice quality is much are you

Ender Ender the prosody as normal accentuation of the sentences. And you are English, so in English you’re using intonation to accentuate words, whereas in French we use the speech rate, so we will use longer syllables to do in German and say. It’s good to know for French language learning and prosody. I just need to know that in Europe, because France, Germany and England are not very far, but we have completely different ways of expressing outside steps in. And then just being aware of that fact makes you more understanding of the ways people are speaking, as otherwise you’re just judging them, the way

they speak to you, by your own set of criteria. You have lots of guttural sounds from the dead produced in the throat, and this is for instance this can be associated to address a video for our friendship receiver yes I do you have to know you have to know this kind of compounds as I need to urinate Nantwich. Excellent, that’s giving us a good basis for what prosody is, and the things we’ll talk about in a moment will build on that.

I wanted to ask you about the Synekine project. Am I pronouncing it right, Synekine? centre kind of kin or OK. Can you tell us what the Synekine project is? Following my research at IRCAM, my PhD, and this gesture-based way of describing the speech, I made a synthesizer, a text-to-speech synthesizer, where you are enacting the speech by gestures, so by hand movements you are actually triggering some speech sounds. OK, so just to clarify that for the listeners, because I’ve seen the videos and it’s a very visual thing, and

it is a text-to-speech engine first of all, so we’re writing words into the computer and the computer will speak them, but the way that they’re spoken depends on the movement of your hands as the speech is playing, is that right? Yes, it is. You are varying the speech rate by triggering the syllables, and also the intonation about the same pictures of raichu moveset with the hands. so I’m better anyway I mean that the project is about this correlation that exists between hand gestures and speech gestures. So I see the speech as a set of gestures, actually. So this is a bit critical, because in speech technology with your friend your friend here is a phonemes

Phonemes are just basically the equivalent of the alphabet in sounds, right? So let’s say there is an R and an M, and that’s the way we usually think and bass or Mother’s thinking of them, as if the speech were a projection of the text, also because we are looking for a text-to-speech translation and that makes things easier. at roseburn directions to speech is not a succession of phonemes; it is a succession of movements, of articulatory movements, that are syllables. we are to the Rings you labels all the time and it is simplified not with my end on the under the table Surat

speech, I can easily segment it while speaking with my hand, because we are actually hardwired neurologically: our hands and the articulatory movements related to the speech, the motor parts, are related. So it comes from ages when we were primates or even see is that our hand gestures are related to the speech. We know that quite instinctively when we talk about Italians and when they speak. A good example, the Italians talking with their hands.

talk to me and is a sneaky person to come back to the user since you start from the young and you go to this device through the neurones or she looks for the grand old is going together, and then adding a microphone and a picture and some sensors, so we can kind of close the loop and use the gesture to play with the intonation of the voice that is running, and that is using the new technology. So going back to the Synekine project and talking about the name briefly: so you know about synesthesia, say, when somebody is perceiving stimuli with another set of perceptive means.

I’m talking about people that, for instance, see colours when they are records, seem that I can taste colours for instance, or for instance as well. So basically they associate to process for 13 and under the scenic Inn was there be the same music in year was the motor essentials, and the idea is to speak exactly about this correlation between hand gestures and voice gestures. And the next step was to think about how we can actually chop the speech in real time with one hand, a bit like I said

as I did with that knocking on the table, and how we can play with that as a performer, how this performer can play chopping his voice and then bring it back with the other hand. So we call that stuff the handsome peng. I mean, you can just hear a little short excerpt. OK, so here we see

a man in a room counting out beats with his hand. Now he seems to be doing a variety of kung fu moves. With every movement of his hand he triggers a sound. Now he’s triggering whole phrases. So that was a first attempt playing with the new technologies, like real-time sampling of speech, and then we were playing with the gestures at a segmental level, meaning that we were segmenting the speech and triggering segments. Speech is usually seen as a sequence of syllables and a sequence of actions, but it is still a continuous phenomenon. the fureys

At the basis of the signal, what we record is just a continuous phenomenon that is all the time changing. OK, so this was the video that we saw, where the words are playing and each syllable of the word is said by the computer when a kind of chopping hand gesture is made. Yeah, yeah, so that was the first segment of that then, yes. So these are two new setups for performance, as instruments, bringing the performer new ways of interacting with his own voice. And the second one was going to a more continuous association between the gesture and the speech. It was called Wired Gesture.

Wired Gesture. OK, Wired Gesture. And the idea is to actually do a gesture with the hand and, in the meantime, with the voice, and the computer makes a model of this. Then after, when you play back the gesture, the sound is replayed and stretched to follow the timing of your hand gesture, so you can play the sound faster or slower by doing the gesture faster or slower. OK, just to confirm my understanding, as I’m not sure I saw this one, from memory: is it a two-part process, where the first part is you record just a gesture without the sound? No, no, you do it with the sound, you record both. Ah, OK. And then when you repeat the gesture in playback mode, you can perform that gesture at any speed

and it will play the sound at the speed you perform the gesture? Yes. it’s affected or is there another control the speed and it’s time it’s been yes. OK, OK, so let’s hear a bit then. Here comes Wired Gesture. OK, so first the guy in the room is recording his gesture by raising his arm quickly, in record mode. Now he’s going to go into playback mode.

So he’s moving his arm quickly through the air, backwards and forwards, and it’s playing the sounds at the relative speed.
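The playback idea described for Wired Gesture, where the recorded sound is stretched to follow the hand, can be sketched roughly as a lookup from hand position to a time in the recording. This is a simplified Python illustration, not the actual system (which, per Greg, builds a model of the gesture): it assumes a one-dimensional hand position that strictly increases during recording, and the function name `audio_time_for_position` and the sample values are hypothetical.

```python
from bisect import bisect_right


def audio_time_for_position(rec_positions, rec_times, position):
    """Return the time (in seconds) in the recorded sound that matches a hand position.

    rec_positions: hand positions captured while recording, strictly increasing (assumption).
    rec_times:     the time in the recording at which each position was captured.
    position:      the current hand position during playback.
    """
    if len(rec_positions) != len(rec_times) or len(rec_positions) < 2:
        raise ValueError("need at least two matching position/time samples")
    # Clamp to the ends of the recorded gesture.
    if position <= rec_positions[0]:
        return rec_times[0]
    if position >= rec_positions[-1]:
        return rec_times[-1]
    # Find the two recorded samples around the current position and interpolate linearly.
    i = bisect_right(rec_positions, position)
    p0, p1 = rec_positions[i - 1], rec_positions[i]
    t0, t1 = rec_times[i - 1], rec_times[i]
    fraction = (position - p0) / (p1 - p0)
    return t0 + fraction * (t1 - t0)


# Recording: the arm travels from 0.0 to 1.0 while two seconds of voice are captured.
positions = [0.0, 0.5, 1.0]
times = [0.0, 1.0, 2.0]
playhead = audio_time_for_position(positions, times, 0.25)
```

Because the playhead depends only on where the hand is, moving the hand quickly scans the recording quickly, moving it slowly stretches the sound, and moving it backwards walks the playhead backwards through the recording.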

The second gesture

just playback with both arms. So when we did that, we we browse like in new and used pays for the performer, because he could, you know, it’s a bit like a guitarist that is looping himself, playing sounds and then recording them and then playing and I’m digging around then. And now it was the voice: do some gestures, associate some sounds, and replay the sound with the gesture, and then improvise on top of it with his own voice. Right. it is the saying this prospect than you take that

give him away to a romantic sense and it’s a man’s voice, and to become a choir himself. Yes, it’s almost like he added the voice to his hands, so that he can use his hands to reproduce a recording from the past and also control the way that that’s said, using the speed and movement of his hands or any other body parts. pose with the hands and speak about or speak with or do a tree or do without that was his own and so that there was there was actually more interesting girls and me and the performer and so I mean

when I say ok for now I am not giving them because I have been working with several as you look in these technologies together with the people who were using them, we were more and more wanting to compose and to really be able to structure a performance ways and the use it as a as a performing. And so it turns out that we needed to arrange the sounds in space, meaning that doing gestures is OK, you can do hand gestures in a self-related reference space, but what if you were bringing this gesture into a am absolutely sure that is also

friends for the public, meaning in the space. OK, we added a Kinect, which is a motion capture camera, the Microsoft Kinect, to bring absolute position into the set-up. OK, just to clarify then, this is moving from controlling sounds in relation to our own bodies to controlling sounds in relation to the space which we inhabit. The sounds are fixed to, exactly, the room, or, you know, the place that we are in. Exactly. So we also went from hand sensors that were accelerometers

that the performer was wearing, add onions to a Kinect, a camera-based localisation sensor. And this move, going from a self-related reference space to the space we share, with absolute references, was also bringing the set-up, the bedside instruments, from a level of oneself playing with his own gestures to people playing together, with collective approaches. So that means that the performer in this new instrument, that is called the Sound Space, the performer is able to place his

wisdens celeriac I’m doing sentences with phrases, so I’m saying hello, for instance, and in the meantime I’m doing a gesture that is more or less saying hello, or as if I were brushing the space, and I’m saying “Hello Carl”. And then with my hands, when I come back to the position where I left the sound, I can play it back, and I play it back in a straight way, meaning that if I’m doing the same gesture in the same place, with more or less the same timing, I can do a car like this if I’m stopping at all for his dad, so I can do the reverse

I’m doing the gesture and yet the other way around like you’re only and this kind of stuff. So the sound is really painted by the performer into the space, and then the sound is attached to the space at an x, y, z position, and then the performer can move, and somebody can come and play back these sounds and play with them as well. And so that means that the trial. Am I right in thinking that the sound is only triggered when the correct gesture is made, as well as the right position being occupied? It’s not enough for my hand just to be in that position, but I need to be doing, you know, stretching my fingers out for instance? No, you don’t need to do the gesture back. You can really address the x, y, z position, so whatever gesture you will be making, you will play it back as soon as you

put your hands on the x, y, z position. so it tomorrow in the recording phase you really have to do a gesture, because you can’t be static in one position, so you have to move through space to spread the sound, let’s say. OK, so so so that you can you play The Defamation of Strickland vet including Theory just take a snapshot of your body posture, for instance, and just record the space that you’re occupying as the space where the sound is played? Or does there have to be an element of movement, to associate the movement with the time time I spoke to the sound? What we do is we catch the skeleton of the performer, so we know where the hands are, and we just focus on the hands, actually; we give up the whole body. So we have two dots not saying now interfaces which is the right time

left and right, and there is a gesture to say I want to record or I want to stop recording, and then after, the performer just has to do this gesture and start to move one hand and speak or sing, and then the sound is there. It is like a trace, which can be thought of as a memory trace, which is the hand’s position the end two of us the sounds associated 211 piece of the sound. And so if I do the same gesture, more or less, with the same timing, meaning not the same gesture, but if I go from like the same position, I will get the sound back, and if I’m

ring whatever dance inside this cloud of voice, I will start to make appear also new articulations or new types of sounds, new reorganizations of the sound, which is also very interesting for dancers, for instance. So we’ve been making a performance with a dancer, Valencia James, that was really interesting, because she was creating some characters together with the body gesture, the first years, but also with the sound and the voice, and then she was coming back and forth to the places where she left characters and re-enacting them, and also re-enacting the sound she produced before. Wonderful. So we called that

the Memory Palace, because it has summer by the memory palace. It is a long story as well, but this is a memory technique to memorize, so you place your memories inside an imaginary space like a palace, and then you can come back to your memories going through the rooms of this place. And yeah, we kind of like Sarah a memory palace in front of the audience, together with the audience, by creating and placing the sound in the space and then coming back and improvising with these pieces of voice on stage. That sounds like a great and very tangible way to demonstrate the memory palace concept. It’s a hard thing to it

anything you have to say, so imagine a place that you know well, personal to you, and then place these objects within that space to associate the object with the location. But you’ve done it with sound: you can say we’re in a palace and we’re placing a vase on this table, and then you could play the sound of a vase being pinged, or, I don’t know, play a record on the gramophone, and then you could hear it when you occupied that space. and I have you got a personal storage or data material as pieces of memories, like our photos or our mails, are aware of their organisation as they are actually physically spread in the space using r d

guitar USB keys or even the cloud, where we know more or less where it is. So it’s really interesting now, in the perspective of the new technologies, visible care assistants for instance, to think about having this interaction with our memory using the space, because it is also how we are actually memorising stuff the best. It can be very interesting to really use the space, using location sensors. I couldn’t agree more. Actually, one example that comes to mind is taking medication or taking vitamin supplements. If you do that, there’s all sorts of apps on the market

now to remind you to do that, you know, in all sorts of different ways, but I think the best way for many people is just to put your medication on a table where you often sit, so you see it. It’s just a location-based reminder, and I can definitely see this technology being used in many major phone applications such as that. Say, when you were physically in a location, an auditory alert would be triggered, or your voice assistant would say “don’t you want to take your medicine right now?”, because you’re in the place where you need to take it, you know that it’s right there. Yeah, so we can be of course as far as she’s seen it is seen that exciting during. So at home, for instance, let’s say I’m going to the fridge and my mum just left me a vocal message saying “Hey boy, you need to buy some milk”, and then when I just want to open the fridge welding

is knowing that I’m in the area of the fridge and just says this thing to me, or maybe it would be when I go out that I will hear this voice back saying “Hey, remember you have to buy some milk”, or “you forgot to take your keys”. And let’s say I am crossing the street and the sensors can calculate the speed and see that there is a car, then maybe the lamp can talk to me and say “Hey man, just wait a sec”. So of course this signaletic that is associating voice and space will be also used in the commercial. So for now, when we are in the

airports, you know this “watch your step” thing when you go swimming elevator; it will be associated with advertising, marketing and commercials everywhere, it’s a dystopian future. Yeah, I’m hoping a more optimistic view would be that it will be directly connected to the voice assistant in your ear, that knows you, that is providing you personalized assistance around the airport or wherever you are, as opposed to just flogging Easter. We are not so much used to having a voice that is in our head all the time and speaking to us, interacting with us. I mean, we are used to ourselves, of course, but it might be a bit disturbing to accept that there is a third voice, like a head voice, that is

the time speaking to us, whereas speaking to voices around the space and everywhere, this we are very used to, because I must have citizens people, when in cities, are surrounded by people when talking with a man and his arms and talking in. So I think that we are more used to interacting with voices that are associated to space or to situations character or it is a bit familiar for us to have this voice that is all the time on a resume. We do have the stream of consciousness, talking to ourselves as we go about our daily business, but that’s us talking to ourselves, as opposed to something else talking to us.

the address of wasted. We’ll have to wait to see what applications actually come about and how people react to them. I think that the whole idea of voice associated with location in space has so many applications, and another one that comes to mind is training: just being able to trigger sounds when someone’s in a particular position, in the correct position or in the wrong position. I imagine dancers, you know, athletes. some people prefer there is a famous graphical Lebanese position into the space, and when you study these positions it’s really hard to make them properly, and looking at yourself in the mirror depends on what flight is so if you cannot

Judas position of Sound instead of having visual feedback, that could be really helpful for them. I can imagine it being used alongside the vision as well. I saw recently that they created some kind of mist in the air and illuminated parts of the mist to produce, you know, any image in 3D that you wanted. I can imagine that being projected into the middle of the dance floor: the dancers align themselves as best they can using their eyes, but obviously you can’t look at your whole body at the same time, so the audio would then activate at the same time to meet you in that perfect position. you get that but nice colours down the six SBC women like are there are the Seas of you losing the river referential space and you just fall down sometimes you know

there are people who lose their balance mean regarding nation and we are relearning some basic activities like speaking and moving an accident or something as well as good as a recovering basic aspects of communication and I’m talking about Tony merry learning movements as well. The other aspect which you touched on was the voice associated with the natural gestures and the movement. So there’s voice associated with location in space, but there’s also voice associated with the way we actually move our bodies, and I remember before the podcast we talked a bit about this, and I noticed there’s two possible ways this could be used:

we can adjust the assistant’s voice, if we’re talking to, you know, any voice assistant. Yeah, we could adjust the assistant’s voice based on the user’s location, posture or gesture. It can react to how you’re moving and your mood: maybe you’re moving slowly, maybe you’re very energetic, maybe you’re just circulating, and it can regulate how it talks to you. Or we can have some kind of, you know, human augmentation system which actually adjusts the person’s voice based on their own gestures or their posture etc. I find both of those aspects really fascinating. They’re two separate applications, but I would be interested to hear your thoughts on those as well. Yeah, I will talk about the second one first, because in the Synekine project there is also a body choir thing. So the idea was to use the t

when a singer is singing, he is making expressive gestures as well, to emphasise or even the vibrato or the accentuation with the hands. So we’ve been working with singers to augment their voices and apply them to a musical scale so as to make them sing as a choir, so other voices harmonize, and the way to harmonize, the number of voices and some parameters of the choir are adjusted by the gestures. And in this respect, within these instruments, we feel like singing was in now producing gestures, so we try to actually not ask the performer to do a specific gesture but just to catch

the expressivity and augment it with this choir. So, very simple moves: if you raise your hands then you have more voices, and if you enlarge your hands, then you give more space in between your hands and you have more volume. Are there any interesting we talk about the parameters of the choir with shouted as a ball, like a convertible that the singer is playing with, and that is defined by the position of the two hands and ended the planning of the the rear of the volume of sweet and the disposition of the place in the space and we

we will take decision and the level of the electronic choir that is harmonizing the voice in real time. You really have to see the video to do it. So yeah, maybe we can be a bit except if you are we can play it back

OK, so this is a guy in a room with his hands in front of him, outstretched. Just a single guy. It looks like he’s actually producing sounds with his mouth while moving his hands closer to and further apart from each other. It’s quite beautiful actually, the sound at least. He seems to be enjoying himself as well. So the movement of his arms is controlling the pitch, it seems, and the number of voices singing perhaps

or the tone of the singing at least. Periodically he’s adding new sounds with his voice; he’s not singing all the time. Now he’s really stretching his arms out.

OK, and so that’s for the 744 the first one to this is the augmentation. I mean, it is a case where the computer is listening to the expression and is using these parameters to enhance or to augment this expression. OK, so this is very interesting, because this is what will make the vocal assistants that are now more and more present in our houses a companion. This notion of companion is very important for me, and I think this will be the success of the vocal assistant technology is not a need

this guy is able to understand you but is also able to create a shared experience. It’s all about developing rapport with the with the letter when when we see anything else about mimicking actually. So when we see children in school, and they are best friends forever and they are saying the same words at the same time with the same intonation and the same, they are really mimicking a lot. A great example. going to turn it down a bit or disguise it in some way or a bit more subtle but we still have a short. I have little twins at home, so I can really tell you that they are really mimicking, and I think this mimicking

aspect is really important for the vocal assistant. So the vocal assistant is not only something that is a corporate voice that is answering all your desires; it has to be a vocal companion that is learning from you and also teaching you some new ways of saying things and what are the 20 words and create new words and, you know, going to the level of creativity in the language of Liars to stimulate and survives weekend off like feel a friendship wizards and their comprehension beyond the level of the semantic, which is already present with the computers, you know, Siri voice assistant say no talking talking about more Street you know dropping their teeth in their stuff in their special that you just say

So this is the next step of the day it’s a there any chance a lot to do with prosody and interaction and a quarter hours until to mimic the prosody is listed to marry the person you should try the request so as to require the end. And the other is acting that the vocal assistant could be spread in the house and not are in the world and not like I sent it it has to be to Aylsham High then you wait to see you just remove manchurian you in the space and perceive you as a human being in a wheelchair with all your or you’re a US patient means that I can also do investors and the end

And they’ll have to identify you as you, there being many people in the household. I think that the big players are already working on that, the identification of the user through their voice. But a lot of this gesture stuff, going back to, you know, playing on space and movement, they’re going to have to incorporate some new sensors. Although having said that, I did discover a couple of products on the market that can detect gesturing and proximity just with sound, which surprised me, because you imagine it happening only through a camera. So you think, well, we’re going to have to stick a camera on every voice assistant, which kind of defeats the point of it being at you know I actually know the company, and they produce ultrasound presence detection, gesture tracking and proximity sensing using the existing speaker and microphone on smart devices and mobile phones

89 production today, which is incredible. So, you know, these manufacturers, I’m not sure, I think it’s more involved than just an app that you install, whether you can just install an app and start detecting gestures. Or maybe you can, maybe I’m wrong, but I get the impression that this is something that manufacturers can adopt to offer gesture detection in their existing devices. And there are also many offerings that offer the same kind of functionality but with cameras, but I think that the ultrasound thing, although it is very short range, is quite interesting. Yeah, yeah. Well, this is the domain I think where there will be a breakthrough: the new smartphones will be the location into confined spaces, that will be the new revolution, and that will also be the case for other questions I think I accidentally on the sun

this natural conversation, creating rapport, having an emotional connection, that is all the rage now. I think that there’s already, you know, a great number of emotion detection companies, but now with the Google Duplex demo we’ve seen computers interact using voice only with humans on a realistic, you know, believable level. How staged it was we don’t know, but it was a pretty dramatic demo, and so I think this stuff is going to be coming thick and fast now. And I’m very confident as well, because we as humans adapt a lot to speak to the machine. I mean, so far, when you want to use Siri or so, you have to dictate. Actually, dictating is something we are not used to; it goes back to our first school time, and we better we are very keen in doing to the machine because we know that

the hardest level of Childhood lead singer just standing and we are very I think also that the auto tuner evolution during the day is Citigo aspect of The Voice is Robert Dyas The Voice in the market is is really showing that we As Human already fascinated by audiomachine speaks and we want to speak with it into and we are able to watch elite you repeat three times the same thing to to make Siri understand much so I think they’re and there will be more than One Direction that’s what I what I suggest is that these guys doesn’t really feel like having a proper social it’s a decision with booking system locations to be very weird and

V&A presents. I would like to come back home and, instead of having a TV on or a radio, I would like Amazon and Apple to be talking together in my room and that’s pretty some noises and they are seeing like a picking up everywhere they making playing with them and reporting them and they’re giving me the definition of the time, speaking, speaking, speaking, so that I can come back to natural interaction and say "shut up". And I don’t want to call on, you know, I don’t want to say "OK Google", because this is not something I do with people. I’m not saying a horse riding, I’m saying "hey man", you know, that kind of thing. And I just want to say to the guy that is speaking, speaking: calm down, don’t speak now, give me a break. Yeah, so you’re talking about voice assistants though

that are listening to you passively, making their own suggestions, interrupting, talking over you but then something with you’re not listening to me are acceptable it said. Now they really have to listen to you as well when they are speaking and I’ll tell you when they have finished speaking or just end up seeing how. They actually have to learn the art of conversation, because they’re going to be injecting themselves into human interactions, and if they’re awkward and unwanted they’re going to incur the wrath of people who just don’t want them involved. And I know that some people listening are going to be imagining this version of the future and thinking, wow, that’s just an absolute nightmare, you know, like Google and Amazon and Siri in a room talking to us all the time and us telling them to shut up. But a version of that, a useful, pared-down version of that, with, you know, the most relevant content delivered in a voice that you customized yourself, that, you know, you just love the sound

of, talking directly to your needs and giving you more, you know, value every day, I think that’s something that people will adapt to and embrace over time. Yeah that receives but yeah, I would be happy if, when I start to sing in my shower, there are some voices in the house that he is just that there are also like at making a short gig with me, without me calling on them. Another type of body choir videos not alone by the. Yeah, OK. I actually came across a news article today or yesterday about Siri: the new beta version of Siri includes a shortcut called raise to talk, which is you can actually just lift the

Apple Watch and speak to it directly without the wake word, without saying "Hey Siri" or whatever. So I thought, well, brilliant, it’s coming, it’s already started. We’ve got a gesture wake word that has to be on your wrist, but soon I think they’ll have gaze activation, where you can just look at it, and then one day it will just be listening to what you’re saying and know, just through the intonation, that you’re addressing it, without you having to look at it, that you’re talking to your voice assistant. But I know you’re more focused on the artistic side of things and how it can benefit performance. I appreciate you coming with us on the whole commercial journey there. I wanted to ask you, how does art influence technological development in your opinion? Because you spend a lot of time developing new tools for artists, but then you’ll

see that technology, as we’re talking about right now, we’re seeing the tools that you’re championing and spearheading come back into the public sphere to, you know, influence the things that we use day to day. and I like the Mongols and toxicology. So briefly I can talk to you about the story of February composer will. I was working on this voice synthesis software and he came to me and said, OK, I want to use this voice in my show when I want to let Weoley said that that is the machine there is a comfortable. And so I was working hard to produce a lot of sentences, and I was listening to them carefully to detect each of the artefacts in Italy duh duh duh duh duh duh duh duh duh was a pretty good so it didn’t make lots of fatty

So I was travelling and listening to a lot of, like, audio books in the sides, just to track the errors of the machine, and he just came and said, OK, no, you know, I want to make errors actually, I want to listen to errors. I mean, so the artist came in and really changed my mind and changed my vision of the problem. So we started to study what would be the errors of the machine. We were making a list of what were the artefacts of the synthesis, like when it speaks without breath, or when it speaks with information too fast or too slow or just kind of the amount of time anyway, and we pushed the limits of the synthesizer to produce these errors and to listen to them, and then after a 20 makes me a set of of to do things to make

there’s a better way. The artist came in with his vision and with his days away around the journey; he came and said, I want to see the problem reversed, upside down, and it just led me to a new vision of my own concerns. So that’s why I’m really tired now to where I’m bringing art into industry and innovation, because this is the four major artists and scientists are the same as just as a complement of visions that can really improve, in a fast way, the innovation. That’s brilliant, thanks very much, Greg. And so what will you be working on over the next 60

months until I’m trying to develop to get a singing synthesizer things together with reception area manager comments would you like to achieve a more radio interface to program the singing synthesizer. This is entirely artificially generated singing, is that right? Yeah, so we’re trying to make a nice interface for that, and also still working on the Synekine project to 20 block new where new technologies music The Grand gestures in the voice to bring the phone was sent you instruments on stage and a new experience for the public. Wonderful, wonderful. I encourage everybody to go to the

website, Greg’s website, to check out all the videos of the Synekine project; it’s really fantastic to see. And where can people find out about you won’t one of the websites they have these don’t have your videos on? Well, there’s two websites mainly: my personal website ga ga billyoh.com sky.com, and there is the website of the Synekine project that is in French Connection., s y n e a k i n e. Comm, and you will find the videos on YouTube as well, and you can follow me on Twitter at Craigavon Coppelia. Fantastic, OK, thank you very much once again a few times. I thank you for a working now. It’s been great to talk to you and I’m glad to do a podcast. I appreciate that you and I was more gas light years and it’s a sure thing yeah yeah

Bye bye. You just heard from Greg Beller, head of the Interfaces Research and Creation team at IRCAM in Paris. Greg started by explaining what prosody is and its importance in speech and communication, and then showed us a number of ways that speech prosody can be modified using technology that tracks our movements. I can imagine many applications for this. In terms of human augmentation, I can see people who give speeches, such as politicians and business people, modulating their vocal tone in real time according to their hand gestures, so that their physical movements are actually changing the sound of their voice, making it more dominant, persuasive, seductive, whatever they need. But I can also see voice assistants tracking our movements in the future to better gauge our mood and environmental context, and modifying the prosody of their speech to benefit

our needs. So for example, if you look tired after a long day, a voice assistant uses a more gentle voice for instance, whereas if we’re dancing around getting ready for a party, perhaps it shouts excitedly to match the energy in the room. In any case, I think it’s a foregone conclusion that body and gesture tracking, presence detection and mood recognition will be added to voice assistants in the near future. The technology already exists, and so it’s just a matter of integrating it and developing applications that leverage these new capabilities. That’s all for today. I hope you enjoyed the episode. As always, you can find the show notes with links to resources mentioned in the episode at voicetechpodcast.com, and do check out Greg’s Synekine website at synekine.com, which has all the YouTube videos of his performances, and yeah, they really have to be seen to be appreciated, so do check it out. You can also follow me on Twitter

voice tech calm. And if you’d like to appear as a guest on the show, or you just have an idea for a podcast episode that you think would be cool, drop me a message on Twitter at voicetechpodcast.com at wojtek forecast.com. To support the show you can just tell one friend or colleague about this episode or share a link on social media. Also, if you haven’t done so already, then please head over to iTunes or Stitcher and leave us a quick 5-star review. The first one to leave a review in the iTunes Canada or Australia stores will get a shoutout on the podcast. If you’d like to become a patron, you can support the show for as little as $2 a month at patreon.com / voicetechpodcast.com of another episode. Until then, I’ve been your host, Carl Robinson. Thank you for listening to the Voice Tech Podcast.

