The problem with audio content marketing

by Carl Robinson

Audio content is king

I love audio content. I’m a fan. An addict perhaps.

I listen to podcasts on my smartphone when I’m travelling on foot. I ask Deezer to “Play my flow” in the car. And all too often I ask Google to “Play the news” in the bathroom. And I’m not alone.

Some of the most successful voice apps are content-driven. Music and news are among the top uses of smart speakers, and sleep sounds apps have brought their developers great riches. Audio content apps collectively hold the title of ‘killer app’ on voice-enabled platforms, for now at least.

As the founder and host of the Voice Tech Podcast, it goes without saying that I think it’s important for humans to be able to communicate naturally with computers. This applies equally to accessing both services and content. This is because I see huge improvements and potential benefits offered by voice-enabled, multimodal interfaces.


Accessibility improvements allow the elderly, children and disabled to enjoy the advantages offered by today’s IT systems in a full and independent fashion.

A great example of this came from my conversation with Bank Independent, who described how one of their customers with macular degeneration (loss of eyesight) was able to regain control of his finances by using their new Alexa skill for banking. The hands-free nature of voice interfaces benefits all users, however. Asking for on-demand audio content in the car, using a system such as Audioburst, makes journeys safer and more enjoyable.

Functionality improvements abound too.


I’m particularly excited about seeing more personalised content delivered through voice-enabled devices, possibly enabled by biometric authentication systems such as IDRND. This voiceprint and liveness check technology can identify you during natural use of the product, completely eliminating any login or authentication procedure.

Crucially, zero-auth technology such as this allows a user to access any voice-enabled device in the world, and instantly receive personalised content and services. The implications of this are hard to overstate.


Interactivity is coming to our traditional content channels too. Pandora recently implemented interactive ads (commercials), and Spotify has experimented with the concept as well. My conversation with Instreamatic’s Stas Tushinskiy explores this in depth.

Emotional Connection

The combination of interactivity with the inherent intimacy of audio can create a powerful emotional connection with the user. For children, Novel Effect bring the reading experience alive with their companion voice app.

For adults, voice dating is being explored by projects such as OnlyOne (my conversation with Marek was particularly popular!).


Audio content is a marketer’s dream, with an Interactive Advertising Bureau (IAB) and Edison Research study showing that 65% of listeners are likely to buy a product after hearing an ad in a podcast. Conversion rates are through the roof with podcasts, often attributed to the feeling of trust listeners have with the host.

The Audio Opportunity

There’s a huge opportunity in audio content right now, due to the convergence of four growing trends: content marketing, podcasts, voice search and voice assistants.

Content marketing

Content marketing typically involves blogging, social media, email and video. The global content marketing industry is projected to enjoy a compound annual growth rate (CAGR) of 16% according to a 2017 forecast by market research firm Technavio.

PQMedia forecast a global content marketing spend of $330B in 2019, rising to a whopping $470B in 2021.

And Brafton’s analysis shows that the most successful B2B marketers now spend 40% of their budget on content marketing.
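As a rough sanity check on forecasts like these, you can work out the annual growth rate implied by any two figures. The sketch below applies the standard CAGR formula to the two PQMedia numbers quoted above. The function name `cagr` is purely illustrative.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values, as a fraction."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# PQMedia forecast quoted above: $330B in 2019, rising to $470B in 2021.
implied = cagr(330, 470, years=2021 - 2019)
print(f"Implied growth: {implied:.1%} per year")
```

On those two figures the implied growth works out at roughly 19% a year, in the same ballpark as the Technavio projection.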


The elephant in the room here is audio. There are now in excess of 1 million podcasts in the Apple Podcasts directory, and this number is growing at 38% CAGR.

The Infinite Dial 2020 figures leave no doubt that listening to podcasts is now a truly mainstream activity, with 55% of Americans having listened to at least one podcast, and 100 million having enjoyed one in the past month. This growth is driven in large part by the growing multitude of ways to listen to on-demand audio, such as hearables (e.g. Apple AirPods) and smart speakers (e.g. Amazon Echo).

CMOs and marketers are taking note. Marketing experts, SageFrog, report that 22% of US companies will invest in podcast marketing in 2020, and this figure is rising fast.

IAB figures predict that podcast marketing spend will exceed $1B in 2021, mostly on advertising. This is a CAGR of 30% since 2018.

Audio SEO

It must be noted that this growth won’t come from podcasts and flash briefings alone. We’ll see increasing amounts of audio content embedded on websites, and delivered as voice app responses, given the trend towards requesting content through voice interfaces.

A particularly exciting trend is audio SEO, where audio content is returned in response to searches. Google is increasingly treating audio as a first-class citizen, and returning podcast episodes in SERPs. Additionally, it is returning videos with the seek bar moved to the exact point in the video that best answers the search query, and it’s highly likely we’ll see the same for podcast episodes.

Voice Search

Search Engine Watch reports that 20% of queries made through Google’s mobile app and Android devices are made with voice. Statista figures show that 31% of smartphone users use voice at least once a week.

A hotly debated figure is from Comscore, who predicted that half of all online searches will be made through voice by 2020. Meanwhile, Gartner predicts that in 2020, 30% of online searches will be made on devices without a screen.

Whichever figures you choose to believe, voice search is a mega-trend that is hard to ignore, and returning audio content as a response will make sense for a significant number of voice searches.

The demand for audio shows no sign of abating, and the demand for audio content creation will rise accordingly over the coming years.


The problem with audio

However, despite the boom in content and audio, only 10% of companies will produce any audio content themselves. From my personal experience as a podcaster, and from my many conversations with marketers, I have learned that this is because today’s audio creation tools are very slow and costly to use.

Companies that wish to produce high quality audio on a regular basis have two choices:

  1. Pay an agency or freelancer to create audio on their behalf, which is prohibitively expensive for most small to medium-sized companies
  2. Produce the audio in-house and invest heavily in equipment, software, training and employee time.

Here’s an outline of the steps in the traditional process to create a podcast:


  • To record high quality audio you need to hire or purchase equipment such as a microphone and an audio interface
  • To record guest interviews remotely, you will also need to buy recording software such as Squadcast or Zencastr


  • To edit the raw audio files, you will need to purchase digital audio workstation software such as Hindenburg or Adobe Audition (and learn how to use it)
  • You could also opt to use Descript or TypeStudio, which are tools that transcribe the audio into text, then let you edit the audio by simply editing the text.
  • Purchase the rights to use a jingle from companies such as AudioJungle.


  • To make your finished audio files publicly accessible, find a podcast host such as BuzzSprout or use a ‘free’ service such as
  • Submit your RSS feed to the many podcast directories, the most important being Apple Podcasts
  • Create an image for your show, perhaps with Canva or Visme.


  • It’s highly recommended you create a web page for your podcast episode. You could use WordPress, Wix, or SquareSpace.
  • Tell people about your new audio by leveraging your existing marketing channels, such as social and email.
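To give a feel for what the RSS feed mentioned in the steps above actually contains, here is a minimal sketch that builds a bare-bones RSS 2.0 feed with Python’s standard library. The function name `build_feed`, the dictionary keys and the example.com URLs are all placeholders for illustration. Real directories typically expect extra metadata (artwork, categories and so on), which a podcast host normally generates for you.

```python
import xml.etree.ElementTree as ET


def build_feed(show_title, show_link, show_description, episodes):
    """Return a minimal RSS 2.0 feed as a string.

    `episodes` is a list of dicts with illustrative keys:
    title, audio_url, size_bytes.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = show_title
    ET.SubElement(channel, "link").text = show_link
    ET.SubElement(channel, "description").text = show_description

    for ep in episodes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = ep["title"]
        # The enclosure element points podcast apps at the audio file.
        ET.SubElement(
            item,
            "enclosure",
            url=ep["audio_url"],
            length=str(ep["size_bytes"]),
            type="audio/mpeg",
        )
    return ET.tostring(rss, encoding="unicode")


feed = build_feed(
    "My Show",
    "https://example.com/podcast",
    "A placeholder show description.",
    [{"title": "Episode 1",
      "audio_url": "https://example.com/ep1.mp3",
      "size_bytes": 12345678}],
)
print(feed)
```

Each new episode is simply another `item` in the feed, which is why directories only need the feed URL once.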

In my experience, a 1-hour podcast episode takes me 10 hours to produce. And I’ve produced over 75 podcast episodes to date, so I have well-established processes and a lot of practice.

The same problem existed for websites 20 years ago; to establish a web presence you had to hire an expensive web developer to handcraft your website in HTML and CSS, costing thousands of dollars. Now, anyone can create a webpage in minutes using services such as WordPress or Medium.

Today, audio faces the same problem. Audio content creation needs to be democratised with the development of new tools.

Additionally, education around audio content is poor, with some business leaders doubting the effectiveness of audio to generate leads and sales. Podcasts still have a stigma associated with them, often being seen as ‘indie’ and hence unprofessional. Also the download numbers for podcasts are usually far lower than the number of clicks or likes one can generate with a blog article or social media campaign. By failing to appreciate the difference in engagement quality between the content mediums, marketers often make the mistake of discounting audio as a viable marketing channel.

Another concern is audio content discoverability, a very real problem that is being tackled by some of the biggest tech companies in the world, such as Spotify and Google.

When you consider all of the above, it’s no wonder that most companies choose not to create any branded audio, and instead opt to simply advertise on third-party podcasts, if at all.

The solution: automate & democratise

As a solution to this problem, we’ve just launched Rumble Studio, the first end-to-end solution to plan, record, edit and distribute audio content in a fraction of the time it currently takes. It’s a web and mobile app to record guest interviews asynchronously; as a content creator, you simply write some questions, and invite your guest. That’s it.

The system then interviews the guest on your behalf in a conversational manner, asking the questions for you and recording the guest answers. We are building a conversational AI that will interpret the content of the guest’s answers using natural language processing, measure the response duration, voice emotion, and a number of other factors, and react appropriately.

The system then dynamically generates sensible and specific follow-up questions, which lead to longer and more spontaneous interactions, and make the recorded audio more engaging and useful for listeners.

Audio editing is performed automatically by the system, and the finished interview can be hosted on Rumble Studio and submitted to podcast directories.
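To make the flow above a little more concrete, here is a deliberately simplified sketch of an asynchronous interview session. It is not Rumble Studio’s implementation. The class names, the `min_seconds` threshold and the fixed follow-up prompt are all assumptions for illustration, and a simple duration rule stands in for the conversational AI described above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Answer:
    prompt: str
    audio_file: str          # path to the guest's recording (hypothetical)
    duration_seconds: float


@dataclass
class Interview:
    questions: List[str]
    min_seconds: float = 20.0   # assumed cut-off for a "short" answer
    answers: List[Answer] = field(default_factory=list)
    _next_index: int = field(default=0, init=False)
    _pending_follow_up: Optional[str] = field(default=None, init=False)

    def next_prompt(self) -> Optional[str]:
        """Return the next thing to ask the guest, or None when finished."""
        if self._pending_follow_up is not None:
            prompt, self._pending_follow_up = self._pending_follow_up, None
            return prompt
        if self._next_index < len(self.questions):
            prompt = self.questions[self._next_index]
            self._next_index += 1
            return prompt
        return None

    def record(self, prompt: str, audio_file: str, duration_seconds: float) -> None:
        """Store an answer; queue one follow-up if a scripted answer was short."""
        self.answers.append(Answer(prompt, audio_file, duration_seconds))
        if prompt in self.questions and duration_seconds < self.min_seconds:
            self._pending_follow_up = "Could you expand on that a little?"


interview = Interview(questions=["What does your company do?", "Why audio?"])
first = interview.next_prompt()                 # first scripted question
interview.record(first, "answer1.wav", duration_seconds=8.0)
print(interview.next_prompt())                  # a follow-up, as the answer was short
```

Because each answer is stored against its own prompt, the guest can respond whenever it suits them, and every question-answer pair remains a separate block of audio.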

Save time & money

By automating the recording and editing, you save time and money by avoiding:

  • Scheduling (and rescheduling) meetings with the guest
  • Conducting the interview yourself and investing in equipment
  • Editing the audio yourself and learning to use software

New benefits & opportunities

The nature of this system offers many advantages and new creative opportunities:

  • Increase your content output, producing more audio in less time
  • Quickly canvass opinion, sending the same questions to multiple guests at once
  • Get better quality answers, as guests can take their time with no pressure
  • Access hard-to-reach guests who don’t have time for long interviews
  • Use branded text-to-speech voices for host and/or guest

ROI increased 10X

Additionally, the atomic question-answer audio blocks can be easily repurposed in multiple ways:

  • Podcasts
  • Video clips for social media
  • Audio FAQs on websites
  • Voice app responses
  • Voice search results in Google
  • Many more

Rumble Studio makes audio creation as easy as writing a blog post, where audio is 10X faster to create, and 10X more easily distributed, repurposed and discovered. In this way, the ROI of audio content is increased 10X, making it a viable marketing channel for businesses of all sizes.

The MVP is now live, with hundreds of active users on the platform already. It’s currently free for beta users, so head over to and sign up now.

About the author

Carl Robinson

Carl is the host of the Voice Tech Podcast. Since launching in April 2018, Carl has conducted scores of in-depth interviews with voice industry experts, building one of the most well known media brands in the voice AI space.

He also publishes Voice Chops Tuesday, the number 1 voice technology newsletter, enjoyed by thousands of voice tech fans each week.

Carl is a startup founder, product manager and data scientist, and recently presented a model for voice emotion conversion at ICASSP 2019.

