Writing for audio: proven tips and tricks to make the most out of a text-to-speech solution

by Ron Jaworski

Having software read text aloud is not just a challenge for the software’s creators; it’s also a challenge for content creators. To make sure your content sounds good as audio, you need to put in a little more effort from the beginning and write for audio.

It’s an important aspect because it will mean a great deal for user experience (UX), and at the end of the day, it’s the UX that will separate great content creators (and great content!) from the mediocre ones.

For the next few minutes, I’ll do my best to draw your attention towards a few details that will eliminate potential headaches while using a realistic text-to-speech solution, as well as share a few tips on how to choose the right words and deliver quality storytelling in a voice-first world.

Keep both ears on the context

In any form of storytelling, context matters. In a voice-first world, it is paramount. When it comes to painting a clear picture from the start, text (or any other form of visual media, for that matter) has a huge advantage over audio. With text, if you’re pressed for time and want to read a piece fast, or if you just want to see whether it even tackles the topics you’re interested in, you can simply skim through it, catch a few headings and understand the essence quickly.

However, listening to a post is an entirely different beast, mostly because a spoken story is linear: the listener can’t skim ahead. Also, don’t forget that a listener isn’t as actively involved as a reader, so be prepared for them to be doing something else at the same time.

That’s why you always need to keep things in context. Share the right information from the get-go, and then expand on it as you go. Don’t fall into the classic journalistic trap of squeezing as much information as possible into the introduction. That will only overload your listeners, making them lose interest quickly.

Lose the demagoguery

Writing gives us the “luxury” of longer sentences. You break them down with commas, which also serve as a visual aid while reading. Readers can also go back and reread parts of a sentence to better understand the message. Even so, many SEO tools will suggest shorter sentences, and with text-to-speech this becomes far more important.

The spoken word has its own rhythm and its own pace, and shorter sentences fit it better. It’s no wonder that politicians and trained speakers opt for shorter sentences. Just put yourself in the shoes of a listener: which is easier for you to follow, a six-word sentence or a 36-word one?

To make sure you’re properly understood throughout the piece, your sentences should never hold more than two ideas.
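One way to apply the advice above is to flag overly long sentences before sending text to a text-to-speech tool. The following is only a rough sketch: the function name `long_sentences` and the 20-word threshold are assumptions made for illustration, not figures from this article, and the sentence splitting is deliberately naive.

```python
import re

MAX_WORDS = 20  # assumed threshold for illustration; tune it for your own content


def long_sentences(text, max_words=MAX_WORDS):
    """Return (word_count, sentence) pairs for sentences longer than max_words."""
    # Naive split after ., ! or ? followed by whitespace.
    # Abbreviations such as "e.g." will produce false splits.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    flagged = []
    for sentence in sentences:
        count = len(sentence.split())
        if count > max_words:
            flagged.append((count, sentence))
    return flagged


if __name__ == "__main__":
    draft = "Keep it short. " + " ".join(["word"] * 25) + "."
    for count, sentence in long_sentences(draft):
        print(f"{count} words: {sentence}")
```

A word count cannot tell you how many ideas a sentence holds, so treat the flagged sentences as candidates for a manual rewrite rather than as errors.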

Describe things as much as you can

To increase engagement, you might want to make your sentences as descriptive as possible. People can forgive a spelling or grammar mistake, but nobody is going to tolerate a boring article.  

That means that heavy changes to your style will (most likely) be required. Don’t spare anything – verbs, adjectives, adverbs… go all out. Try to use as many analogies and metaphors as you can as they’re an amazing tool to explain an idea while sounding authoritative and convincing.

Don’t forget comparisons! Comparing things will help your listeners associate your ideas with something already familiar, helping you get your conclusions across with ease.

Also, make sure to take advantage of signposts – words such as therefore, consequently, thus, hence. These will ensure your article flows, as they are great at building descriptive narration. Think of yourself as the narrator – you need to tell your visitors where you’re going with your thoughts and why. 

Things to steer clear of

While some content will be a natural fit for audio, other content could utterly destroy your vibe. It’s just the way text-to-speech is: great but not perfect. Be prepared to edit your content more for audio than for text until you get the hang of it. Writing text and writing what’s essentially narration are as different as driving an F1 car and driving an off-road 4×4: both are driving, but the skills differ significantly.

I’ve already mentioned avoiding longer sentences, but that’s not all you should avoid. It’s a bit like being Indiana Jones – there are a lot of tiny booby traps to watch out for. They’re not exactly going to kill you, but they won’t do your UX any favors, either.

So when it comes to abbreviations, acronyms, and numerals, it’s best to simply write them out. Special symbols like the hashtag or the “at” sign (sometimes also called “monkey” for whatever reason) fall into this category as well: just type them out.
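Writing out symbols and abbreviations can be partly automated as a pre-processing step. The sketch below is a minimal illustration: the `spell_out` function and both lookup tables are hypothetical and would need extending for real content, and numerals are left out because spelling them correctly depends on context.

```python
import re

# Illustrative mappings only (assumptions, not an exhaustive list).
SYMBOLS = {"#": " hashtag ", "@": " at ", "&": " and ", "%": " percent"}
ABBREVIATIONS = {
    "e.g.": "for example",
    "i.e.": "that is",
    "UX": "user experience",
}


def spell_out(text):
    """Replace known abbreviations and symbols with their spoken form."""
    for abbr, full in ABBREVIATIONS.items():
        # The lookarounds stop "UX" from matching inside a longer word.
        pattern = r"(?<!\w)" + re.escape(abbr) + r"(?!\w)"
        text = re.sub(pattern, full, text)
    for symbol, word in SYMBOLS.items():
        text = text.replace(symbol, word)
    # Collapse the extra spaces introduced by the symbol replacements.
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    print(spell_out("Great UX, e.g. #audio & 50% more @home"))
```

A lookup table like this only covers the cases you list, so it complements listening to the result rather than replacing it.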

Keep an eye on links as well. While they may do wonders in writing, they aren’t much help in audio. Either omit them or write them out fully, if possible.
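For the "omit them" option, bare URLs can be stripped from the text before it is read aloud. This is a hedged sketch: `drop_links` is a hypothetical helper, it only handles plain `http://` and `https://` URLs, and it will also swallow punctuation that is glued to the end of a URL.

```python
import re

# Matches a bare http(s) URL up to the next whitespace character.
URL_PATTERN = re.compile(r"https?://\S+")


def drop_links(text):
    """Remove bare http(s) URLs and tidy the leftover whitespace."""
    without_urls = URL_PATTERN.sub("", text)
    # Remove any space left dangling before punctuation.
    tidy = re.sub(r"\s+([.,;:!?])", r"\1", without_urls)
    # Collapse runs of whitespace left where a URL used to be.
    return re.sub(r"\s{2,}", " ", tidy).strip()


if __name__ == "__main__":
    print(drop_links("See the demo https://example.com/demo for details."))
```

If the sentence no longer makes sense once the URL is gone, that is a sign the link should be described in words instead.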

Homonyms (words that sound the same but have different meanings) are also a dangerous trap, because a listener can’t see the spelling to tell them apart. Where you can, swap them for synonyms that can’t be misheard.

Slang is another important consideration. In many situations, it’s culture-dependent or linked to a specific location. Unless you know exactly who is listening, I’d recommend staying away from it. 

And finally, be careful when writing a phone number or an email address. Voice technology is still in its infancy, and there are still kinks that need ironing out.

Pro tip: (have someone) read your work aloud first

You’re going to proofread your work, right? So why not proof-hear it, too? Hearing what your audience would hear is the best way to find out whether your content works. Even better, you can paste your text into Trinity Audio’s demo page and listen to exactly how things would play out. If you’d like to hear more about Trinity Audio’s solution, check out this podcast interview.

Final thoughts

As more content creators turn to audio content, now is the best time to start thinking about the quality and the improved user experience it brings. If you’re looking to monetize people’s attention, you should definitely try out a text-to-speech solution that has already grown affordable and accessible.

These tools are still a work in progress when it comes to certain elements. I’m confident that sooner rather than later, we’ll work out these kinks (and then some). Until that happens, make sure to follow these proven practices and provide your listeners with an amazing experience overall. 

About the author

Ron Jaworski

Ron Jaworski is the co-founder and CEO of Trinity Audio. He is an adTech veteran and a big believer in the power of voice and audio.
