
How to Use the Text-to-Speech Features on Your Website

The Text-to-Speech module was initially designed as a website accessibility tool for those who are vision impaired. The idea was that we'd have an audio version of an entire blog post or FAQ at the top of every page, but we found the output to be generally unreliable once graphs, rates, and numbering were included. We now recommend that brokers record a short podcast or video for each article instead. Yabber provides a large number of tools that make assigning any video to a page a piece of cake, and the video tools make creating the video as easy as recording it in Instagram. We know this consumes time, but we've made it as easy as we can, and even if you include only a small number of videos you'll still be doing more than anybody else, and search engines will recognise the effort.

The Text-to-Speech module is now more of a tool used by Yabber and its AI backend than something you'll utilise on your website, but it's there if and when you need it. This FAQ details how to render an audio file with the TTS shortcode or Elementor block. Note that there's a long TTS article on our blog that provides further insight; we'll use the same examples from that article on this page.

TTS API: The article in our blog provides details for the TTS (and STT) API. It's very much an API-centric tool, and the API has become the primary method of integrating the tool with other applications.

  Basic and Advanced Modules

The TTS module comes in two flavours, General and Advanced, and it's the former that is used by default. The Advanced model is still reserved for managed clients and will likely remain so.

  The Result

Example General audio is as follows (each took just a few seconds to generate). You'll note that it often fails to correctly identify inflections and areas for emphasis, but it's mostly an excellent result.

For the purpose of demonstrating voices, we used the following short snippet of text:

Belief Media provides the highest performing digital and lead generation solution in Australia. Whether you're looking to build an organic source of traffic, manufacture partner or referral programs, or improve your social visibility, we provide the highest-performing and fully-owned solutions, which are backed by leading technology and support.
[bm_tts voice="ljspeech_low"] .. [/bm_tts]

Audio Pending.

[bm_tts voice="popey"] .. [/bm_tts]

Audio Pending.

[bm_tts voice="arctic"] .. [/bm_tts]

Audio Pending.

[bm_tts voice="hifi"] .. [/bm_tts]

Audio Pending.

The Audio Pending Message: The audio above is used by a large number of demo systems and is often recreated with various inflections. You may see 'Audio Pending' while it's in the queue.

Inflections may be applied, and mood may be modified by way of classifiers such as amused, angry, disgusted, drunk, neutral, sleepy, surprised, and whisper. Voice usage is introduced in more detail in the more comprehensive API documentation.

The Advanced Model is commercial grade in nature, and returns results that are comparable with high-priced systems. Until a dedicated server is assigned, the model has limited availability. A few samples are as follows:

[bm_tts voice="atkins" version="1"] .. [/bm_tts]

[bm_tts voice="dotrice" version="1"] .. [/bm_tts]

[bm_tts voice="random" version="1"] .. [/bm_tts]

[bm_tts voice="random" version="1"] .. [/bm_tts]

As detailed below, the advanced model will reproduce your own voice in a manner that is almost indistinguishable from an actual recording.

In our blog article we noted the possibility of malicious use of the system by emulating celebrity and other well-known voices. The following is a generated emulation of Tom Hanks:

[bm_tts voice="tom" version="1"] .. [/bm_tts]

We routinely see this tech being used to clone the voices of people such as Elon Musk to support binary and crypto trading scams. It's a massive problem.

  TTS Elementor Block and Shortcode

All audio examples on this page were created with an in-post shortcode. The audio is created by wrapping the text you'd like converted to audio in the opening [bm_tts] and closing [/bm_tts] tags. All the options in the request function are available as shortcode attributes (such as voice="arctic").
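If you generate posts in bulk outside the editor, the wrapping described above can be sketched as a small Python helper. This is only an illustration: the function name `build_tts_shortcode` is our own invention for this example, and the only attributes we know from this page are `voice` and `version`. The helper simply refuses attribute values that would break the shortcode rather than trying to escape them.

```python
def build_tts_shortcode(text: str, **attributes: str) -> str:
    """Wrap text in [bm_tts] ... [/bm_tts] tags with optional attributes.

    Hypothetical helper for illustration; it only assembles the string
    and knows nothing about which attributes the module accepts.
    """
    if "[/bm_tts]" in text:
        raise ValueError("text must not contain the closing tag")
    parts = ["bm_tts"]
    for name, value in attributes.items():
        value = str(value)
        # Quotes and square brackets would terminate the attribute or tag early.
        if any(ch in value for ch in '"[]'):
            raise ValueError(f"unsupported character in attribute {name!r}")
        parts.append(f'{name}="{value}"')
    return f"[{' '.join(parts)}]{text}[/bm_tts]"


if __name__ == "__main__":
    print(build_tts_shortcode("Welcome to our website.", voice="arctic"))
```

Attributes are written in the order you pass them, so `build_tts_shortcode("Hi", voice="atkins", version="1")` mirrors the Advanced examples shown earlier on this page.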

Available Voices: Default parent voices are apope, arctic, hifi, ljspeech, ailabs, and vctk, although there are dozens of child voices available from within these collections.

Processing Time: Once an article is published, the text 'Audio Pending' will be returned in place of the audio. It can take a short time before the audio and STT subtitles are returned to your website. This is done to avoid processing multiple audio files for a single post.
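If you'd rather not keep refreshing the page, a rough way to check whether the audio has arrived is to look for the placeholder text in the rendered page. The sketch below is an assumption on our part rather than anything built into the module: the function names are made up for the example, and `fetch` is any function you supply that returns the page HTML as a string.

```python
import time
from typing import Callable

PENDING_TEXT = "Audio Pending"  # placeholder wording quoted in this FAQ


def count_pending_audio(page_html: str) -> int:
    """Count how many 'Audio Pending' placeholders remain in a page."""
    return page_html.count(PENDING_TEXT)


def wait_for_audio(fetch: Callable[[], str], attempts: int = 10,
                   delay_seconds: float = 30.0) -> bool:
    """Re-check the page until no placeholders remain or attempts run out."""
    for attempt in range(attempts):
        if count_pending_audio(fetch()) == 0:
            return True
        if attempt < attempts - 1:
            time.sleep(delay_seconds)  # pause before fetching the page again
    return False
```

Keep in mind that a page which mentions the phrase in its own text (as this FAQ does) will always report at least one match, so treat the count as a hint only.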

The Elementor plugin is very much an early tool that should be used with the understanding that it may change shape before it makes its way into a Release Candidate. Until we receive the appropriate feedback from clients we aren't entirely sure what functionality will be required.

Search for 'TTS' in Elementor, drag the block onto your page, enter some text, save the page, and wait.

BM TTS Elementor

  Pictured: The Elementor block used to create audio from text. You simply drag the textbox onto your page, enter the text, select a voice, and save. It will be a short time before the audio is returned to your website. The inset shows the advanced options and the voice library. The webhook option allows you to send audio created within your website environment to external resources.

As best we can tell, our Elementor TTS Block is the only example of its kind.

  Providing Your Own Voice

To generate a voice modelled on your own, we require at least ten 10-second clips of your natural voice without excessive stuttering or vocal disfluencies (filler words such as 'umm' or 'aaaa'). You'll want to save each clip as a WAV file in floating point format with a 22,050 Hz sample rate. Avoid clips with background music, noise, or reverb, and don't include speeches as they generally contain echoes. In short, you'll want nice clean clips created specifically to train your own voice model. The ideal source of data is a recording of you reading a blog post from your website, as this is where we'll initially use your model.
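Before sending clips through, you might want to check them against the requirements above. The following Python sketch reads the WAV header directly, so it needs no audio library. It is our own illustration and not part of Yabber: the function names are hypothetical, and we've assumed that ten seconds is a minimum length for each clip.

```python
import struct

TARGET_RATE = 22050          # sample rate named in the text above
MIN_SECONDS = 10.0           # assumption: treat ten seconds as a minimum
MIN_CLIPS = 10               # at least ten clips are requested
FORMAT_IEEE_FLOAT = 0x0003   # WAVE format tag for floating point samples
FORMAT_EXTENSIBLE = 0xFFFE   # wrapper format; the real tag sits in a sub-field


def read_wav_info(data: bytes) -> dict:
    """Walk the RIFF chunks of a WAV file held in memory."""
    if len(data) < 12 or data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    info = {}
    pos = 12
    while pos + 8 <= len(data):
        chunk_id = data[pos:pos + 4]
        (size,) = struct.unpack_from("<I", data, pos + 4)
        body = pos + 8
        if chunk_id == b"fmt " and size >= 16:
            tag, _channels, rate, _byte_rate, align, _bits = struct.unpack_from(
                "<HHIIHH", data, body)
            if tag == FORMAT_EXTENSIBLE and size >= 40:
                # The first two bytes of the sub-format GUID hold the real tag.
                (tag,) = struct.unpack_from("<H", data, body + 24)
            info.update(format_tag=tag, sample_rate=rate, block_align=align)
        elif chunk_id == b"data":
            info["data_bytes"] = min(size, len(data) - body)
        pos = body + size + (size & 1)  # chunks are padded to an even length
    if "format_tag" not in info or "data_bytes" not in info:
        raise ValueError("missing fmt or data chunk")
    bytes_per_second = info["sample_rate"] * info["block_align"]
    info["seconds"] = info["data_bytes"] / bytes_per_second if bytes_per_second else 0.0
    return info


def check_clip(data: bytes) -> list[str]:
    """Return a list of problems; an empty list means the clip looks fine."""
    info = read_wav_info(data)
    problems = []
    if info["format_tag"] != FORMAT_IEEE_FLOAT:
        problems.append("samples are not floating point")
    if info["sample_rate"] != TARGET_RATE:
        problems.append(f"sample rate is {info['sample_rate']}, expected {TARGET_RATE}")
    if info["seconds"] < MIN_SECONDS:
        problems.append(f"clip is {info['seconds']:.1f}s, expected at least {MIN_SECONDS:.0f}s")
    return problems
```

To check a folder of recordings, read each file with `pathlib.Path(name).read_bytes()`, pass it to `check_clip`, and confirm you have at least `MIN_CLIPS` files that come back clean. The header can't tell you anything about background noise or reverb, so you'll still need to listen to each clip.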

For all managed clients, we'll create a voice for you at no charge. For others, there is a fee involved.
