The BM Text to Speech and Speech to Text API, Website Shortcode, and Elementor Block

We recently conducted an experiment in Artificial Intelligence that required highly accurate speech-to-text and text-to-speech functionality. In short, we assigned mortgage broker tasks to an AI and had the system carry a loan (with supervision) from a website-based AI-enquiry through to 'settlement'. The results give us a very clear indication of how the industry might evolve in the very short term, and it highlighted the efforts and technology that businesses might need to employ in order to effectively compete in a changing market. Every stage of the broker stack and workflow was faster and more efficient than human interaction - from assessing suitability for certain products, assessing risk, handling documents, through to 'CRM management'. A video discussion is forthcoming where we discuss the implications of this experiment. Until now, we've seen AI and tools such as ChatGPT used to perform functions - not tasks... and it's the understanding of a workflow, and the AI's ability to automate the process with virtually no human interaction that'll have very pressing and profound implications on those organisations that fail to implement appropriate changes into their operation.

In all cases, those files carried by an AI had a higher customer satisfaction rating when compared to those handled by a broker or processor, and in all cases AI was at least twice as fast. The implications of AI systems in the finance industry will fundamentally change the way we do business. All aggregation groups and individual brokers need to assess their position now. The rate at which various products are making their way into the marketplace is unprecedented in the technology space, and those that fail to recognise the need for change are essentially driving the Titanic.

Our text-to-speech (TTS) and speech-to-text (STT) modules were used to automate various types of communication for the process we've just described, and both these modules are now available to clients through website shortcode, an Elementor widget, and the API. This article provides a brief introduction to the STT and TTS modules.

  Introduction

Text-to-speech (TTS) and speech-to-text (STT) functionality was required to give the AI a voice, and to accept human instructions from the broker. We connected the AI to our VOIP system (meaning it was capable of making two-way phone calls), and we sent automated voice notifications via SMS and email. Rather than use an external system, we updated our own APIs in a way that was designed to suit our very specific needs. This API is now available to clients for general use.

The very first feature we'll introduce by way of the TTS system is an accessibility tool that will render your article audio at the top of each post on your website. It's a basic application in terms of what the system is capable of, but that doesn't diminish the value of the new product. This feature will apply to articles you manufacture yourself, and articles sent to your website via the article distribution system. As discussed shortly, some users may provide us with voice samples so we're able to emulate their own voice on their own website; otherwise a selection of default voices applies.

OpenAI: Interestingly, the open sourced backend STT feature (generating transcriptions from video or audio) is powered by OpenAI (the group behind ChatGPT), suggesting that they're ridiculously close to building speech recognition into their own tools. This is game-changing.

Partner Article Audio: Our partner article program has established itself as one of the most powerful partner tools in the industry, and it's supporting brokers with the fastest-growing partner networks. The accessibility audio will be made available in these articles, although it'll be a couple of weeks before we resolve the most effective method of including the feature.

As stated earlier, the APIs we're introducing are a small part of our future with AI, but they do give the Intelligence a voice and ears - essential when the systems play a part in the processing component of your operation.

  Text to Speech (TTS) API

The TTS engine isn't fast, so all requests are queued (with managed clients processed ahead of standard subscribers). A POST request is made to the API, and a response is returned indicating that the text is queued for processing. Once complete, a webhook is sent to a defined URL with details and download link. The webhook optionally carries the encoded MP3/WAV data in the request.

There are currently two distinct TTS engines. One produces very human-like speech (often indistinguishable from an actual voice), and the second produces a slightly more synthetic voice. The latter is available to virtually all clients because it consumes very limited processing resources. The former model is part of our ongoing AI development and is limited to selected managed clients and limited modules.

For the purpose of demonstrating voices, we used the following short snippet of text:

Belief Media provides the highest performing digital and lead generation solution in Australia. Whether you're looking to build an organic source of traffic, manufacture partner or referral programs, or improve your social visibility, we provide the highest-performing and fully-owned solutions which are backed by leading technology and support.

A large number of voice options apply for both models.

General Model

The following are default voice samples provided by the synthesis library, although as with the advanced model, we can (and will) manufacture our own suite of voices. Default parent voices are apope, arctic, hifi, ljspeech, ailabs, and vctk, although there are dozens of child voices available from within these collections.

Example audio is as follows (each took just a few seconds to generate).

[bm_tts voice="ljspeech_low"] .. [/bm_tts]

[bm_tts voice="popey"] .. [/bm_tts]

[bm_tts voice="arctic"] .. [/bm_tts]

[bm_tts voice="hifi"] .. [/bm_tts]

Various inflections may be applied, and mood may be modified by way of various classifiers, such as amused, angry, disgusted, drunk, neutral, sleepy, surprised, and whisper. Voice usage is introduced in more detail in the more comprehensive API documentation.

Advanced Model

The advanced model is commercial grade in nature, and returns results that are comparable with high-priced systems. Until a dedicated server is assigned, the model has limited availability. A few samples are as follows:

[bm_tts voice="atkins" version="1"] .. [/bm_tts]

[bm_tts voice="dotrice" version="1"] .. [/bm_tts]

[bm_tts voice="random" version="1"] .. [/bm_tts]

[bm_tts voice="random" version="1"] .. [/bm_tts]

As detailed below, the advanced model will reproduce your own voice in a manner that is almost indistinguishable from an actual recording.

Note: It's likely we'll initially introduce the advanced model into the automated SMS modules. This will limit the character count and reduce the overhead associated with processing the data. This will enable personalised SMS messages (in your own voice) with a name after various types of subscriptions and trigger actions.

The advanced model includes a large number of options to alter tonal quality, inflections, mood, and other vocal attributes.

Making API Requests

To create an audio file from the API, you'll need a function to submit basic data to Yabber, with the only required field being your text. By default, the request will send to the basic engine with a version value of '2', while a value of '1' will request processing via the advanced model. A sample function is as follows:

<?php
/*
 BeliefMedia TTS API
 https://www.beliefmedia.com.au/text-apeech-api
*/

function bm_tts_request($text = '', $o = [], $header = [], $project = '') {

 /* API Key (a global BM_APIKEY constant is assumed) */
 $apikey = BM_APIKEY;
 if ($apikey == '') return false;

 /* Text required */
 if ($text == '') return false;

 /* Defaults */
 $body = [
  'text' => $text,
  'project' => $project,
  'voice' => 'apope_low',
  'speaker' => '',
  'noise_scale' => '0.667',
  'length_scale' => '1',
  'noise_w' => '1',
  'version' => '2',
  'webhook' => '',
 ];

 /* Merge Custom Options (empty values never replace a default) */
 if (is_array($o) && !empty($o)) $body = array_merge($body, array_filter($o));

 /* Endpoint */
 $endpoint = 'https://api.beliefmedia.com/tts/tts.php';

 /* Headers (custom headers are merged over the defaults) */
 $headers = [];
 $headers['apikey'] = $apikey;
 $headers['Accept'] = 'application/json';
 if (!empty($header)) $headers = array_merge($headers, $header);

 /* Body (an unknown version falls back to the default engine) */
 $pv = ['1', '2', '3'];
 if (!in_array((string) $body['version'], $pv, true)) $body['version'] = '2';
 $post_data = json_encode($body);

 /* Request */
 $return = Requests::post($endpoint, $headers, $post_data, ['timeout' => 30]);

 /* Failed */
 if (!is_object($return)) return false;

 /* 200 (edit) & 201 (create) */
 $permitted_status = array('200', '201');

 /* Return Response Array, or false on failure */
 return (in_array($return->status_code, $permitted_status)) ? json_decode($return->body, true) : false;
}

Project ID: A project ID is a 32-character HEX string that identifies your project in Yabber. If not applied, a new project is created. To update any audio, include the project ID in your request (this is returned to you in the webhook).
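
As a minimal sketch of how custom options interact with the defaults, the merge step can be isolated and inspected without making a request. The helper name bm_tts_build_body is hypothetical and isn't part of the API; the default values are copied from the sample function above. Because array_filter() is called without a callback, empty values (including '0') are dropped, so a blank option will never overwrite a default.

```php
<?php
/*
 Hypothetical helper: mirrors the "Defaults" and "Merge Custom Options"
 steps of the sample request function so the body can be inspected.
*/
function bm_tts_build_body($text, $o = [], $project = '') {

 /* Defaults (copied from the sample function) */
 $body = [
  'text' => $text,
  'project' => $project,
  'voice' => 'apope_low',
  'speaker' => '',
  'noise_scale' => '0.667',
  'length_scale' => '1',
  'noise_w' => '1',
  'version' => '2',
  'webhook' => '',
 ];

 /* array_filter() removes empty values, so blank options keep the default */
 if (is_array($o) && !empty($o)) $body = array_merge($body, array_filter($o));

 return $body;
}

/* Usage: request the advanced model (version 1) with a named voice */
$body = bm_tts_build_body('Hello from Yabber.', ['voice' => 'atkins', 'version' => '1', 'webhook' => '']);
echo json_encode($body, JSON_PRETTY_PRINT);
```

To update existing audio rather than create a new project, you would pass the project ID as the third argument.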

You will either see an error, or a success response:

Array
(
    [code] => 200
    [status] => 400
    [message] => Invalid Access
)

Array
(
    [status] => 200
    [code] => 200
    [data] => Array
        (
            [project] => 989935aee27 ... 453b02ab306
            [status] => Pending
            [webhook] => a97253a2ae4b ... c3e6989ffe
            [scheduled] => 1800
            [time] => 1681671457
        )

    [message] => Array
        (
            [0] => Success
            [1] => Scheduled for Creation
        )

)

The text is queued in Yabber and is processed in turn. When complete, you will either receive an email notification, webhook notification, or no notification. When the created response is sent it will include a link to the audio file and a full suite of transcriptions made for the purpose of applying accessibility subtitles (the Speech-to-Text module is introduced in the next section). The response sent via webhook will look similar to the following:

Array
(
    [audio_url] => https://api.beliefmedia.com/tts/projects/ .. 8ad6204cb798cf79f.wav
    [voice] => apope_low
    [version] => 2
    [noise_scale] => 0.667
    [length_scale] => 1
    [noise_w] => 1
    [project] => ca28e0dbe4605278ad6204cb798cf79f
    [filesize] => 991276
    [language] => English
    [translations] => Array
        (
            [json] => https://api.beliefmedia.com/stt/projects .. 5278ad6204cb798cf79f.json
            [srt] => https://api.beliefmedia.com/stt/projects/ .. 5278ad6204cb798cf79f.srt
            [tsv] => https://api.beliefmedia.com/stt/projects/ .. 5278ad6204cb798cf79f.tsv
            [txt] => https://api.beliefmedia.com/stt/projects/ .. 5278ad6204cb798cf79f.txt
            [vtt] => https://api.beliefmedia.com/stt/projects/ .. 5278ad6204cb798cf79f.vtt
        )

    [transcript] => ..
)
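
A receiving script might reduce the webhook payload to the few values it needs. The following is a sketch only: it assumes the webhook body arrives as JSON carrying the fields shown in the sample above, and the helper name bm_tts_webhook_summary is hypothetical. The example.com URLs are placeholders.

```php
<?php
/*
 Hypothetical webhook helper. Assumes the webhook body is JSON with the
 fields shown in the sample response (audio_url, project, translations).
*/
function bm_tts_webhook_summary($json) {

 $payload = json_decode($json, true);
 if (!is_array($payload) || empty($payload['audio_url'])) return false;

 /* Subtitle links are keyed by format (json, srt, tsv, txt, vtt) */
 $subtitles = (isset($payload['translations']) && is_array($payload['translations'])) ? $payload['translations'] : [];

 return [
  'project' => isset($payload['project']) ? $payload['project'] : '',
  'audio_url' => $payload['audio_url'],
  'srt' => isset($subtitles['srt']) ? $subtitles['srt'] : '',
  'vtt' => isset($subtitles['vtt']) ? $subtitles['vtt'] : '',
 ];
}

/* Usage with a stand-in payload */
$sample = json_encode([
 'audio_url' => 'https://example.com/audio.wav',
 'project' => 'ca28e0dbe4605278ad6204cb798cf79f',
 'translations' => ['srt' => 'https://example.com/audio.srt', 'vtt' => 'https://example.com/audio.vtt'],
]);
$summary = bm_tts_webhook_summary($sample);
print_r($summary);
```

In a live endpoint the raw body would typically be read with file_get_contents('php://input') before being passed to the helper.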

As introduced shortly, if the request is made from WordPress via shortcode or Elementor, the voice files and transcriptions are automatically sent back to your website for use locally.

  Providing Your Own Voice

To generate a voice modelled on your own, we require at least ten 10-second clips of your natural voice without excessive stuttering or vocal disfluencies (filler words such as 'umm' or 'aaaa'). Save the clips as WAV files in floating point format with a 22,050 Hz sample rate. Avoid clips with background music, noise or reverb, and don't include speeches as they generally contain echoes. In short, you'll want nice clean clips created specifically to train your own voice model. The ideal source of data is a reading of a blog post from your website, as this is where we'll initially use your model.
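
Before sending clips, it may be worth confirming that each file matches the requested format. The sketch below reads the WAV header and reports whether the audio is floating point at 22,050 Hz. It assumes a canonical WAV layout in which the "fmt " chunk immediately follows the RIFF header (files with extra leading chunks will be rejected), and the helper name bm_wav_check is hypothetical.

```php
<?php
/*
 Hypothetical pre-flight check for a voice sample. Assumes a canonical WAV
 layout where the "fmt " chunk immediately follows the RIFF header.
*/
function bm_wav_check($header) {

 /* The fields we need sit in the first 36 bytes of the file */
 if (strlen($header) < 36) return false;

 $f = unpack('a4riff/Vsize/a4wave/a4fmt/Vfmtsize/vformat/vchannels/Vrate/Vbyterate/valign/vbits', $header);
 if ($f['riff'] !== 'RIFF' || $f['wave'] !== 'WAVE' || $f['fmt'] !== 'fmt ') return false;

 return [
  'float' => ($f['format'] === 3),     /* format code 3 = IEEE floating point */
  'rate_ok' => ($f['rate'] === 22050), /* 22,050 Hz sample rate */
  'channels' => $f['channels'],
  'bits' => $f['bits'],
 ];
}

/* Usage with a synthetic header (mono, 32-bit float, 22,050 Hz) */
$header = pack('a4Va4a4VvvVVvv', 'RIFF', 36, 'WAVE', 'fmt ', 16, 3, 1, 22050, 88200, 4, 32);
$check = bm_wav_check($header);
print_r($check);
```

For a real clip, the header could be read with file_get_contents($path, false, null, 0, 44) and passed to the same function.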

For all managed clients, we'll create a voice for you at no charge. For others, there will likely be a fee involved.

Suggested Usage: Some time back we introduced the value of nested funnel videos that address a funnel participant by name. In the beginning, we'd have a broker record a few hundred introductions before stitching that audio into a video. A personalised voice presents an opportunity to create these personalised videos with ease and in bulk. It'll be a little while before the functionality is built into Yabber, but it's not far off.

  Speech to Text (STT) API

The STT API uses OpenAI's Whisper installed locally on our server. Whisper is a general-purpose speech recognition model. It is trained on a dataset of over 680,000 hours of diverse audio, and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification. It is highly accurate, and almost always word-perfect.

The STT API works in a similar manner to the TTS API. You submit an audio or video file to Yabber, and a basic success response is returned with an ETA for completion. The audio is queued and processed, and json, srt, tsv, txt, and vtt files are created, with the SRT file the most widely used. When processing is complete, an optional webhook will be sent to a defined URL with links to download the various files. A complete transcription array is also included in the JSON response.

If using Yabber, and if the module is assigned to you, transcriptions are available in the STT module. All text is archived with keywords so you're able to assess your word count (useful for SEO) but also so you're able to conduct full text searches - extremely useful for recorded telephone calls and compliance (if integrated with your CRM, and if access was assigned to your account, transcriptions will be sent to your CRM as a note).

The STT module will perform translations into a number of languages, and the languages are defined in the package sent to the API.

There are two primary methods used to transcribe audio: Yabber, and the API. Yabber is a simple point-and-click process, but the API requires a little technical knowledge. To use the API you'll need an API Key, and either cURL or a Requests library installed on your system. The following sample PHP function is the most basic implementation.

<?php
/*
 BeliefMedia STT API
 https://www.beliefmedia.com.au/text-apeech-api
*/

function bm_stt_request($audio = '', $header = [], $language = 'English', $translate = false, $webhook = '') {

 /* API Key (a global BM_APIKEY constant is assumed) */
 $apikey = BM_APIKEY;
 if ($apikey == '') return false;

 /* Audio file required (must exist and be readable) */
 if ($audio == '' || !is_readable($audio)) return false;

 /* Mime Type & Ext */
 $mime = mime_content_type($audio);
 $ext = pathinfo($audio, PATHINFO_EXTENSION);

 /* Prepare: read the file and base64 encode it for the JSON body */
 $contents = file_get_contents($audio);
 if ($contents === false) return false;
 $audio_encoded = base64_encode($contents);

 /* Endpoint */
 $endpoint = 'https://api.beliefmedia.com/stt/stt.php';

 /* Headers (custom headers are merged over the defaults) */
 $headers = [];
 $headers['apikey'] = $apikey;
 $headers['Accept'] = 'application/json';
 if (!empty($header)) $headers = array_merge($headers, $header);

 /* Requires Translation? */
 $translate = (($translate === false) ? 'no' : 'yes');

 /* Body */
 $body = ['audio' => $audio_encoded, 'mime' => $mime, 'translate' => $translate, 'webhook' => $webhook, 'language' => $language, 'extension' => $ext];
 $post_data = json_encode($body);

 /* Request */
 $return = Requests::post($endpoint, $headers, $post_data, ['timeout' => 30]);

 /* Failed */
 if (!is_object($return)) return false;

 /* 200 (edit) & 201 (create) */
 $permitted_status = array('200', '201');

 /* Return Response Array, or false on failure */
 return (in_array($return->status_code, $permitted_status)) ? json_decode($return->body, true) : false;
}

All fields are required or the API will throw an error. The audio extension is required for validation purposes and is discarded before processing (advanced usage will create another audio file on the basis of this argument). A webhook ID (sourced from Yabber) should be set in the body field if you wish to receive a webhook once processed. In the example above we have assumed a global BM_APIKEY was defined.

If an error is encountered you will receive a response with the message field noting the reason for the error:

Array
(
    [code] => 200
    [status] => 400
    [message] => Invalid Access
)

A successful response will return some basic information. Note that if you provide a webhook and that hook is invalid, the webhook field will be empty and a response will be added to the message array with a key of 3. The estimated time for completion in seconds is noted by way of the scheduled value.

Array
(
    [status] => 200
    [code] => 200
    [data] => Array
        (
            [project] => 433d4d1ee4397be0e612f0e05c6d63b3
            [status] => Pending
            [webhook] =>
            [scheduled] => 600
            [time] => 1681397879
        )

    [message] => Array
        (
            [0] => Success
            [1] => Scheduled for transcription
        )

)
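
Both response shapes shown above can be reduced to a small summary before your application decides what to do next. This is a sketch under the assumption that the response has already been decoded to an array (as the sample request function returns it); the helper name bm_api_summary is hypothetical.

```php
<?php
/*
 Hypothetical helper: reduces a decoded API response to the values most
 callers need. Field names follow the sample responses.
*/
function bm_api_summary($response) {

 /* The request function returns false on transport or HTTP failure */
 if (!is_array($response)) return ['ok' => false, 'error' => 'No response'];

 /* Error responses carry a non-200 status and a message */
 if (!isset($response['status']) || (int) $response['status'] !== 200) {
  $message = isset($response['message']) ? $response['message'] : 'Unknown error';
  return ['ok' => false, 'error' => is_array($message) ? implode('; ', $message) : $message];
 }

 $data = (isset($response['data']) && is_array($response['data'])) ? $response['data'] : [];

 return [
  'ok' => true,
  'project' => isset($data['project']) ? $data['project'] : '',
  'eta_seconds' => isset($data['scheduled']) ? (int) $data['scheduled'] : 0,
  'webhook_set' => !empty($data['webhook']),
  'messages' => isset($response['message']) ? (array) $response['message'] : [],
 ];
}

/* Usage with the two sample responses */
$error = bm_api_summary(['code' => 200, 'status' => 400, 'message' => 'Invalid Access']);
$success = bm_api_summary([
 'status' => 200,
 'code' => 200,
 'data' => ['project' => '433d4d1ee4397be0e612f0e05c6d63b3', 'status' => 'Pending', 'webhook' => '', 'scheduled' => 600, 'time' => 1681397879],
 'message' => ['Success', 'Scheduled for transcription'],
]);
print_r($error);
print_r($success);
```

An empty webhook_set value alongside a supplied webhook ID suggests the hook was rejected, in which case the messages array is the place to look for the reason.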

The webhook response includes the full transcription and links to individual transcription files. Note that the webhook system, and an example PHP script for processing the incoming data, is introduced in an article titled "Webhooks in Yabber". The response below is snipped for length, and certain fields were truncated with an ellipsis.

Array
(
    [audio_url] => https://api.beliefmedia.com/stt/stt/projects/ ... a235d16d99.wav
    [filesize] => 12973094
    [language] => English
    [project] => 3a34fcc193a70eadf01839a235d16d99
    [translations] => Array
        (
            [json] => https://api.beliefmedia.com/stt/projects/ ... a235d16d99.json
            [srt] => https://api.beliefmedia.com/stt/projects/ ... a235d16d99.srt
            [tsv] => https://api.beliefmedia.com/stt/projects/ ... a235d16d99.tsv
            [txt] => https://api.beliefmedia.com/stt/projects/ ... a235d16d99.txt
            [vtt] => https://api.beliefmedia.com/stt/projects/ ... a235d16d99.vtt
        )

    [transcript] => Array
        (
            [text] =>  In this video ... edit everything you have created.
            [segments] => Array
                (
                    [0] => Array
                        (
                            [id] => 0
                            [seek] => 0
                            [start] => 0
                            [end] => 6.96
                            [text] =>  In this video, we're going to introduce the My Quotes module in Yabba.
                            [tokens] => Array
                                (
                                    [0] => 50364
                                    [1] => 682
                                    [2] => 341
                                    [3] => 960
                                    [4] => 11
                                    [5] => 321
                                    [6] => 434
                                    [7] => 516
                                    [8] => 281
                                    [9] => 5366
                                    [10] => 264
                                    [11] => 1222
                                    [12] => 2326
                                    [13] => 17251
                                    [14] => 10088
                                    [15] => 294
                                    [16] => 398
                                    [17] => 455
                                    [18] => 4231
                                    [19] => 13
                                    [20] => 50712
                                )

                            [temperature] => 0
                            [avg_logprob] => -0.18704535744407
                            [compression_ratio] => 1.5313807531381
                            [no_speech_prob] => 0.12789882719517
                        )

  [ ... snipped ... ]

                    [31] => Array
                        (
                            [id] => 31
                            [seek] => 10860
                            [start] => 127.74
                            [end] => 130.76
                            [text] =>  From here, you'll be able to review and edit everything you have created.
                            [tokens] => Array
                                (
                                    [0] => 51321
                                    [1] => 3358
                                    [2] => 510
                                    [3] => 11
                                    [4] => 291
                                    [5] => 603
                                    [6] => 312
                                    [7] => 1075
                                    [8] => 281
                                    [9] => 3131
                                    [10] => 293
                                    [11] => 8129
                                    [12] => 1203
                                    [13] => 291
                                    [14] => 362
                                    [15] => 2942
                                    [16] => 13
                                    [17] => 51472
                                )

                            [temperature] => 0
                            [avg_logprob] => -0.11752236143072
                            [compression_ratio] => 1.7549019607843
                            [no_speech_prob] => 0.088475555181503
                        )

                )

            [language] => English
        )

)
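
The API already returns a VTT file, but the segments array is useful if you'd like to build your own cues (for a single excerpt, for example). The sketch below converts segments into WebVTT text using only the start, end, and text fields shown above; the function names are hypothetical.

```php
<?php
/* Format seconds as a WebVTT timestamp (HH:MM:SS.mmm) */
function bm_vtt_time($seconds) {
 $ms = (int) round($seconds * 1000);
 $h = intdiv($ms, 3600000);
 $m = intdiv($ms % 3600000, 60000);
 $s = intdiv($ms % 60000, 1000);
 return sprintf('%02d:%02d:%02d.%03d', $h, $m, $s, $ms % 1000);
}

/* Hypothetical helper: build WebVTT cues from transcript segments */
function bm_segments_to_vtt($segments) {
 $out = "WEBVTT\n\n";
 foreach ($segments as $segment) {
  $out .= bm_vtt_time($segment['start']) . ' --> ' . bm_vtt_time($segment['end']) . "\n";
  $out .= trim($segment['text']) . "\n\n";
 }
 return $out;
}

/* Usage with the first and last segments from the sample response */
$segments = [
 ['start' => 0, 'end' => 6.96, 'text' => " In this video, we're going to introduce the My Quotes module in Yabba."],
 ['start' => 127.74, 'end' => 130.76, 'text' => " From here, you'll be able to review and edit everything you have created."],
];
$vtt = bm_segments_to_vtt($segments);
echo $vtt;
```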

The response headers will include your API Key for validation purposes, and to ensure your endpoint is free from abuse.
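
A receiving script could use that key to reject requests that didn't originate from the API. The following sketch assumes the key arrives in a header named "apikey" (mirroring the header used in the request functions above); that header name is an assumption and should be confirmed before you rely on it. The function name is hypothetical.

```php
<?php
/*
 Hypothetical check for an incoming webhook. Assumes the key arrives in a
 header named "apikey"; confirm the actual header name before use.
*/
function bm_webhook_is_valid($headers, $expected_key) {

 if ($expected_key === '') return false;

 foreach ($headers as $name => $value) {
  /* Header names are case-insensitive; hash_equals() avoids timing leaks */
  if (strtolower($name) === 'apikey') return hash_equals($expected_key, (string) $value);
 }

 return false;
}

/* Usage with stand-in values */
$valid = bm_webhook_is_valid(['Accept' => 'application/json', 'ApiKey' => 'my-secret-key'], 'my-secret-key');
$invalid = bm_webhook_is_valid(['ApiKey' => 'someone-else'], 'my-secret-key');
var_dump($valid, $invalid);
```

On most servers the incoming headers are available from getallheaders(), and the expected key would be your own BM_APIKEY value.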

Zapier: We've avoided creating an application on Zapier, but it's likely we'll do so very soon. We're a short time away from releasing our own version of Zapier, focused almost entirely on the mortgage market. This integration was required primarily to support our free broker plugin, meaning that users of the free product will have additional automation options made available to them, including GPT AI integration.

  Yabber TTS and STT

STT and TTS functionality is integrated directly into Yabber. All requests are scheduled so it may take a few minutes before your data becomes available. A full record of all data is available in tables, and resulting renderings are available to download directly. Any request made via your website (and the associated data) is included in your creation log.

A feature we've used for years, and one that will be pushed as a global feature very shortly, is the automated transcriptions made via our VOIP systems. The transcriptions (and recordings) made from phone calls, conference calls, and notes, will be assigned to the relevant user for compliance purposes. All transcripts are fully searchable.

  TTS Elementor Plugin and Shortcode

All audio examples on this page were created with an in-post shortcode. The audio is created by wrapping the text you'd like created into audio with the [bm_tts] opening and closing tags [/bm_tts]. All the options in the request function are available as shortcode attributes (such as voice="arctic").
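
If you generate post content programmatically, the shortcode can be assembled from the same options array you would pass to the request function. This is an illustrative sketch only; bm_tts_shortcode is a hypothetical helper and not part of the plugin.

```php
<?php
/* Hypothetical helper: build a [bm_tts] shortcode string from an options array */
function bm_tts_shortcode($text, $atts = []) {

 $pairs = '';
 foreach ($atts as $name => $value) {
  /* Skip empty values; strip characters that would break the attribute or tag */
  if ($value === '' || $value === null) continue;
  $pairs .= ' ' . $name . '="' . str_replace(['"', '[', ']'], '', (string) $value) . '"';
 }

 return '[bm_tts' . $pairs . ']' . $text . '[/bm_tts]';
}

/* Usage */
$shortcode = bm_tts_shortcode('Welcome to our website.', ['voice' => 'arctic', 'version' => '2', 'speaker' => '']);
echo $shortcode;
/* Prints: [bm_tts voice="arctic" version="2"]Welcome to our website.[/bm_tts] */
```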

Processing Time: Once an article is published, the text 'Audio Pending' will be shown in place of audio. It can take a short time before the audio and STT subtitles are returned to your website. This is done to avoid processing multiple audio files for a single post.

The Elementor plugin is very much an early tool that should be used with the understanding that it may change shape before it makes its way into a Release Candidate. Until we receive the appropriate feedback from clients we aren't entirely sure what functionality will be required.

BM TTS Elementor

  Pictured: Shown is the Elementor block used to create audio from text. You simply drag the textbox onto your page, enter the text, select a voice, and save. It will be a short time before the audio is returned back to your website. The inset shows the advanced options and the voice library. The webhook option allows you to send audio created within your website environment to external resources.

As best we can tell, our Elementor TTS Block is the only example of its kind.

  Considerations

In Development

As stated, the STT and TTS modules are in active development. While the systems are reliable and generally work flawlessly, they shouldn't be relied upon for mission-critical applications. The exception to this is when assets are created in Yabber - it's the website and funnel components that'll be the focus of continued development.

Speech is an Integrated Feature

Over the next few months you'll see us drip-feed the functionality into other applications. Our VOIP integration, for example, has always transcribed call audio into a searchable archive for compliance purposes, although this functionality was integrated with Microsoft's commercial STT systems (so incurred additional charges), so we migrated that task to Yabber. In terms of broader AI applications, these transcribed conversations form the basis for contextual and GPT systems to gain an understanding of previous interactions.

As mentioned earlier, the creation of custom audio messages to be used with the SMS module (as an MMS Voicedrop) with your own voice is in trial and pending release. This feature will likely also encompass the email marketing module, and will also be added to the trigger system for standard workflow actions.

We're currently trialing various types of video functionality that will create an animated avatar to match created audio. It's a resource-hungry operation so it'll be a while before we make it available, but it is a feature we're working on. This will enable you to create full videos based on nothing other than text.

Subtitle Transcript Files

When audio is sent back to your website, a full suite of transcription files is also included. The HTML5 audio element doesn't support audio subtitles by default, but the video element may be applied as an alternative... although this method is a little messy. When a custom video and audio container is applied we'll update the code to support the accessibility feature.
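
The video-element workaround can be sketched as follows. The markup plays the audio file through a video element so a WebVTT track can be attached. The function name is hypothetical, the example.com URLs are placeholders, and because there is no picture the player will generally need a height set in CSS before cues are visible.

```php
<?php
/*
 Hypothetical markup helper: plays an audio file through the HTML5 video
 element so a WebVTT subtitle track can be attached.
*/
function bm_audio_with_subtitles($audio_url, $vtt_url, $lang = 'en') {

 /* Escape values before placing them in attributes */
 $audio = htmlspecialchars($audio_url, ENT_QUOTES);
 $vtt = htmlspecialchars($vtt_url, ENT_QUOTES);
 $lang = htmlspecialchars($lang, ENT_QUOTES);

 return '<video controls preload="metadata" src="' . $audio . '">'
  . '<track kind="subtitles" src="' . $vtt . '" srclang="' . $lang . '" label="Subtitles" default>'
  . '</video>';
}

/* Usage */
$html = bm_audio_with_subtitles('https://example.com/audio.wav', 'https://example.com/audio.vtt');
echo $html;
```

Browsers generally require the track file to be served from the same origin as the page (or with CORS headers), which is one reason the files are sent back to your website for local use.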

Improve Media Conversions

Some time back we provided an article on how we improved upon video conversions by adding the name of a funnel participant into videos, but this was accomplished by literally recording multiple introductions. The advanced API now provides a facility where this same functionality can be achieved (automatically) in your own voice before being applied to videos. It's areas like this where we might see some early efforts return improved funnel results.

Accessibility - Your Articles as Audio

As noted earlier, the tool might be best served as an accessibility feature first and foremost, and it's this functionality that'll likely be served by default sooner rather than later. It's quite possible that the tool will also be modified to return multiple languages. In all cases, consider reading out your articles in full, or perhaps include article audio littered with supporting chatter in a podcast-like format. All these options will be built into Yabber.

Custom Audio Container

The audio containers used on this page (at the time of publishing) are the default HTML5 containers, so they may look different based on the browser you're using. We have a custom audio container and Elementor widget planned for next month.

Ethical Considerations

AI introduces the obligation to maintain ethical standards. We're now exposed to a world where video and audio can easily be manipulated to emulate any person with simple training data. The audio below isn't overly good, but it does demonstrate how Tom Hanks' voice can be reproduced in a manner that may be deceiving. The only voices we'll manufacture on your behalf are those of your own team.

[bm_tts voice="tom" version="1"] .. [/bm_tts]

We routinely see this tech being used to create voices from people such as Elon Musk to support Binary and crypto trading scams. It's a massive problem.

Commercial Solutions

Our solutions aren't nearly as capable as many of the commercial systems made available, with Microsoft's STT a clear standout. Adobe provide a number of products, including Adobe Podcast - just one of many tools you should consider for more advanced projects.

  Conclusion

In terms of our AI ambitions, STT and TTS functionality are both very simple features, but the systems have provided our BeNet AI with a voice so it's able to interact with the real world.

With the introduction of open-sourced tools and accessible AI engines, technology reached an inflection point last year that gave birth to an AI-uprising... and the development of these systems is exponential. Unless AI-based tools are introduced to your operation, the 'other' guy will process enquiries faster, more efficiently, and with better customer outcomes.

Outside of our own client base, we've not seen a great deal of AI-based functionality introduced to the mortgage market, and this includes basic TTS and STT modules. We've not even seen accessibility features introduced to funnel assets, and this really does have to change. The industry is evolving in a way that's easier to identify when you're exposed to the technology, but most brokers are operating with a blissful unawareness of the mammoth changes that will potentially impact their existing business model.

Because of the continued development of this tool, you should follow our social channels for various updates and new features that might be made available.

  Featured Image: John Robinson Pierce (March 27, 1910 – April 2, 2002) was an American engineer and author, and the former director of research at AT&T Bell Telephone Laboratories. He did extensive work concerning radio communication, microwave technology, computer music, psychoacoustics, and science fiction. Born in Des Moines, Iowa in 1910, Pierce was the first to evaluate the various technical options in satellite communications and assess the financial prospects. In 1952, he published an article in Astounding Science Fiction in which he discussed the potential benefits of satellite communications. He coined the term "transistor", was instrumental in the development of Telstar 1, and wrote science fiction under the nom de plume J.J. Coupling. A few years later, Pierce greatly assisted in the creation of the first artificial communication satellite, ECHO. Pierce died from pneumonia complications on April 2, 2002 at the age of 92.

  Despite his accomplishments, he didn't see a value or future in speech recognition. In 1966, John Pierce chaired the “Automatic Language Processing Advisory Committee” (ALPAC) which produced a report to the National Academy of Sciences, 'Language and Machines: Computers in Translation and Linguistics', and in 1969, he wrote a letter to the Journal of the Acoustical Society of America (JASA), “Whither Speech Recognition”. The ALPAC stated that “The Committee cannot judge what the total annual expenditure for research and development toward improving translation should be. However, it should be spent hardheadedly toward important, realistic, and relatively short-range goals.” Pierce later said in his 1969 letter to JASA that “... a general phonetic typewriter is simply impossible unless the typewriter has an intelligence and a knowledge of language comparable to those of a native speaker of English.” “Most recognizers behave, not like scientists, but like mad inventors or untrustworthy engineers. The typical recognizer gets it into his head that he can solve ‘the problem.’ The basis for this is either individual inspiration (the ‘mad inventor’ source of knowledge) or acceptance of untested rules, schemes, or information (the untrustworthy engineer approach).” “The typical recognizer ... builds or programs an elaborate system that either does very little or flops in an obscure way. A lot of money and time are spent. No simple, clear, sure knowledge is gained. The work has been an experience, not an experiment.” “We are safe in asserting that speech recognition is attractive to money. The attraction is perhaps similar to the attraction of schemes for turning water into gasoline, extracting gold from the sea, curing cancer, or going to the moon. One doesn’t attract thoughtlessly given dollars by means of schemes for cutting the cost of soap by 10%. To sell suckers, one uses deceit and offers glamor.” “It is clear that glamor and any deceit in the field of speech recognition blind the takers of funds as much as they blind the givers of funds. Thus, we may pity workers whom we cannot respect.”

  The speech recognition we take for granted today was built on the back of millions of hours, an educated bureaucratic opposition, short-sightedness, and over 70 years of development. Tim O’Reilly said it best: "An invention has to make sense in the world it finishes in, not in the world it started."
