Dan Ribar January 21st, 2013
It’s exciting to see how easily our resellers and customers customize their phone systems to meet their business needs, and how willing they are to share their tips with others. In this guest blog post, Dan Ribar, CIO of 1st Guard Corporation, explains how to use a free version of Text to Speech in a Switchvox phone system. Thanks for sharing, Dan!
The natural progression for a new telephone system is typically to find its way to some form of dynamic IVR that looks to a database for its information. My office has had the Digium Switchvox system for a couple of years now, and we feel pretty good about writing basic IVRs. Now, it’s time for some cool text-to-speech (TTS).
Why do I need TTS? In a basic IVR, you can record messages that may never need to change, like “Press one to reach customer service.” But, you may want to have a bit more dynamic text coming back to the end user, like “Your account is in great standing. Your account balance is $123.45 and you have one pending claim….” We wanted that dynamic text and began looking for a way to handle it.
After a few weeks of research, I found several options:
1. Buy a TTS engine that runs on your server. (This will cost $1,000 and up.)
2. Use a free TTS web service. (I couldn’t find any voices that could be easily understood so this wasn’t a good option.)
3. Use a paid TTS web service. (Well, cost is always an issue, but more importantly I didn’t want to rely on the performance of a web-based service to feed a dynamic IVR.)
4. Use the TTS built into the Microsoft .NET Framework. (Sounds great, but it requires a physical server with a sound card. Not a good option, since my office is 100% virtual-server based.)
5. Write your own.
Now, option five seems a bit daunting. I mean, how do you write a TTS engine? The simple answer is, you don’t. What you can do is package some free components to make it all work. Here’s how it was implemented at my office:
The free component that does the actual speaking is eSpeak, an open-source speech synthesizer with a command-line program (the listener below calls its espeak.exe). I then used Visual Studio to write (or enhance, in this case) the HTTP listener for the TTS requests. I wanted a REST-like solution so it would integrate well into the IVR on the Switchvox.
Some code would be nice, too. This is what the ASP.NET listener looks like:
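Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
    If Not IsPostBack Then
        ' Text to speak, taken from the query string. Embedded double quotes
        ' are swapped out so they cannot break the eSpeak command line.
        Dim myWords As String = Request.QueryString("myWords")
        If String.IsNullOrEmpty(myWords) Then Return
        myWords = myWords.Replace("""", "'")
        ' Assumption: the declaration of FileName did not survive in this post,
        ' so the temp folder stands in for the real output location.
        Dim FileName As String = System.IO.Path.GetTempPath()
        FileName &= Trim(Now().Year.ToString) & Trim(Now().DayOfYear.ToString) & Trim(Now().Hour.ToString) & Trim(Now().Minute.ToString) & Trim(Now().Millisecond.ToString)
        FileName &= ".wav"
        Dim p As New Diagnostics.Process
        ' v = voice
        ' s = speed
        ' p = pitch
        ' a = amplitude or volume
        ' w = write the audio to a wav file
        Dim args As String = "-v en-us -s 150 -a 120 -w """ & FileName & """ """ & myWords & """ "
        p.StartInfo.Arguments = args
        p.StartInfo.FileName = "d:/data/espeak/command_line/espeak.exe"
        p.StartInfo.UseShellExecute = False
        p.StartInfo.CreateNoWindow = True
        p.StartInfo.RedirectStandardError = True
        ' Run eSpeak; read stderr before waiting so a full pipe cannot block it.
        p.Start()
        Dim ttsErrors As String = p.StandardError.ReadToEnd
        p.WaitForExit()
        ' Send the generated wav back to the caller.
        Response.ContentType = "audio/wav"
        Response.AddHeader("Content-Disposition", "inline; filename=test.wav")
        Response.WriteFile(FileName)
        Response.End()
    End If
End Sub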
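A side note for anyone who wants to try the eSpeak call on its own before wiring it into IIS: the same command line can be assembled from a small script. What follows is a rough Python sketch, not part of the original listener. The function name build_espeak_command and the uuid-based file name are assumptions made for illustration. The executable path and the -v, -s, -a and -w options are the ones used in the listener.

```python
import uuid
from pathlib import Path

# Path to the eSpeak executable as used in the listener; adjust for your server.
ESPEAK_EXE = "d:/data/espeak/command_line/espeak.exe"


def build_espeak_command(words, out_dir, voice="en-us", speed=150, amplitude=120):
    """Return (argument list, wav path) for a single eSpeak run.

    Hypothetical helper: passing the arguments as a list avoids the manual
    quoting that the single command-line string needs.
    """
    # A random name avoids collisions when two requests arrive close together
    # (an assumption; the listener builds its name from the clock instead).
    wav_path = Path(out_dir) / f"{uuid.uuid4().hex}.wav"
    args = [
        ESPEAK_EXE,
        "-v", voice,           # voice
        "-s", str(speed),      # speed
        "-a", str(amplitude),  # amplitude or volume
        "-w", str(wav_path),   # write the audio to this wav file
        words,                 # the text to speak
    ]
    return args, wav_path
```

On a machine with eSpeak installed, the list can be handed straight to subprocess.run(args, check=True), after which wav_path should hold the audio.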
That’s it! Pretty easy. Right? If your listener is called tts.aspx, you just call it with the text to speak in the myWords query-string parameter…
…and it returns a wav file.
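Because the text travels in the query string, it needs to be URL-encoded before the request is made. Here is a minimal Python sketch of a test client. Only tts.aspx and the myWords parameter come from the listener; the host name tts-server.example and the helper names tts_url and fetch_wav are hypothetical placeholders.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical address; substitute the host where tts.aspx is deployed.
LISTENER_URL = "http://tts-server.example/tts.aspx"


def tts_url(words, base=LISTENER_URL):
    """Return the listener URL with the text URL-encoded into myWords."""
    return f"{base}?{urlencode({'myWords': words})}"


def fetch_wav(words, dest="speech.wav"):
    """Request the audio from the listener and save it to dest."""
    with urlopen(tts_url(words)) as resp, open(dest, "wb") as out:
        out.write(resp.read())
    return dest
```

For example, tts_url("Your balance is $123.45") yields http://tts-server.example/tts.aspx?myWords=Your+balance+is+%24123.45, which is handy for checking the listener from a browser before pointing the phone system at it.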
It’s then simple to integrate it into Switchvox. In your IVR, add an action type of ‘Play Sound From URL’ and point it at the URL of your listener, including the myWords parameter.
That’s almost everything. All that’s left is to review the following:
Is this solution:
Pretty cool? YES
LAN based? YES
Works well with ASP.NET, Visual Studio and IIS? YES