Need a little help mastering sound files? In an interview with Digium’s Rod Montgomery, I get him to share some great tips and answers to common questions in this audio file primer, discussing the properties, aspects, and care and feeding of Asterisk prompts.
As a professional in the area of voice talent, I have many clients – especially those new to the process of working with Asterisk – who are a little stymied by the file format characteristics, the attributes of the sound files, and the proper techniques for converting and installing files so they sound the very best they can. To give some background, I’ve been voicing prompts for the Asterisk system since the very beginning and it’s an amazing community of developers who send me prompts to voice. I record them, and then they are freely distributed to the rest of the community. It’s a cumulative building of a massive database of prompts. This spirit of ordering the prompts, but then setting them free into the community — and benefiting from other people’s contributions — is what has made Asterisk the success it is.
That said, I have a confession. Like most other professional voice talent, I am not an audio expert. Sure, I’ve acquired a rather large working understanding of audio production in the 10 or so years I’ve been doing it full-time. But it’s been gained largely through trial and error, experimentation, and let’s be honest, through some mistakes. I have had great advocates in the many Digium staffers who have assisted in untangling various issues, but none have been as helpful as Rod Montgomery, Product Manager for Digium. He has been a calm, accessible source of information and has unsnarled many a mystery, particularly when it comes to helping clients of mine who are new to Asterisk. The fact that he’s also a major audio wonk on his own time is a blessing. Naturally, I knew he’d be the ultimate source of information for a blog entry which hopefully simplifies the requirements for Asterisk files, and identifies likely causes for some of the more common difficulties which people may encounter when trying to implement files and have them play optimally.
Here’s my interview with Rod, which sheds much light on the technical aspects of Asterisk files.
AS: Rod, as people hear them over a phone line, what file characteristics does an Asterisk file have?
RM: Typical plain-old telephone service transmits only a portion of the audio that is spoken from one telephone to another. The human voice generates sound between 80 Hz and 12,000 Hz (12 kHz), but a normal telephone transmits only 300-3500 Hz. Even though Asterisk supports high-quality audio, the sound Asterisk plays or records is limited by the phones in use.
AS: If phone lines are limited in the range of frequencies they can transmit, why am I asked to record Asterisk prompts in high-res (16 bit, 48,000 Hz, which is actually better than broadcast quality)?
RM: Your prompts are crystal clear, and allow customers more flexibility in their application of the custom prompts. As with digital images, down-converting a high-quality file can be useful, but up-converting a low-quality file yields poor results. Recording at such a high sampling rate (48 kHz) and bit depth (16 bit) provides the freedom to convert to other formats while retaining as much of the original quality as possible in the target format. It bears mentioning: Asterisk is smart enough to play the highest-quality prompt available for the type of phone in use.
AS: What happens if someone tries to install the high-quality sampling rate files directly into Asterisk? I understand that files don’t do well transcoding on the fly …
RM: While .wav format files can contain data at a variety of sampling rates and bit depths, Asterisk can read only two kinds of these files: Microsoft WAV format at 8000 Hz signed linear (with a lowercase .wav extension) or WAV-GSM, also called wav49, with an uppercase “WAV” extension. The GSM style is usually used with voicemail recordings or in email attachments. If you try to play a high sampling rate format Asterisk doesn’t understand, it will throw warnings which indicate that Asterisk cannot play 48K and thus cannot open the file, and furthermore that it cannot find a ulaw version of the file to play.
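For readers who like to check a file before dropping it into Asterisk, here is a minimal sketch using Python's standard `wave` module. It tests a lowercase .wav file against the 8000 Hz signed linear format Rod describes. The function name is hypothetical, and the 16-bit and mono checks are my assumptions about what "signed linear" means in practice, not something Rod spelled out.

```python
import wave

def check_asterisk_wav(path):
    """Return a list of problems found in a .wav file (empty list = looks OK).

    Checks the 8000 Hz rate named in the interview, plus two assumptions:
    16-bit samples and a single (mono) channel.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()      # bytes per sample
        channels = wav.getnchannels()
        if rate != 8000:
            problems.append(f"sample rate is {rate} Hz, expected 8000 Hz")
        if width != 2:                  # assumption: 16-bit signed linear
            problems.append(f"sample width is {width * 8} bit, expected 16 bit")
        if channels != 1:               # assumption: mono
            problems.append(f"{channels} channels, expected 1 (mono)")
    return problems
```

Calling `check_asterisk_wav("my-prompt.wav")` on a 48 kHz studio master (the file name is just an example) should report the sample rate mismatch, which is a hint that the file still needs converting.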
AS: Rod, if someone encounters a “scratchy” or “distorted” quality to the files which sound otherwise crystal clear on a standard computer media player, what are the likely causes?
RM: The most frequent mistake I see in audio production for telephony is creating files that are simply too loud. Some audio production tools will normalize files by default, saving them as “hot” as they can. This can cause files to “clip.” Try keeping the volume levels within desired ranges and keep levels consistent; also, try lowering the volume a bit before converting to Asterisk-compatible formats.
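To make "too hot" a bit more concrete, here is a rough sketch of how one might measure the peak level of 16-bit samples, count samples sitting at the limit, and pull the volume down by a few decibels. The function names are hypothetical, and any specific figure you pass in (for example -3 dB) is an illustrative choice of mine, not a recommendation from Rod.

```python
import math

FULL_SCALE = 32768  # magnitude of the most negative 16-bit signed sample

def peak_dbfs(samples):
    """Peak level in dB relative to full scale (0 dBFS is the ceiling)."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return float("-inf")  # pure silence has no measurable peak
    return 20 * math.log10(peak / FULL_SCALE)

def count_clipped(samples):
    """Count samples pinned at either end of the 16-bit range."""
    return sum(1 for s in samples if s >= 32767 or s <= -32768)

def apply_gain(samples, gain_db):
    """Scale samples by gain_db (negative values lower the volume)."""
    factor = 10 ** (gain_db / 20)
    # Clamp so that a positive gain can never wrap around the 16-bit range.
    return [max(-32768, min(32767, round(s * factor))) for s in samples]
```

A file whose peak sits at or very near 0 dBFS, or which has many samples pinned at the limit, is a good candidate for lowering before conversion. Note that lowering the gain only leaves headroom for later processing; it cannot repair audio that was already clipped when it was saved.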
AS: Can anyone convert Asterisk files?
RM: Yes! Digium offers a simple, web-based service to convert to .wav, GSM, signed linear, and G.729. You can access the audio converter tool online. Also, the SoX command-line tool is popular for converting files in bulk or automating file processing. It’s often available in your favorite Linux distribution.
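For anyone who wants to try the SoX route Rod mentions, here is a small sketch that builds and runs a SoX command from Python to turn a high-resolution recording into an 8000 Hz, 16-bit, mono .wav file. It assumes SoX is installed and on your path. The function names and file names are hypothetical, and the 16-bit mono target is my assumption about the signed linear format discussed above.

```python
import subprocess

def sox_command(src, dst, rate=8000):
    """Build a SoX command line that writes dst as mono 16-bit signed PCM."""
    return [
        "sox", src,
        "-r", str(rate),         # output sample rate in Hz
        "-c", "1",               # one channel (mono)
        "-b", "16",              # 16 bits per sample
        "-e", "signed-integer",  # signed linear encoding
        dst,
    ]

def convert(src, dst):
    """Run SoX and raise CalledProcessError if the conversion fails."""
    subprocess.run(sox_command(src, dst), check=True)
```

For bulk conversion, you could loop over a folder of masters and call `convert("masters/welcome.wav", "sounds/welcome.wav")` for each one, keeping the lowercase .wav extension on the output.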
There are many factors which can also contribute to inconsistency or problems encountered when trying to implement Asterisk files — many of which can be configuration issues on the end-user side. Asterisk is known for its user-friendly nature and its straight-out-of-the-box usability, but when issues do arise, Digium has vast resources available to help troubleshoot. The Support Center is always a good place to start.
Check back for more on this topic – Rod imparted so much information that we may drill deeper into the technicalities of Asterisk sounds. I’ll likely blog in a few weeks with a follow-up piece. In the meantime, Rod and I are *very* interested in your feedback about this article. Was it helpful? Are there any issues which were not dealt with, but that you’d like to see covered? Feel free to leave a comment!