What is the best way to preprocess audio for a PBX phone system?

Brane · January 13, 2018, 1:57pm

Hi people, how are you doing? It’s my first post here, and I’m coming with a question.

For the impatient ones, here’s a TL;DR: how would you go about processing and downconverting speech and music to a really low sample rate and limited frequency range without losing more quality than necessary in the process?

Now a longer version. First, for those of you who don’t know what a PBX is: it’s a configurable computer system that’s responsible for routing calls, playing menus like “Press 1 for billing, press 2 for support,…”, or “The number you are trying to reach is not available at the moment”, or even for playing music while you’re waiting for someone to pick up your call.

So, I’m setting up a PBX system and I’ve recorded some messages and bought some music to use with it. The problem is, I have to convert all of it to 8kHz, 16bit, mono WAV files to use it on my PBX. These files will be played either directly to users, or they might get transcoded to different codecs (like G.729, G.711, G.722, GSM-HR, GSM-FR,…) by telephony providers somewhere along the way.

Now the first limitation for WAV files with an 8kHz sample rate is the frequency range: the theoretical maximum is 4000Hz (half the sample rate), and anything above that just won’t be saved. Since this will be played over phones, I’m thinking about completely cutting anything below 120-200Hz too. In no particular order I have to: EQ the audio, apply compression, apply loudness normalization, add “comfort noise” to silences,… But where in the whole process do you think I should apply these steps? What should I do before converting, and what should I do after? Basically, how can I make the audio sound its best in this really bad environment for audio?
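To make the 4000Hz limit concrete, here is a minimal sketch of the downconversion step itself, using only the Python standard library. It assumes a 48kHz mono source held as floats in [-1, 1], so the conversion to 8kHz is a clean divide-by-6. The function names (`lowpass_taps`, `decimate`, `write_pbx_wav`), the 3600Hz cutoff and the 255-tap Hamming-windowed filter are illustrative choices, not anything the PBX requires. The point it shows is that content above 4000Hz has to be filtered out before samples are dropped, otherwise it folds back into the audible band as aliasing.

```python
import math
import struct
import wave

def lowpass_taps(cutoff_hz, rate_hz, num_taps=255):
    """Hamming-windowed sinc low-pass FIR, normalized to unity gain at DC."""
    fc = cutoff_hz / rate_hz          # cutoff as a fraction of the sample rate
    mid = (num_taps - 1) / 2          # center tap (num_taps should be odd)
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]

def decimate(samples, taps, factor):
    """Low-pass filter, keeping only every `factor`-th output sample."""
    half = len(taps) // 2
    out = []
    for i in range(0, len(samples), factor):
        acc = 0.0
        for k, t in enumerate(taps):
            j = i + k - half          # input index under tap k
            if 0 <= j < len(samples): # treat samples outside the clip as silence
                acc += t * samples[j]
        out.append(acc)
    return out

def write_pbx_wav(path, samples, rate_hz=8000):
    """Write floats in [-1, 1] as a 16-bit mono PCM WAV file."""
    pcm = [max(-32768, min(32767, int(round(s * 32767)))) for s in samples]
    with wave.open(path, "wb") as w:
        w.setnchannels(1)             # mono
        w.setsampwidth(2)             # 16 bit
        w.setframerate(rate_hz)
        w.writeframes(struct.pack("<%dh" % len(pcm), *pcm))
```

With a 48kHz clip in a list called `clip` (a hypothetical name), the call would be `write_pbx_wav("prompt.wav", decimate(clip, lowpass_taps(3600, 48000), 6))`. A 44.1kHz source does not divide evenly into 8kHz and would need a proper resampler instead.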

Personally, I’m using Adobe Audition, and have the requirements specified above, but I’d appreciate it if you answered in a more general sense too, so that other people with a similar question can find this thread useful as well.

Mikeesholt · January 14, 2018, 5:27pm

It sounds like this is being used for an automated attendant on an IP PBX.

Is this connected to the outside world via TDM telephone lines or via SIP trunks? If via SIP trunks, what codec is being used? Generally, if via SIP, then the codec should be the same irrespective of the carriers.

If the end users are going to be narrowband, then the frequency range should be 300Hz to 3.4kHz. If wideband, then 50Hz to 7kHz. Note that most smartphones are wideband.

My suggestion would be to work to wideband, but check that when the ‘Telephone filter’ is applied (or when you manually filter out everything above 3.4kHz and below 300Hz), it still sounds okay.
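For anyone who wants to try that check outside an editor, here is a rough sketch of a 300Hz to 3.4kHz band-pass in plain Python. It assumes a 16kHz wideband master. The band-pass is built as the difference of two Hamming-windowed sinc low-pass filters, which is just one simple design and not necessarily what any particular editor’s telephone preset does. The names `bandpass_taps`, `apply_fir` and `gain_at` are made up for this example.

```python
import cmath
import math

def _lowpass(cutoff_hz, rate_hz, num_taps):
    """Hamming-windowed sinc low-pass, normalized to unity gain at DC."""
    fc = cutoff_hz / rate_hz
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        taps.append(ideal * (0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))))
    total = sum(taps)
    return [t / total for t in taps]

def bandpass_taps(low_hz, high_hz, rate_hz, num_taps=401):
    """Band-pass FIR: low-pass at high_hz minus low-pass at low_hz."""
    upper = _lowpass(high_hz, rate_hz, num_taps)
    lower = _lowpass(low_hz, rate_hz, num_taps)
    return [u - l for u, l in zip(upper, lower)]

def apply_fir(samples, taps):
    """Filter a list of samples; output has the same length as the input."""
    half = len(taps) // 2
    out = []
    for i in range(len(samples)):
        acc = 0.0
        for k, t in enumerate(taps):
            j = i + k - half
            if 0 <= j < len(samples):  # silence outside the clip
                acc += t * samples[j]
        out.append(acc)
    return out

def gain_at(taps, freq_hz, rate_hz):
    """Magnitude of the filter's frequency response at freq_hz (1.0 = unchanged)."""
    w = 2 * math.pi * freq_hz / rate_hz
    return abs(sum(t * cmath.exp(-1j * w * n) for n, t in enumerate(taps)))
```

Calling `gain_at(bandpass_taps(300, 3400, 16000), f, 16000)` for a few frequencies `f` shows what the narrowband listener loses: values near 1.0 inside the band and near 0 for bass below 300Hz or detail above 3.4kHz. Running the wideband master through `apply_fir` gives an audible preview of the same thing.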

Brane · January 14, 2018, 6:03pm

Hey @Mikeesholt , thanks for your suggestions! You’re right, it’s for an automated attendant.

The PBX is connected to the outside world via a SIP trunk. The trunk itself is using G.711 A-Law (specified by the provider), but the PBX (3CX) accepts only 8kHz, 16bit, mono WAV files. As I understand it, the first “bottleneck” here is the PBX, and because of that, the audio delivered will be narrowband in any case, right?

Mikeesholt · January 14, 2018, 9:04pm

G.711 is narrowband, therefore go for the 300Hz to 3.4kHz range. However, it might be best to do as I suggest and create a wideband version and then trim it down; then, should they upgrade, you wouldn’t have to start from scratch.

If they want a better version, you can always charge as if you started from scratch!

Mike

Brane · January 17, 2018, 4:41pm

Thanks Mike, that’s good business advice.