Hi people, how are you doing? It’s my first post here, and I’m coming with a question.
For the impatient ones, here’s a TL;DR: how would you go about processing and downconverting speech and music to a very low sample rate and a limited frequency range without losing more quality than necessary?
Now the longer version. First, for those of you who don’t know what a PBX is: it’s a configurable computer system that routes calls, plays menus like “Press 1 for billing, press 2 for support,…” or announcements like “The number you are trying to reach is not available at the moment”, and even plays music while you’re waiting for someone to pick up your call.
So, I’m setting up a PBX system, and I’ve recorded some messages and bought some music to use with it. The problem is that I have to convert all of it to 8 kHz, 16-bit, mono WAV files before the PBX can use it. These files will either be played directly to callers, or they might get transcoded to different codecs (like G.729, G.711, G.722, GSM-HR, GSM-FR,…) by telephony providers somewhere along the way.
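To make the target format concrete, here is a minimal Python sketch that writes and then inspects a file in that format using only the standard library. The helper names `write_pbx_wav` and `describe_wav` are made up for illustration, and I'm assuming that "16-bit" means plain signed little-endian PCM, which is what the `wave` module produces. It says nothing about how the audio should be processed before this step.

```python
import math
import os
import struct
import tempfile
import wave


def write_pbx_wav(path, samples, rate=8000):
    """Write floats in [-1.0, 1.0] as a mono, 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, s))  # hard-clip anything out of range
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(path, "wb") as w:
        w.setnchannels(1)      # mono
        w.setsampwidth(2)      # 2 bytes per sample = 16 bit
        w.setframerate(rate)   # 8000 samples per second
        w.writeframes(bytes(frames))


def describe_wav(path):
    """Return (channels, bits per sample, sample rate, frame count)."""
    with wave.open(path, "rb") as w:
        return (w.getnchannels(), w.getsampwidth() * 8,
                w.getframerate(), w.getnframes())


if __name__ == "__main__":
    # 0.1 s of a 1 kHz test tone, written to a temporary directory.
    tone = [0.5 * math.sin(2 * math.pi * 1000 * n / 8000) for n in range(800)]
    path = os.path.join(tempfile.mkdtemp(), "tone.wav")
    write_pbx_wav(path, tone)
    print(describe_wav(path))
```

Running `describe_wav` on a converted file is a quick way to confirm that an export really came out as 1 channel, 16 bits, 8000 Hz before uploading it to the PBX.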
Now the first limitation of WAV files with an 8 kHz sample rate is the frequency range: the upper limit is 4000 Hz, half the sample rate (the Nyquist frequency). Anything above that can’t be represented, so it has to be filtered out when downsampling, otherwise it folds back into the audible band as aliasing. Since this will be played over phones, I’m thinking about completely cutting anything below 120-200 Hz too.
In no particular order, I have to: EQ the audio, apply compression, apply loudness normalization, add “comfort noise” to silences,… But where in the whole process do you think I should apply these steps? What should I do before converting, and what should I do after? Basically, how can I make the audio sound its best in this really bad environment for audio?
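For anyone wondering why the 4000 Hz ceiling matters in practice, here is a small Python sketch of the arithmetic. The function names are made up for illustration. It shows that a 5000 Hz tone sampled at 8 kHz without filtering produces exactly the same samples as a 3000 Hz tone, so content above the limit doesn't simply vanish unless it is low-passed first.

```python
import math


def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


def alias_hz(tone_hz, sample_rate_hz):
    """Frequency a pure tone appears at if sampled without filtering."""
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


def sampled_cosine(tone_hz, sample_rate_hz, count):
    """First `count` samples of a cosine at tone_hz."""
    return [math.cos(2 * math.pi * tone_hz * n / sample_rate_hz)
            for n in range(count)]


if __name__ == "__main__":
    rate = 8000
    print(nyquist_hz(rate))      # 4000.0
    print(alias_hz(5000, rate))  # 3000

    # The two tones are indistinguishable once sampled at 8 kHz.
    high = sampled_cosine(5000, rate, 64)
    low = sampled_cosine(3000, rate, 64)
    print(all(abs(a - b) < 1e-9 for a, b in zip(high, low)))
```

Any decent resampler applies the low-pass filter as part of the conversion, so this is only meant to show what that filter is protecting against.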
Personally, I’m using Adobe Audition and have the requirements specified above, but I’d appreciate more general answers too, so that other people with a similar question can find this thread useful as well.