(read the first ‘graph hearing the iconic “movie trailer voice”)
VO: In a world where the position of sounds in the stereo soundstage is artificially adjusted. In a world where every recording is an illusion designed to suspend disbelief…
Got a favorite stereo recording? Perhaps it’s that early-1970s vinyl release of a classic album by Santana where guitars fly back and forth between speakers. Or do you favor the wild ping-pong panning of vintage Jimi Hendrix? All of these were produced using artificially positioned sounds, typically from multi-track recordings. And though classical recordings are typically far more minimalist, even London Records’ “Phase-4” multitrack recordings of the early 1960s through the late 1970s were an admittedly abortive attempt at fully artificial “stereo” from 10 to 20 tracks of mono sounds.
The positioning of a sound in stereo is typically called “panning” and is done on a mixing desk with the “pan pot”. All a pan pot does is adjust the intensity of a monophonic sound as it is split between the two channels. Center would be represented by equal levels in both channels, full left or right would be no signal to the opposite channel, and anything in between can be dialed in as needed. And the result is a sound perfectly positioned between channels, anywhere the engineer desires.
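To make that concrete, here is a minimal sketch of an intensity-only pan pot in Python. The post doesn't say which pan law a given desk uses, so the constant-power (sine/cosine) law below is my assumption, and the names `pan_gains` and `pan_mono` are made up for illustration.

```python
import math

def pan_gains(position):
    """Return (left_gain, right_gain) for a pan position.

    position: -1.0 is full left, 0.0 is center, +1.0 is full right.
    Assumes a constant-power (sine/cosine) pan law; real consoles vary.
    """
    if not -1.0 <= position <= 1.0:
        raise ValueError("position must be between -1.0 and 1.0")
    # Map the -1..+1 knob travel onto a quarter circle, 0..pi/2.
    angle = (position + 1.0) * math.pi / 4.0
    return math.cos(angle), math.sin(angle)

def pan_mono(samples, position):
    """Split a mono sample list into (left, right) pairs by level only."""
    left_gain, right_gain = pan_gains(position)
    return [(s * left_gain, s * right_gain) for s in samples]
```

Note what is missing: no delay and no frequency dependence. The same sample goes to both channels at the same instant, just at different levels.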
Except that it’s not. All a pan pot does is control level, and in life, the position of a sound source is not that simple. For a given angle of incidence, the relative level at each eardrum is dependent on frequency, with a greater differential at high frequencies and less at low frequencies. Then there’s the difference in arrival time, which, slight as it may seem, has a very large impact on the perceived position of a source. The maximum time difference between our ears is around 640 µs, and it obviously changes with angle of incidence. So sensitive is our hearing mechanism to interaural delay that, using headphones, nearly complete panning can be achieved with delay alone.
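For a rough feel of how that delay changes with angle, here is a small sketch. It takes the 640 µs figure above as the maximum and scales it with the sine of the source angle, which is a simplified approximation I'm assuming for illustration, not a measured head model. The function names and the 48 kHz sample rate are likewise assumptions.

```python
import math

MAX_ITD_SECONDS = 640e-6  # approximate maximum quoted in the text

def interaural_delay(azimuth_degrees, max_itd=MAX_ITD_SECONDS):
    """Approximate interaural time difference in seconds.

    azimuth_degrees: 0 is straight ahead, 90 is directly to one side.
    Assumes a simple sine-law scaling of the maximum delay.
    """
    return max_itd * math.sin(math.radians(azimuth_degrees))

def delay_in_samples(azimuth_degrees, sample_rate=48000):
    """The same delay expressed in samples at the given sample rate."""
    return interaural_delay(azimuth_degrees) * sample_rate

if __name__ == "__main__":
    for angle in (0, 15, 30, 45, 60, 90):
        microseconds = interaural_delay(angle) * 1e6
        print(f"{angle:3d} degrees: {microseconds:6.1f} us, "
              f"{delay_in_samples(angle):5.2f} samples at 48 kHz")
```

Even at its maximum, the delay amounts to only about 31 samples at 48 kHz, which gives some idea of how fine our timing discrimination is.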
So the total picture involves time delay and frequency-dependent level differential, a combination that has traditionally been so difficult to do on an analog console that it just wasn’t done. Even today, with all the DSP anyone could dream of, every pan pot in the world is an intensity control only.
As you might imagine, if we extend our soundstage to 360 degrees, the real-world panning algorithm becomes just a bit complex. The surround panner joystick is therefore also a level control only. And, it turns out, when played on speakers, stereo works pretty well with level-only panning, which we’ll properly call “intensity stereo”, because sound position depends only on relative intensity. It works because we’re playing stereo on two widely spaced speakers, which do a pretty fair job of scrambling the interaural delay difference: each ear hears both speakers, with a relative time and intensity difference applied in the room that effectively swamps the subtle interaural differentials we might have heard had we been in the original sound space.
The panning question is different in surround, with 5.1 speakers as a starting minimum. Remember my earlier post’s reference to the Bell Labs stereo experiments in the 1930s? They determined the absolute minimum channel count for good stereo sound positioning was three: Left, Center, and Right. The more speakers you have, the more accurate sound source positioning can be. Bell Labs concluded that the ultimate array would be hundreds or thousands of speakers on a huge wire frame grid. So, as we increase our channel count, intensity-only panning is all that is necessary, as the source is then located in physical space rather than virtualized by faking an intensity or timing difference between two speakers. Ah, that means high channel count music recordings have a more realistic soundstage! Yes, that is in fact the truth.
Going the other way, headphone stereo is the most sensitive to both intensity and timing differences. With just a little timing difference between channels, equal-intensity signals to both ears can seem to pan nearly completely to left or right based on timing alone. This would partially explain why binaural recordings seem so real in headphones, but not so good on speakers. Binaural capture includes both frequency-dependent intensity differences and time delay differences. In fact, the intensity differences are relatively small from the mid-band and below.
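If you want to try the timing-only effect on headphones, here is a minimal sketch of a delay-only "panner". Both channels get the same signal at the same level, and one channel is simply delayed by a whole number of samples. The name `delay_pan` and its sign convention are assumptions for illustration, and a real implementation would likely use fractional delays.

```python
def delay_pan(samples, delay_samples):
    """Return (left, right) pairs with equal level but offset timing.

    samples: mono sample values.
    delay_samples: whole number of samples. Positive delays the right
    channel (the image should pull left); negative delays the left.
    """
    count = abs(int(delay_samples))
    padding = [0.0] * count
    early = list(samples) + padding   # undelayed copy, padded at the end
    late = padding + list(samples)    # delayed copy, padded at the start
    if delay_samples >= 0:
        left, right = early, late
    else:
        left, right = late, early
    return list(zip(left, right))
```

At 48 kHz, a delay of around 30 samples is in the neighborhood of the maximum natural difference between the ears, so that is a reasonable value to start experimenting with.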
What all of this means to us home theater or stereo-only enthusiasts is that if we can get our hands on real multi-channel recordings for our 5.1 channel systems, the effect can be very palpable, and much more defined and less ambiguous than simple stereo. For stereo, if we can reduce as many stray timing errors (reflections) as possible, our soundstage will contain at least some depth.
My earlier posts about the coming Dolby Atmos AVRs mentioned that the system will add height to the equation, and will do so by adding speakers that are either mounted physically high or reflected off the ceiling from lower (more practical) positions. And that will also be a very good thing.
When it comes to palpable sound positioning, I’ll very loosely paraphrase a cynical line Harrison Ford spoke in “Six Days, Seven Nights”: “If you want a sound there, you have to put a speaker there.” If you don’t, the chances of locking a source to a position are pretty much nil outside of a head-locked sweet spot.