Hi-Fi Speaker Design Issues

by Bob Richards 2014-22

Intro

I've been researching, designing and building speaker systems since about 1967 as a hobbyist (when I was 12), and worked professionally in the Audio Electronics field on several occasions in the 1980's and 90's (see Who's Bob section of website or my Resume). I don't know it all (nobody does), but I have a lot of practical experience and may be better at explaining things than many others in some cases. I view every speaker system I do as an experiment. I'll often throw in something unusual just to see how it works (open baffle, various cabinet shapes, different crossover network types, the POLK Acoustic Holographic Generator technique, etc.). Theory is very useful, but there's no substitute for practical learning experience, especially when it comes to understanding acoustics. It's not real hard to design and build an "average" speaker system that most people will be satisfied enough with. Designing a "high end" speaker system can be very complicated, and that's part of what I like about it as a hobby. It challenges and inspires me.

General

The weakest part of any playback system is almost certainly the speakers and how they interact with the acoustics of the room they will be in.

When I set out to build a set of speakers, I first evaluate the room they will be put in, and decide tentatively where I'll put the speakers and how much space I'm willing to give up for them. Should they be bookshelf speakers? Should they be towers? Taller is usually better but how tall am I willing to put up with? Should they be open baffle speakers (which need to be placed at least 3 feet away from any walls in order to work right)? Should they be floor to ceiling Line Arrays? Should I hang them from the ceiling? Am I going to want more than two speakers for surround sound? Should they have separate woofer cabinets to handle the bass? Where would those go? Is there room for many poweramps so I can use active crossovers? Is efficiency (dB per watt) an issue? Do I want to optimize them for a sweet spot with the best possible stereo imaging, or do I want the best sound regardless of where in the room anyone sits? How loud do I want them to go? It's wise to get real clear on these kinds of questions before starting down the path. A very great speaker can sound pretty bad in the wrong room.

One of the best arrangements I've ever heard was Bruce Penney's satellite speakers hanging from the ceiling (100HZ on up), several feet away from any walls, aimed downward at the couch, which was roughly 12 feet out from the sofa (which had a tapestry on the wall behind it to absorb some of that acoustic energy). These "satellites" were tri-amp'd (2 five inch KEF drivers D'Appolito style with a 1.5 inch phenolic upper-mid dome and a 3/4 inch Celestion dome super tweeter between them). Below 100HZ was handled by two large woofer cabinets (JBL 15 inch woofers with JBL 15 inch passive radiators). The system as a whole was 4 way, with 4 pole active crossovers and active EQ. To this day I'm not sure I've heard a better speaker system, and that was the early 1980s.

Other great speakers were the DBX Soundfield tower designed by Mark Davis. Although these had to be placed at least 3 feet from any walls, they sounded great anywhere in the room.

And then there's my brother's McIntosh ML-1 speakers from 1972 that have always impressed me (a 4 way bookshelf speaker designed by Roger Russell at McIntosh).

One surprisingly good sounding speaker I made (1976 - sorry no photo), was two 8 inch full range speakers and a Marantz 2 inch cone tweeter in a box made of 1 inch thick oak. One 8 inch ran active, and the other had a shorted voice coil and a fairly heavy washer glued to the front of the voice coil. Some article I had read suggested a certain size washer for a similar project, so I went with that tentatively, and it seemed to work great. This second 8 inch driver was acting as an electronically damped (due to the shorted voice coil) passive radiator. The shorted voice coil would reduce ringing, so tighter bass, but probably raise the resonant frequency. The one pole passive crossover network divided the spectrum at about 2kHZ. The cabinet internal volume was about right for just one 8 inch woofer, and was packed with fiberglass insulation for damping of internal resonances. I gave them to my dentist friend as payment for work he did for me. Years later I went to see him and was astounded at how good they sounded, hanging from the ceiling in his dental office.

Other speakers that impressed me include some Ampex bookshelf speakers made in the 1960's that had just a 6 inch woofer in a closed box and a 2 inch cone tweeter, which were actually placed on a bookshelf surrounded by shelves of books (which is excellent acoustically). Of course, I was only 13 at the time, and getting drunk with a babysitter.

On one occasion I heard four Radioshack 3 way bookshelf speakers sound amazingly pleasant (placed on the floor - which rarely sounds best). They were wired "cross-coupled"; the left front was wired to the right rear and vice versa.

Another memorable experience was two 6 x 9 inch coaxial ceiling speakers in a large hair salon (spaced rather far apart - maybe 30ft.), which gave an amazingly enveloping stereo effect over a rather large area.

To a Beginner

My pages on this and other audio projects are pretty deep in theory, fancy techniques, etc. I don't mean to make it seem like a beginner can't have fun doing a simple version of a speaker system. Without a calibrated mic and pink noise generator, a speaker builder is left somewhat guessing at what to do for a crossover network. The math can get you close to accurate in many cases. Without a method of determining the optimal speaker cabinet internal volume, again you take a guess based on any info you can scrounge up. That was how I did it for the first several years of building speaker systems in my teenage years. All I had was a volt/ohm meter, a soldering iron, and a bunch of capacitors, coils and speaker drivers I had scrounged somehow. Not much more. Those systems weren't great, but they were good enough for teenagers listening to rock and blues music.

The first thing for a beginner to know is that any tweeter needs to have a capacitor in series with it, or it will probably blow. Tweeters can't handle bass frequencies at all. The first time I took apart a friend's stereo to find out what a crossover is, it was just a 4uF non-polar capacitor wired in series with the tweeter. The woofer frequency response rolled off mechanically at the high end (so no coil). A better crossover would have a coil in series with the woofer, so its frequency response would roll off quicker and more predictably, and merge better with the tweeter.
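As a rough sketch of the math behind that series capacitor and coil: a one pole (6dB/octave) filter has its corner at f = 1 / (2 * pi * R * C) for the capacitor and f = R / (2 * pi * L) for the coil. The 8 ohm impedance below is an assumption (the stereo described above may have used something else), and treating a driver as a pure resistance is only an approximation, since real driver impedance varies with frequency.

```python
import math

def series_capacitor_uf(crossover_hz, tweeter_ohms):
    """Capacitor (in microfarads) for a one pole high-pass on the tweeter."""
    return 1e6 / (2 * math.pi * crossover_hz * tweeter_ohms)

def series_inductor_mh(crossover_hz, woofer_ohms):
    """Coil (in millihenries) for a one pole low-pass on the woofer."""
    return 1e3 * woofer_ohms / (2 * math.pi * crossover_hz)

def corner_frequency_hz(capacitor_uf, tweeter_ohms):
    """Corner frequency set by an existing series capacitor."""
    return 1e6 / (2 * math.pi * capacitor_uf * tweeter_ohms)

# Assumed example: a 4uF capacitor with a nominal 8 ohm tweeter.
print(round(corner_frequency_hz(4.0, 8.0)))      # about 4974 HZ
# Assumed example: parts for a 2kHZ one pole crossover with 8 ohm drivers.
print(round(series_capacitor_uf(2000, 8.0), 2))  # about 9.95 uF
print(round(series_inductor_mh(2000, 8.0), 3))   # about 0.637 mH
```

The numbers only get you in the neighborhood. Trying a few nearby capacitor values by ear or with a measurement mic is still worthwhile.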

The second thing a newbie should know is that frequency response is arguably the most important thing about a speaker. You want something pretty close to equal amplitude over all frequencies from about 30HZ - 18kHZ ideally. This is referred to as a flat frequency response. In the real world, most listening room acoustics will mess this up a lot, so don't feel like the driver's response needs to be perfect. Plus or minus 3 dB seems to be what most people want to have as a minimum of quality. Distortion and Xmax (how far the cone is designed to move in and out) are two other specs that are very important.

One of my first decent speaker systems (1967 - I was 12) was just a 12 inch woofer in a fairly large cabinet the size of which we just guessed at, and a "dual cone" 6 inch full range driver in a sub-enclosure acting as midrange and tweeter. I filled the cabinet and sub-enclosure with regular fiberglass to minimize internal resonance issues after noticing that all good speakers had this. I tried a few different capacitors for the mid/tweeter in the vicinity of 4uF - 10uF to see what sounded best, and that was it. I might have had an "L-Pad" attenuator on the 6 inch to match the level better too. Eventually I found out the math formula for calculating the size of the tweeter capacitor, and I was thrilled. Also coil math, higher order crossover circuit math, etc. Years later when I started working for Tektronix, I got in with the EE's who had built triamp speaker systems and learned all about that. Now I'm 65 years old and still doing it; building speaker systems for fun. So don't let my picky/nerdy opinions scare you away from this potentially fun hobby.

Shortcomings of the Reproduction Process

Where to start... Most musicians I've ever known have very little idea how to get the best sound. They depend on professional sound engineers to make it good, or just copy what other musicians are having good luck with. They rarely complain about someone processing/enhancing their sound to make it more enjoyable. Many "purists" don't like to tamper with what the recording/mixing engineer created. Some of them don't believe in having tone controls... That's ridiculous. The purpose of music is to feel good. Good tone controls can make a big improvement in the enjoyability in most cases.

Directional microphones have what's called "proximity effect" which means the frequency response (amplitude vs. frequency) varies with distance from the mic. When there is more than one mic recording anything, there will be comb filter effects due to crosstalk when the signals are combined electrically in the mixing process. This is why recording studios often have acoustic absorptive walls between each musician in a band to minimize this crosstalk, and use directional mics (cardioid, hyper cardioid, etc.). Sound sources are usually recorded in mono, and then the recording engineer uses fancy electronics to create a sense of stereo soundfield, reverb, etc. Orchestral recordings are usually recorded in some variation of "true stereo", but also have the crosstalk/comb filter effects to deal with, among other issues. Most mixing boards use amplitude panning for stereo effect image placement, and ignore the fact that the ear-brain mechanism senses image location by timing comparisons rather than just amplitude comparisons below about 1kHZ. They ignore this fact largely because the playback system will typically have "inter-aural crosstalk", which blurs our ability to localize an audio image below 1kHZ based on timing cues anyway. There's the Bob Carver method of inter-aural cancellation (electronic), and the Polk Audio version (acoustic), each of which I've researched and built and like, but both require the listener to have their head exactly centered, and approximately the correct distance out from the speakers, AND minimal room acoustic issues especially laterally, for that to work well. These methods of inter-aural crosstalk cancellation also cause the center image to be a bit weak or "phasey" sounding.
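To illustrate what amplitude panning does (and doesn't do), here is a minimal sketch of one common pan law, the constant power sine/cosine law. Which law any particular mixing board uses is not stated above, so treat this choice as an assumption. Note that the only thing that changes between left and right is level; no timing difference is ever introduced.

```python
import math

def pan_gains(position):
    """Constant power pan law (an assumed, commonly used law).

    position: -1.0 is hard left, 0.0 is center, +1.0 is hard right.
    Returns (left_gain, right_gain) as linear amplitude multipliers.
    """
    if not -1.0 <= position <= 1.0:
        raise ValueError("position must be between -1.0 and +1.0")
    angle = (position + 1.0) * math.pi / 4.0  # 0 .. pi/2
    return math.cos(angle), math.sin(angle)

left, right = pan_gains(0.0)
# At center each channel is about 3 dB down, so total power stays constant.
print(round(20 * math.log10(left), 2), round(20 * math.log10(right), 2))
```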

David Griesinger, formerly of Lexicon, may be one of the most adept experts at recording music in stereo or any variations of surround sound. He has many papers available for reading on the web.

One of the most damaging parts of the reproduction process is how the acoustics of the recording studio and/or playback listening room affect the signal getting to your ear, or to the microphone in the first place. It's often damaging to the point of making me wonder why I go to so much trouble to make my speakers as perfect as I can. Despite these setbacks, it amazes me how good a job our species has done with this creative art.

Venue Acoustics

Many recording studios do what they can to have no acoustics. They build what approximates an "anechoic chamber" so there will be minimal room reflections which cause comb filter effects, and imaging variables. Other studios use a room that is acoustically "dead" at one end, and reverberant or "live" at the other end of the room for effect. Reverb and many other digital processor sound effects have become very sophisticated and are used most of the time these days, to "sweeten" the otherwise "dead" sound. There are those who try for maximum "fidelity" (flat response and no "sweetening" processing), and then there's the other 99% who design for most pleasurable audio experience; whatever it takes. Maximum technical "fidelity" is a good default or starting point in either case, but rarely yields a truly great bottom line experience by itself. More on that later.

In a typical listening room, there are many reflections, each causing a comb filter effect when combined with the direct signal path at your ear, and/or with each other. Those are different than room resonance ringing effects. Both are often substantial issues. At the higher frequencies above about 500HZ, there are so many reflections creating comb filter effects, that the nulls (cancellations) created by one reflection combining with the direct path at your ear are largely filled in by other reflection paths, which will have cancellations at other random frequencies. Because of the frequency resolution of the ear-brain mechanism, these comb filter effects above roughly 500HZ largely average out and can liven the sound a bit (short term reverb). Side wall reflections add a sense of spaciousness to the sound, but this spaciousness is always the same for every recording, so it could get tedious to listen to over time in some cases. It can also blur or dominate any stereo imaging cues embedded in the recording, if there are any. Proximity of the speaker to the room boundaries (walls or large hard furniture), will have a large effect on the frequency response of the speaker, from the point of view of the listening position, especially in the lower frequency range (generally below about 700HZ). Many speakers are designed to depend on this room boundary reinforcement of low frequencies, while other speakers are designed to be used several feet from any walls. The latter will need to use what's called "baffle step correction" EQ, to compensate for the lack of reinforcement by the front of the speaker baffle, below somewhere typically around 700HZ (depending on several variables).
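The comb filter effect from a single reflection can be sketched with simple arithmetic. If the reflected path is longer than the direct path by some distance, the two signals cancel wherever that extra distance is an odd number of half wavelengths. The 343 m/s speed of sound (roughly room temperature) and the 1 meter example path difference are assumptions for illustration, and the sketch ignores the fact that a real reflection is weaker than the direct sound, which makes the nulls shallower.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed, roughly 20 degrees C

def comb_null_frequencies(extra_path_m, max_hz=20000.0):
    """Frequencies (HZ) where one reflection cancels the direct sound.

    extra_path_m: how much longer the reflected path is than the direct path.
    """
    if extra_path_m <= 0:
        raise ValueError("extra_path_m must be positive")
    delay_s = extra_path_m / SPEED_OF_SOUND_M_S
    nulls = []
    k = 0
    while True:
        f = (2 * k + 1) / (2 * delay_s)  # odd multiples of 1 / (2 * delay)
        if f > max_hz:
            break
        nulls.append(f)
        k += 1
    return nulls

# Assumed example: a reflection that travels 1 meter further than the direct sound.
print([round(f, 1) for f in comb_null_frequencies(1.0, max_hz=1000.0)])
```

For that 1 meter case the nulls land at about 171.5, 514.5 and 857.5 HZ, evenly spaced in HZ. On a logarithmic (musical) scale that means they are far apart in the bass and crowded together in the treble.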

Published speaker driver frequency response graphs are done in the "professional" world with the driver mounted on a large wall, facing into an anechoic chamber (acoustically dead - no reflections at all). That arrangement is called an "Infinite Baffle", because the energy coming out of the back of the speaker driver is never able to get into the anechoic chamber where the calibrated mic is. Although there's a lot to be said for this method of standardization, this is not necessarily very similar to how these same drivers will be used in the real world. Acoustic energy diffracts, or bends, at the lower frequencies, but not so much at the higher frequencies. This means that the wall in that anechoic testing chamber will reinforce the lower frequencies significantly more than the front surface of a typical speaker cabinet (due to its size), in most cases.
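For a ballpark idea of where a cabinet front stops reinforcing the lower frequencies, a rule of thumb widely quoted in DIY speaker circles puts the baffle step center at roughly 115 divided by the baffle width in meters. That rule is an assumption brought in here for illustration, not something derived in this article, and the real transition is gradual and depends on baffle shape, edge treatment and driver placement.

```python
INCH_TO_M = 0.0254

def baffle_step_hz(baffle_width_m):
    """Approximate baffle step frequency (assumed rule of thumb: 115 / width in meters)."""
    if baffle_width_m <= 0:
        raise ValueError("baffle_width_m must be positive")
    return 115.0 / baffle_width_m

# Hypothetical baffle widths, in inches.
for width_in in (6.5, 9.0, 12.0):
    print(width_in, "inch baffle:", round(baffle_step_hz(width_in * INCH_TO_M)), "HZ")
```

A narrow 6.5 inch baffle comes out just under 700HZ, and wider baffles push the step lower.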

Most rooms that have hard parallel walls and corners will ring at various frequencies. Ringing has both a start up time, and a decay time. They don't react instantaneously to stimulus. A quick transient may not last long enough to cause significant obvious audible room ringing, but a sustained note in music is likely to. If your measurement system uses transient type test signals such as pink noise or a single short pulse (which theoretically has equal energy at all frequencies), you aren't likely to see the effects of room ringing. If you instead use tone bursts in a Gaussian envelope (or any variation of that), you'll be able to see the effects of room and driver resonance (ringing). Real world music is usually made up of both transients and sustained tones, so it's best to look at how a room reacts to each when considering tradeoffs.
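A tone burst in a Gaussian envelope is easy to generate. The sketch below builds one as a list of samples using only the standard library. The number of cycles, the sample rate and the envelope width (set so the envelope is down to about 1% at each end) are all assumed choices, not a standard.

```python
import math

def gaussian_tone_burst(freq_hz, cycles=8, sample_rate=48000):
    """Return samples of a sine burst shaped by a Gaussian envelope.

    The burst lasts `cycles` periods of freq_hz. The envelope width is an
    assumed choice: sigma is one sixth of the burst length, so the envelope
    is roughly 1% of its peak at the first and last sample.
    """
    n = int(round(cycles / freq_hz * sample_rate))
    center = (n - 1) / 2.0
    sigma = n / 6.0
    samples = []
    for i in range(n):
        envelope = math.exp(-0.5 * ((i - center) / sigma) ** 2)
        tone = math.sin(2 * math.pi * freq_hz * i / sample_rate)
        samples.append(envelope * tone)
    return samples

# Assumed example: an 8 cycle burst at 100HZ.
burst = gaussian_tone_burst(100.0)
print(len(burst), "samples, peak", round(max(abs(s) for s in burst), 3))
```

Played through the speaker and recorded at the listening position, any energy that lingers after the burst has ended is ringing from the room or the driver.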

Either extreme, (too many room reflections and/or ringing vs. a "dead" room approximating an anechoic chamber) is considered less desirable. I agree with Linkwitz where he basically says that the acoustics of an average living room with all the typical furnishings is actually about as good as it gets. It's impossible to create a "perfect" reproduction system for many reasons. It's always about choosing the best set of tradeoffs for a given situation. On the bottom line, a room with good acoustics is a room that is pleasant to listen in. It has many reflections, but they all average out pretty much over most frequencies, and liven up the sound a bit, but not too much. The major damage done by typical listening room acoustics is usually below about 400HZ. Down there it's more likely that the comb filter effect nulls are perceptually further apart in frequency, due to the size of the wavelengths, and how they interact with the room boundary dimensions, which can make bass frequencies sound "boomy" (favoring certain frequencies way more than others by the time the acoustic energy gets to the listener position).
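The lowest room resonances between a pair of parallel walls (axial modes) can be estimated from the room dimension alone: they fall at multiples of the speed of sound divided by twice the distance between the walls. The 343 m/s figure and the 5 meter example room length are assumptions, and real rooms also have modes involving more than two surfaces that this sketch ignores.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed, roughly 20 degrees C

def axial_modes(dimension_m, max_hz=400.0):
    """Axial mode frequencies (HZ) between two parallel walls dimension_m apart."""
    if dimension_m <= 0:
        raise ValueError("dimension_m must be positive")
    modes = []
    n = 1
    while True:
        f = n * SPEED_OF_SOUND_M_S / (2.0 * dimension_m)
        if f > max_hz:
            break
        modes.append(f)
        n += 1
    return modes

# Hypothetical room that is 5 meters long.
print([round(f, 1) for f in axial_modes(5.0, max_hz=150.0)])
```

For that length the modes sit about 34HZ apart, which is a wide musical interval in the bass, so individual notes can be favored or starved. Higher up the same spacing becomes a small fraction of an octave and the modes blur together.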

20HZ - 100HZ

This is the frequency range of bass, where most of the warmth comes from. Personally I love low bass. Its presence will have a significant "psycho-acoustic" effect on how you perceive the upper frequencies. Virtually all percussive instruments have significant energy in this range, not just a "bass" instrument. It's been said that the sense of depth in the soundstage is highly dependent on the bass response going low enough. "Good bass" in my opinion, is a speaker that is acoustically somewhat flat down to at least 40HZ. "Great bass" goes down to 30HZ flat within a few dB. Below that is arguably even better, but many recordings that actually have energy that low are often EQ'd by the recording engineer for the "typical speaker" or "studio monitor", which means they've exaggerated the very low end in order to compensate for the rolloff of the typical commercial speaker system. This can be too much with a speaker that measures acoustically flat down to 20HZ (my experience). Because of this, I find that 30HZ is actually a better place to extend to. It takes significantly more amplifier power and cone surface area to function well down to 20HZ rather than just 30HZ. Plus, the lower frequencies go through walls and bother neighbors more if that's an issue. Movie theaters use concrete block walls to try to isolate bass leakage between rooms in a multiplex theater building, because nothing else can do the job. It's about mass at bass frequencies. All the conventional acoustically absorptive materials do little or nothing at bass frequencies. They start to work in the midrange frequencies and are usually great at the high frequencies.
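One way to put a number on why 20HZ is so much harder than 30HZ: for an idealized sealed woofer that is small compared to the wavelength, the air volume the cone must move for a given loudness goes up with the inverse square of frequency. That idealization is an assumption (it ignores the room, the port if there is one, and driver nonlinearity), but it gives the right flavor.

```python
import math

def excursion_ratio(f_low_hz, f_high_hz):
    """How much more cone travel is needed at f_low_hz than at f_high_hz
    for the same sound level (idealized small sealed source)."""
    if f_low_hz <= 0 or f_high_hz <= 0:
        raise ValueError("frequencies must be positive")
    return (f_high_hz / f_low_hz) ** 2

ratio = excursion_ratio(20.0, 30.0)
print(ratio)                               # 2.25 times the cone travel
print(round(20 * math.log10(ratio), 1))    # about 7 dB more displacement
```

So the same driver needs a bit more than twice the excursion at 20HZ, or roughly twice the cone area at the same excursion.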

Due largely to the way the ear works (see the Fletcher-Munson curve), most of the energy in a typical piece of music will be in the bass frequencies, which usually means large amounts of speaker cone movement. Because of this, it's arguable that the best place for a crossover frequency would be at around 80 - 600 HZ depending on a bunch of variables. Most bi-amp or tri-amp systems that I've seen choose a crossover frequency in this range. A single driver doing the whole frequency range will have significant audible FM (frequency modulation) distortion of the higher frequencies when a bass note or drum hits. Plus, that bass note may cause the poweramp to clip (distort), and it will trash the entire frequency range for that moment. With bi-amping, only the woofer would distort, and the rest of the frequency range would very likely stay clean. I prefer a 4 pole active crossover to separate the woofer from the rest of the drivers, and also use active EQ to make a woofer in a sealed box be acoustically flat down to 30HZ at the listening chair, with a fairly steep drop off below that (so the driver doesn't get damaged as easily and you don't waste a bunch of the amplifier power on sound you don't need or want). This method gives me the tightest and cleanest bass, due to the consistent mechanical damping of the woofer cone over frequency, by the air inside the closed box baffle.

M&K made an active woofer that I used to have (Volkswoofer 3B) that had both a 4 pole 125HZ active rolloff, and active EQ making the woofer somewhat acoustically flat down to 20HZ. It also had an adjustment that would roll off everything above 50HZ at a one pole rate (6dB/octave). I thought it sounded great in the rooms I was living in at the time (several - I moved a lot in those days), and the optional 50HZ pre-rolloff adjustment came in handy when dealing with typical room acoustics problems. My room had a big resonance at 100HZ. That tweak reduced that problem down to nothing significant. Much less "boomy" sounding. Great idea.
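To see what "4 pole" and "one pole" mean in dB, here is a small sketch that evaluates the response of an ideal Butterworth low-pass filter. The Butterworth alignment is an assumption for illustration. The actual filter shapes inside the M&K unit are not stated above.

```python
import math

def lowpass_db(freq_hz, corner_hz, poles):
    """Response in dB of an ideal Butterworth low-pass (assumed alignment)."""
    if freq_hz <= 0 or corner_hz <= 0 or poles < 1:
        raise ValueError("frequencies must be positive and poles >= 1")
    return -10.0 * math.log10(1.0 + (freq_hz / corner_hz) ** (2 * poles))

# A 4 pole 125HZ rolloff, one octave above the corner:
print(round(lowpass_db(250.0, 125.0, 4), 1))   # about -24.1 dB
# A one pole 50HZ rolloff, evaluated at a 100HZ room resonance:
print(round(lowpass_db(100.0, 50.0, 1), 1))    # about -7.0 dB
```

Each pole contributes 6dB per octave well above the corner, so 4 poles approach 24dB per octave. Under these assumptions the one pole 50HZ setting takes about 7 dB off a 100HZ peak, which is consistent with it taming a resonance there.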

Enclosure types:

Folded Transmission Line baffles are way too big and heavy, so not practical IMO. Vented or ported boxes have good physical damping of the woofer cone at the frequency they are tuned for (usually around 50HZ), but the physical damping gets weak on either side of that "tuned" frequency, so using active EQ to make the woofer acoustically flat at the listening position down to 30HZ can cause vented woofers to "bottom out" (coil gets mashed and/or deformed), thereby permanently damaging the driver, so active EQ is not generally recommended for vented box designs. Venting a box (or using "passive radiators") is sort of the cheaper way to get lower bass, and many vented box designs sound pretty good. Personally I feel that being able to use active EQ is a huge plus, so I'm a big fan of closed boxes and active EQ for the woofers. The bass will be flat to a very low frequency, and have good damping all the way down (tight bass), consistent over frequency. Plus, the closed box can be substantially smaller than a typical vented box will need to be, for a given speaker driver size. If you can get the more detailed speaker specs for the woofer driver you are using, you can use free software such as WinISD to determine the optimal internal volume for a given driver.
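For a closed box, the kind of calculation such software performs can be sketched with the standard Thiele/Small closed box relations: the box raises both the driver's resonance and its Q by the factor sqrt(Vas / Vb + 1). The driver parameters below are hypothetical, the target Qtc of 0.707 is one common choice rather than a recommendation from this article, and the sketch ignores stuffing and box losses.

```python
import math

def sealed_box(fs_hz, qts, vas_liters, target_qtc=0.707):
    """Return (box_volume_liters, box_resonance_hz) for a closed box.

    Uses Qtc = Qts * sqrt(Vas / Vb + 1) and fc = fs * sqrt(Vas / Vb + 1).
    """
    if target_qtc <= qts:
        raise ValueError("target Qtc must be higher than the driver's Qts")
    ratio = target_qtc / qts
    vb_liters = vas_liters / (ratio ** 2 - 1.0)
    fc_hz = fs_hz * ratio
    return vb_liters, fc_hz

# Hypothetical woofer: fs = 30HZ, Qts = 0.40, Vas = 80 liters.
vb, fc = sealed_box(30.0, 0.40, 80.0)
print(round(vb, 1), "liters, box resonance", round(fc, 1), "HZ")
```

For that made-up driver the box works out to a little under 38 liters with a resonance near 53HZ. Active EQ would then be needed to extend flat response below that.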

It's arguable that getting bass to be relatively flat at the listener's chair is one of the hardest things to achieve, due to room acoustics, thereby typically causing bass and lower midrange frequencies (typically up to about 400HZ) to sound "boomy". Both room resonance and comb filter effects are typically particularly bad in this frequency range. But getting this challenge conquered can be one of the most satisfying improvements one can make on any speaker system situation. Some people have resorted to having 4 different woofer cabinets placed around their listening room, so some of these problems average out. The floor to ceiling vertical line array crowd claims they have no problems with this kind of thing (They've effectively gotten rid of the floor and ceiling acoustic effects, but still have the side walls to contend with). Getting the woofers further from any room boundaries (especially corners) may help significantly. Every room is different, so a lot of experimenting goes on when the best sound is desired. Active EQ can be helpful for this, but generally only fixes things for one location in the room, and can actually make it worse in other locations in the room, so should be used sparingly. Active EQ should only ever be used to attenuate peaks in the frequency response, and not used to try and pump up cancellations due to comb filter effects in the accumulated frequency response at the listening chair.

100HZ - 1kHZ

This range of frequencies is where the fundamental frequency energies of human voice and most musical instruments are. The exact character of the sounds is more determined by the higher frequencies. Telephones were designed to do well from about 300HZ to 3kHZ, for maximum voice intelligibility. When designing a baffle for drivers in this frequency range (and higher), it's wise to consider the size of the half-wavelengths, relative to the baffle's internal dimensions, because of the possibility of significant internal cabinet resonances, which will re-emerge thru the speaker diaphragms. Cabinet internal absorptive materials are likely to make a huge difference in this. For example, the half wavelength of 1kHZ is roughly 6 inches, which isn't far from an internal dimension of some speaker cabinets. 500HZ half-wavelength is about a foot.
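The half-wavelength figures above come straight from the speed of sound divided by frequency. A quick sketch, assuming 343 m/s (roughly room temperature):

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed, roughly 20 degrees C
INCH_TO_M = 0.0254

def half_wavelength_inches(freq_hz):
    """Half of the wavelength in air at freq_hz, in inches."""
    if freq_hz <= 0:
        raise ValueError("freq_hz must be positive")
    return SPEED_OF_SOUND_M_S / freq_hz / 2.0 / INCH_TO_M

for f in (250.0, 500.0, 1000.0, 2000.0):
    print(int(f), "HZ:", round(half_wavelength_inches(f), 1), "inches")
```

Comparing those numbers with the inside height, width and depth of a planned cabinet shows which frequencies are most likely to set up a resonance between opposite walls.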

David Griesinger, formerly of Lexicon, is one of my favorite audio engineers. In one of his papers he talks about how typical room acoustics will actually effectively reduce the sense of separation as you go down in frequency, below about 800HZ. He also notes that this is the opposite of what you want, for a truly "enveloping" and "immersing" sense of soundstage. Many researchers have noted that we perceive stereo image location primarily by timing comparisons below about 1kHZ, and by amplitude comparisons above about 1kHZ. This opens the can of worms known as "Inter-aural crosstalk". Up to the frequency where the half wavelength becomes shorter than the distance between our ears (around 1kHZ), we perceive stereo image location by timing or phase comparisons left to right (and vice versa). Above that frequency, the ear-brain mechanism has no way of knowing which period of waveform it's comparing, so it switches over to amplitude comparisons from about 1kHZ on up (because of the distance between the ears, compared with the distance of the half wavelength).
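The "around 1kHZ" figure can be checked with simple arithmetic. A sound arriving from directly to one side reaches the far ear later by the ear spacing divided by the speed of sound, and once that delay amounts to half a cycle (180 degrees) the phase comparison becomes ambiguous. The 0.17 meter ear spacing and the straight-line path (ignoring the sound wrapping around the head) are assumptions.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed, roughly 20 degrees C
EAR_SPACING_M = 0.17        # assumed typical value; varies from person to person

def interaural_phase_deg(freq_hz, ear_spacing_m=EAR_SPACING_M):
    """Phase difference between the ears for a sound from directly to one side."""
    return 360.0 * freq_hz * ear_spacing_m / SPEED_OF_SOUND_M_S

def phase_ambiguity_hz(ear_spacing_m=EAR_SPACING_M):
    """Frequency at which the ear spacing equals half a wavelength."""
    return SPEED_OF_SOUND_M_S / (2.0 * ear_spacing_m)

print(round(phase_ambiguity_hz()))          # about 1009 HZ
print(round(interaural_phase_deg(500.0)))   # about 89 degrees, still unambiguous
```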

In real life, any sound will be perceived by both ears. The further ear will hear the sound with a slight delay (in the general vicinity of 125uS) relative to the ear with the more direct acoustic path, and with a relative frequency response rolloff due to the head being in the way. When this happens only once (during the recording process, for example), the ear-brain mechanism knows how to perceive the location of the sound at all frequencies (which it does when using headphones). If it happens a second time by interaural crosstalk during playback (which it does with speakers), but with a second set of different time delays, the ear-brain mechanism then tries to decode two different sets of data, and gets confused, so you don't get a clear sense of image location below about 1kHZ. Room reflections will often somewhat recreate a sense of 3-D effect below about 1kHZ, but it's not what's in the recording, if that matters.

Some aspects of our perception of sound and image location are a continuously updated learned thing. By moving our head slightly, we get more clear on the location of certain sounds, because our brain knows how to analyze the differential.

There's a page on David Griesinger's website where he talks about the benefit of using a circuit that reduces the L+R signal as you go down in frequency, from about 600HZ, which compensates for what the typical room acoustics and/or inter-aural crosstalk does. It gives the listener a more "enveloping" sound experience that is arguably more accurate. That might be a useful function for a preamp to have.

Anyway, in the frequency range of roughly 500HZ - 10kHZ, the shape of and damping of, the inside of the baffle are of utmost importance. A sphere shape is best, and a cube is the worst. Damping materials like fiberglass, or foam rubber work great at the higher frequencies but do little below about 500HZ. I glue 3/8 inch thick felt to all internal cabinet walls, then a layer of foam rubber, then pieces of heavy carpet or thick felt in between layers of the fluffy stuff, to add acoustically sluggish mass. Mass is what works better at dissipating acoustic energy in the lower midrange frequencies. Any energy that doesn't get absorbed and dissipated by these damping materials will re-emerge through the diaphragms, and/or cabinet walls if they are flimsy.

Regarding speaker driver size, the smaller the diameter of the diaphragm, the better the off-axis frequency response.

1kHZ - 6kHZ

This frequency range is where the ear is most sensitive, 2-4kHZ usually being the peak of sensitivity for most people (see Fletcher-Munson curves or equivalent). Due to the size of the wavelengths, image location is perceived primarily by amplitude comparisons left to right, rather than timing or phase comparisons. The ear is also most sensitive to distortions and abrupt differential phase changes in this frequency range. Since most playback systems will have the interaural crosstalk issue (headphones avoid that), this is the range that typically allows us to perceive the stereo effect well. This suggests that the frequency response of the upper mid drivers should be very well matched to each other in order to get the best stereo effect (sense of soundstage depth and width - separation of individual instruments, etc.). This includes any effects of the listening room acoustics. If one speaker is in a corner, and the other isn't; that's not helpful.

Most hard cone woofers and lower midrange drivers (aluminum, Kevlar, etc.) have substantial cone resonances in this range of frequencies. That may create an elevated sense of presence or immediacy, but could get tedious to listen to over time. Because a resonance causes ringing, such a resonance could be more "coloring" than a calibrated mic and pink noise test might indicate. Because of the size of the wavelengths in this range (roughly 2 inches to 1 foot), the internal dimensions and shape of the speaker driver enclosure will have a substantial effect on the sound. A sphere with a driver diaphragm at the edge is arguably the best enclosure shape situation, but has a substantial baffle step issue since the outside of the enclosure would not reinforce the lower frequencies (1kHZ and below) much at all (not like the large flat wall in an anechoic chamber where the published frequency response graphs are typically made). Since acoustic energy is much more directional in the higher frequencies, the outside front of the baffle is not a huge issue here, although it will have some effect. You could use waveguide technology to reduce room acoustic effects to get better matching and thereby imaging accuracy at the "sweet spot", but at the expense of off axis response. Better sweet spot, but arguably less accurate sound to everybody else in the room. Wider dispersion in this frequency range will typically give you a more spacious sound due to side wall room reflections, and a more even sound for listeners sitting off to one side a bit, but any generated sense of ambience from room reflections is fake (not in the recording), if that matters, and can distract from any embedded imaging info in the recording.

The smaller the diameter of the diaphragm, the better the off-axis frequency response.

6kHZ - 20kHZ

Research suggests that above about 6kHZ, we perceive image location on the up and down or Y axis almost as much as the X axis (left to right). This is where the wavelengths are so short (roughly two inches or less) that the shape of the outer ear plays a part in decoding image location. This is the frequency range that will get somewhat absorbed by many objects in the typical living/listening room, and may therefore sound confined relative to the sounds in the other frequency ranges, especially off axis. This may be one of the main reasons why some high-end speaker systems have a rear firing tweeter, or a tweeter array, or even a tweeter facing straight up. A tweeter array runs the risk of creating significant audible comb filter effects as you move your head around, depending on how it's implemented. The rear firing tweeter requires that the speaker be out from the wall some distance so it can be effective (not necessarily the full 3 feet as mentioned above). Siegfried Linkwitz pointed out that the most important part of a tweeter's frequency response is actually the lower frequencies, because most adults barely hear anything above about 15kHZ, and are particularly sensitive in the upper midrange (1kHZ - 5kHZ), where many tweeters operate. I agree with him on that point. In the past, the bigger challenge was to design a good tweeter that could actually perform well up to the limits of human hearing (20kHZ). Nowadays many tweeters are pretty flat to 20kHZ, and it's the lower frequencies of the tweeter that are arguably the weaker area (due to distortion). Many dome tweeters are said to have "coil tilt" when hit with a transient, causing excessive I.M. distortion (Lynn Olson talks about this), which, being at frequencies where the ear is most sensitive, is something to worry about. Most metal dome tweeters have pretty substantial resonance issues just above the audio frequency range, at roughly 24-27kHZ. It's been said that this causes discomfort to humans over time, and is highly likely to cause discomfort for dogs and cats who typically hear well up into the 60kHZ+ range. So I avoid those. Personally I think ribbon tweeters sound the best above about 5kHZ, but they're usually relatively directional on one axis or another, which is less desirable in that frequency range unless you live alone and always sit in the sweet spot. There are some ribbons that are relatively small on both axes and are top contenders (the Fountek 1.5 inch, which I have and love, for example).
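As a rough check on how short the wavelengths get in this range, wavelength is simply the speed of sound divided by frequency. The sketch below assumes 343 m/s for the speed of sound (room-temperature air); the function name is just for illustration.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: dry air at about 20 degrees C
METERS_PER_INCH = 0.0254

def wavelength_inches(freq_hz):
    """Return the wavelength in inches of a sound wave at freq_hz."""
    return SPEED_OF_SOUND_M_S / freq_hz / METERS_PER_INCH

for f in (6000, 10000, 20000):
    print(f"{f} Hz: {wavelength_inches(f):.2f} inches")
```

Under that assumption the wavelength is a bit over two inches at 6kHZ and well under an inch at 20kHZ, which is the same scale as the folds of the outer ear.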

Off Axis Response

Very few tweeters perform real well below about 2kHZ with a 1 or 2 pole crossover. In a two way system with an 8 inch woofer, a one inch dome tweeter and a crossover at about 2kHZ, there will be a significant dispersion change right at the frequencies where the ear is most sensitive. At 2kHZ an 8 inch woofer has usually become significantly directional in its dispersion (a rolloff in the off-axis frequency response), and the 1 inch dome will have a significantly wider dispersion at that frequency. Room acoustics reflections can make this into a bigger deal. The response of the acoustic energy bouncing off walls before reaching your ear will have this anomaly. It's not usually a huge problem, but it can be, since that's the frequency range where the ear is most sensitive. This should be considered in any speaker design process.
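One common way to put a number on this dispersion mismatch is the "ka" value of an idealized rigid piston: k is 2 x pi x frequency / speed of sound, and a is the effective radius of the diaphragm. As a loose rule of thumb, ka well below 1 means wide dispersion and ka of 2 or more means noticeable beaming. The sketch below is only that idealized model. The 6.5 inch effective cone diameter for a nominal 8 inch woofer and the 343 m/s speed of sound are my assumptions, not measured values.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: room-temperature air
METERS_PER_INCH = 0.0254

def ka(freq_hz, effective_diameter_in):
    """Return the dimensionless ka value for a rigid piston of the given diameter."""
    radius_m = effective_diameter_in * METERS_PER_INCH / 2.0
    return 2.0 * math.pi * freq_hz * radius_m / SPEED_OF_SOUND_M_S

crossover_hz = 2000.0
print(f"woofer  ka = {ka(crossover_hz, 6.5):.2f}")  # assumed 6.5 inch effective cone
print(f"tweeter ka = {ka(crossover_hz, 1.0):.2f}")  # 1 inch dome
```

With these assumptions the woofer sits around ka = 3 at 2kHZ while the dome is under 0.5, so the two drivers have very different dispersion right at the handoff.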

At the higher frequencies, the large cone driver (woofer or midrange driver) will usually emit most of its energy from the center of the cone (at or near the voice coil), rather than the whole cone, and thereby not be quite as directional as theory might suggest based on the rim to rim dimensions. Different cone shapes can help reduce this problem as well. Many 2 way systems would be better off with a one pole crossover (6dB/octave), so the off axis frequency response change would be more gradual over frequency. This may cause an increase of distortion in the tweeter, so the crossover frequency may be better off being a little higher. It's a tradeoff situation.

Crossover Networks

Active crossovers (which are placed ahead of the power amps for each driver) are very accurate, predictable, and very low in distortion. Passive crossovers are the most common type in the commercial speaker market, but they are inefficient and difficult to calibrate, and the coils can be fairly expensive. Passive crossovers are very sensitive to source and load impedance. A low feedback tube poweramp can have an output impedance of 1 ohm or more, which can throw off the calibration of a passive crossover significantly in some cases. Speaker drivers come with a "nominal" impedance rating, but the actual impedance usually varies quite a bit over frequency. This must be taken into account when calculating the size of the reactive components in a passive crossover network.

Higher order (12 or 18 dB/octave rolloff rate) passive crossover networks can be a nightmare to get right, due to the changing impedances of the drivers over frequency, part tolerances and the interaction between the crossover coils, capacitors and any resistors. You change the value of one part, and it can throw off how other parts in the circuit do their job. Only a handful of people in the whole world are crazy enough to try to do a 4 pole passively. Speaker companies claiming to have a 2, 3 or 4 pole passive crossover are often using the mechanical acoustic rolloff of a given driver as one or two of the effective poles, which can work OK depending on the drivers, but using a driver right up to where it rolls off can bring some nasty distortion issues to the front (slewing and/or ringing) in some cases that I've seen when looking at shaped tone burst test signals.

My approach for active crossovers and/or EQ circuits is to take an existing circuit from anywhere (Linkwitz website for example), stick it in a circuit analysis program (such as SPICE), scale the key part values to get the exact frequency and amplitude characteristics I want, verify the changes with the SPICE program, build it, verify it on the bench, hook it up to the system, set relative levels using a calibrated mic and pink noise, and I'm probably done enough. The higher slope rates mean drivers don't need to operate nearly as much in areas of frequency where they don't work well, and phase related issues at the crossover frequencies (lobing effects and comb filter effects) are damaging over a much smaller percentage of the frequency range (half an octave instead of maybe two octaves).

My approach for designing and calibrating a passive crossover network is as follows:

Select drivers that have a relatively flat frequency response over the range you want to use them. Think about power handling capability and off axis response. This should be helpful in choosing where in frequency you want to roll off one driver and bring in the next.

Measure the frequency response and impedance curve of each driver if possible. Published specs and graphs can be fairly accurate, but I consider them to be approximate. Impedance usually varies significantly with frequency, and is likely to throw off any calculations for crossover part values substantially if you just use the "nominal" impedance rating. A five inch woofer I had was nominally rated at 8 ohms, but measured 15 ohms at 5kHZ (where it started mechanically rolling off), where I had planned to cross it over to a tweeter. My calculations would have been WAY off if I had not measured the impedance at 5kHZ. Published graphs are better than nothing, but are always questionable.

Assuming you have reliable numbers to work with, you calculate the part values starting with efficiency matching resistors since they will directly affect the calculations of the reactive components (L's and C's). Then calculate the reactive part values based on the combination of the efficiency matching resistors and the impedance of the drivers at the crossover frequencies. The reactive parts (L and C) will roll off the frequency response based on what they see, looking out into the circuit. Then build the passive crossovers keeping coils well separated from each other and far away from speaker magnets. Put it all together and measure the result with the calibrated mic and pink noise (or if you want to get fancy use tone bursts in Gaussian envelopes).
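To put a number on how far off the five inch woofer example would have been, here is a minimal sketch of a one pole lowpass designed for the nominal 8 ohms but actually loaded by 15 ohms. It treats the driver as a pure resistance at the crossover frequency, which is a simplification (a real driver is partly reactive), and the function names are just for illustration.

```python
import math

def inductor_for_lowpass(load_ohms, freq_hz):
    """Series coil value in henries for a one pole lowpass into load_ohms."""
    return load_ohms / (2.0 * math.pi * freq_hz)

def actual_corner_hz(load_ohms, inductance_h):
    """Corner frequency a given coil really produces into load_ohms."""
    return load_ohms / (2.0 * math.pi * inductance_h)

coil_h = inductor_for_lowpass(8.0, 5000.0)   # designed from the nominal rating
corner = actual_corner_hz(15.0, coil_h)      # what the measured 15 ohms would give
print(f"coil = {coil_h * 1e3:.3f} mH, actual corner = {corner:.0f} Hz")
```

Under that simplification the woofer would not start rolling off until somewhere around 9kHZ instead of 5kHZ, almost a full octave high.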

You could be done there, in a perfect world. You could be lucky and it may sound fine.

I highly recommend designing the physical crossover such that you can pull off a jumper to turn off each driver, so you can look at each driver's acoustic output separately with the calibrated mic and pink noise. What I always find is that the drivers don't turn on and off at quite the frequency I had calculated them to. The math is allegedly accurate, but drivers are reactive devices with variable impedance over frequency. Does this matter much? It can. With a one pole crossover, you generally wire the drivers (woofer and tweeter) in phase with each other because there's only a theoretical 90 degree phase shift at the crossover frequency between the drivers (assuming the crossover circuit is very accurate). The reactance of the drivers will often add or subtract from that theoretical phase shift some. If the actual phase shift at the crossover frequency goes beyond 90 degrees, then swapping the phase of one driver relative to the other will result in a smaller acoustic cancellation at the crossover frequency when the output of the two drivers add in the air (at your ear). If there's an overlap in the two frequency bands due to a sloppy crossover calibration, the amplitude response may not show it as a major anomaly (compared with other anomalies), but that can do significant damage to stereo imaging in the crossover frequency region.
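The effect of that extra phase shift can be sketched with simple phasor addition. This assumes each driver is exactly 3 dB down at the crossover frequency (the ideal one pole case) and that the two outputs arrive at the mic at the same time. Both are idealizations, and the function name is only for illustration.

```python
import cmath
import math

def summed_level_db(phase_diff_deg, driver_level_db=-3.0103):
    """Level in dB of two equal-level outputs summed with a given phase difference."""
    amp = 10.0 ** (driver_level_db / 20.0)
    total = amp + amp * cmath.exp(1j * math.radians(phase_diff_deg))
    return 20.0 * math.log10(abs(total))

for phase in (90, 120, 150):
    normal = summed_level_db(phase)
    reversed_polarity = summed_level_db(phase + 180)  # one driver's leads swapped
    print(f"{phase} deg apart: normal {normal:+.2f} dB, reversed {reversed_polarity:+.2f} dB")
```

At exactly 90 degrees both polarities sum to about 0 dB. Once the difference grows to 120 degrees, normal polarity gives a dip of about 3 dB while reversed polarity gives a bump of a bit under 2 dB, which is why it's worth trying both on the bench.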

This may not be a huge problem, but worth paying attention to in the final cal'd mic tests. At any crossover frequency, there will be a beaminess that moves up and down (if the drivers are arranged vertically) due to phase shift introduced by the reactive components, as you sweep a sine wave through the crossover region of frequency. You'll get that effect when everything is accurate, but with an overlap it will be worse (woofer rolls off at 4kHZ and tweeter rolls up at 2kHZ for ex.). It will cause this varying beaminess to occur over a wider range of frequency (maybe 3 octaves instead of 2). Room acoustics will grab this and run with it (make it worse). If the crossover is in the upper-midrange frequencies, as it often is, and one side does it differently than the other (likely), stereo effect imaging will be compromised significantly.

So you build the whole thing based on your measurements and calculations, and then use the jumpers to verify that each driver is rolling off as it was designed to do with your crossover network. When it's wrong, you modify some part values to get it close enough (but changing any part value in a passive crossover may affect the function of other parts in that circuit). After you get satisfied with the amplitude response of each driver, you put all the jumpers back on and look for nulls in the acoustic response at each of the crossover frequencies. Now you reverse the phase of the tweeter to see if the null is deeper or shallower. Shallower is what you want. Predicting which phase will work best is difficult, since the mechanical characteristics of the drivers may introduce phase shift, and that may even be affected by the enclosure as well. The 4 pole active crossover has such a tight rolloff, you don't need to care as much about crossover region anomalies. If you go into mass production of a speaker system with a passive crossover, you have to worry about tolerances of all parts involved, including the drivers themselves. Having said all of this, there are many speaker systems out there with passive crossovers that sound VERY good.

I used to think that building an active crossover and having to get more poweramps was definitely more hassle than just going the passive crossover route. In the case of a small 2 way speaker I'd go with passive. If I wanted a truly full range speaker system (good low bass too) that could go fairly loud, I'd bi-amp or tri-amp. Maybe I'm more picky than most people.

Some engineers use hard cone woofers and/or midrange drivers for better "resolution", but they all have a severe resonance in the upper midrange frequencies. They often use what's called a "Zobel" filter in the passive crossover network to null out the problem resonant frequency. That's good on paper, but what if the driver characteristics drift over time such that the Zobel null is off a bit in frequency?! Although seemingly not likely, you could end up with a peak right next to a null. I know people who believe you need to "break in" a driver by running high power sinewaves through it for many days continuously, before the specs become pretty much stable over time. I hope they are wrong. I haven't personally verified whether that actually makes a significant difference, but it does raise the question. It could affect the low end FR of the driver, but seems unlikely to me to be much of an issue. I ignore it.

The formulas for designing a 1 pole passive crossover are as follows:

Highpass filter: Xc = 1 / (2 x pi x freq x C), with C in farads, or C = 1 / (2 x pi x freq x Xc). Xc is the impedance as seen by the capacitor.

Lowpass filter: XL = 2 x pi x freq x L, with L in henries, or L = XL / (2 x pi x freq). XL is the impedance as seen by the coil.
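The two formulas above can be sketched as a pair of small functions. The 8 ohm, 3kHZ numbers are a made-up example, not a recommendation. In practice you would plug in the impedance you measured at the crossover frequency.

```python
import math

def highpass_capacitor_farads(xc_ohms, freq_hz):
    """C = 1 / (2 x pi x freq x Xc), with Xc the impedance the capacitor sees."""
    return 1.0 / (2.0 * math.pi * freq_hz * xc_ohms)

def lowpass_inductor_henries(xl_ohms, freq_hz):
    """L = XL / (2 x pi x freq), with XL the impedance the coil sees."""
    return xl_ohms / (2.0 * math.pi * freq_hz)

# Hypothetical example: both drivers measure 8 ohms at a 3 kHz crossover.
c = highpass_capacitor_farads(8.0, 3000.0)
l = lowpass_inductor_henries(8.0, 3000.0)
print(f"C = {c * 1e6:.2f} uF, L = {l * 1e3:.3f} mH")
```

For that example the values come out to roughly 6.6 uF and 0.42 mH, so you would pick the nearest standard parts and then verify with the mic.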

I always use the cheapest coils since I don't believe that having expensive extra heavy gauge flat wire made with special metals makes any real difference.

Ferrite core inductors (properly designed and rated) should be fine, and will be less reactive with other nearby coils.

Measure the DC resistance of any crossover coil and add that to the impedance of the driver and any efficiency matching resistors, when calculating what the crossover frequency will actually be.

Coils can be very sensitive to magnetic fields and can crosstalk with each other. Keep them several inches away from each other and speaker magnets.

Since driver impedance varies with frequency, and is often pretty far different than the rated "nominal" impedance at the frequency you want to have the crossover at, the final result can be pretty far off if you don't take that into account.

A 2 pole passive crossover is even more sensitive to this, so I won't encourage the 2 pole here. Many so-called 2 pole passive crossovers are actually using the mechanical acoustic rolloff of the driver as one of the poles.
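To see how much the coil's DC resistance matters in the one pole lowpass case, here is a rough sketch. It treats the driver as a pure resistance and assumes any efficiency matching resistor is wired in series. The 0.424 mH coil, 8 ohm driver and 0.5 ohm coil resistance are made-up example values.

```python
import math

def lowpass_corner_hz(inductance_h, driver_ohms, coil_dcr_ohms=0.0, pad_ohms=0.0):
    """Corner frequency of a series coil feeding the total series resistance."""
    total_ohms = driver_ohms + coil_dcr_ohms + pad_ohms
    return total_ohms / (2.0 * math.pi * inductance_h)

def dcr_loss_db(driver_ohms, coil_dcr_ohms):
    """Broadband level lost in the coil's DC resistance (a simple voltage divider)."""
    return 20.0 * math.log10(driver_ohms / (driver_ohms + coil_dcr_ohms))

ideal = lowpass_corner_hz(0.424e-3, 8.0)
real = lowpass_corner_hz(0.424e-3, 8.0, coil_dcr_ohms=0.5)
print(f"ideal corner {ideal:.0f} Hz, with coil resistance {real:.0f} Hz")
print(f"level lost in the coil: {dcr_loss_db(8.0, 0.5):.2f} dB")
```

In this example the corner moves up by a couple hundred Hz and the woofer loses about half a dB, small but enough to notice when matching levels to the tweeter.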

How to measure driver impedance at a specific frequency:

Connect a sinewave generator to the poweramp, and set the generator to the frequency in question. Wire a 50 ohm (or so) variable resistor in series with the driver. This can (and should) be done at a very low power level (one or two watts). Adjust the pot such that the voltage drop across the driver is the same as the voltage drop across the resistor. Unhook the pot and measure it with a DC ohm meter. Done. This is mostly accurate, but if you're at a frequency where the impedance is changing fast over frequency, the final result can cause the math to be off a bit, but this gets you about as close as you can get on paper. The crossover components (coils and caps) are sensitive to the impedance at the crossover frequency, but also the impedance anywhere near that same frequency. Final testing with a pink noise signal and a calibrated mic and RTA (spectrum analyzer) will show if this is still an issue. You might then want to tweak the value of some caps or coils to get it perfect.
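The equal-voltage trick works because the same current flows through the resistor and the driver, so the impedance magnitude is the resistor value times the ratio of the two voltage drops. A variation on the method above (my own assumption, not the procedure as described) is to use a fixed resistor and do the division yourself. The voltmeter has to be an AC meter that is accurate at the test frequency.

```python
def impedance_magnitude_ohms(series_resistor_ohms, volts_across_driver, volts_across_resistor):
    """Driver impedance magnitude from two AC voltage readings in a series circuit."""
    if volts_across_resistor <= 0:
        raise ValueError("voltage across the resistor must be positive")
    return series_resistor_ohms * volts_across_driver / volts_across_resistor

# Hypothetical readings: 10 ohm resistor, 0.6 V across the driver, 0.4 V across the resistor.
print(f"{impedance_magnitude_ohms(10.0, 0.6, 0.4):.1f} ohms")
```

When the two voltages are equal the ratio is 1 and the answer is just the resistor value, which is the pot method.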

Testing/Verification

Again, pink noise as a test signal is transient in nature, so although it's apparently the more common and popular test method for looking at the frequency response of speakers using a calibrated mic and any variation of a real time analyzer display device, it won't show you much about resonant conditions in the speaker itself or the room acoustics, like tone bursts can. This is because any resonance has a start up time and a decay time. Its full significance can only be seen if it gets stimulation for long enough to reach its peak level.
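For anyone who wants to try tone bursts, here is a minimal sketch of generating one with a Gaussian-shaped envelope. The 8 cycle length, the 48 kHz sample rate and the envelope width are my assumptions for illustration. Writing the samples to a sound file or a DAC is left out.

```python
import math

def gaussian_tone_burst(freq_hz, cycles=8, sample_rate=48000):
    """Return a list of samples: a sine wave shaped by a Gaussian envelope."""
    n = int(round(cycles / freq_hz * sample_rate))
    center = (n - 1) / 2.0
    sigma = n / 6.0  # envelope is down to roughly 1% at the burst edges
    samples = []
    for i in range(n):
        envelope = math.exp(-0.5 * ((i - center) / sigma) ** 2)
        samples.append(envelope * math.sin(2.0 * math.pi * freq_hz * i / sample_rate))
    return samples

burst = gaussian_tone_burst(1000.0)
print(len(burst), "samples, peak", round(max(abs(s) for s in burst), 3))
```

The smooth ramp up and down keeps the burst's energy concentrated near the test frequency, so what you see after the burst ends is mostly the ringing of the speaker or room rather than the test signal itself.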

McIntosh and AR (Acoustic Research) back in the old days (1960's and 70's) dug a hole in the ground outside in a field, far away from any building, put the speaker system cabinet in the hole aimed straight up, positioned a calibrated mic several feet above it, and called that their anechoic chamber for verifying the frequency response of the speaker. If it's not too windy and there's very little noise around from other sources, that's pretty legitimate (although it ignores baffle step response). Taking measurements inside your living room is highly unlikely to be legitimate. I've learned this the hard way a few times. Yes there is sophisticated equipment and/or software that after hours and hours of set up and verification, may give you good enough results indoors (gated or windowed bursts for mid and high frequencies, and close mic for bass - then merging the two graphs). In a production environment this test equipment makes sense once all idiosyncrasies are understood and taken into account. For a one-off hobbyist project I'd go for the shovel.

There are those who consider their ears to be one of the best ways to judge a speaker. None of the Engineers I worked with at Dolby Labs agree with that. The audio memory is very weak, the characteristic of the ear varies over time and temperature, and the slightest change in listening room acoustics and/or visuals can sway one's opinion. Inner ear air pressure is a normal condition that varies over time, and changes your perception significantly (according to an Audiologist I got tested by and talked with). The brain somewhat adapts to this, but it's a variable that's hard to take into account with any accuracy or consistency. If you compare speakers in different rooms you'll be pretty far off due to the effects of the room acoustics. Test equipment has its limitations too (mostly in how it's used), so the ear does have its place in the process, but as a secondary tool, more for judging how a speaker interacts with the room acoustics after all other efforts are done.

The Cumulative Spectral Decay (CSD) Graph

I thought this was the coolest thing ever when I first heard about it, since it showed not only the steady state frequency response but also any ringing over frequency on the Z axis (time), when the driver is turned off abruptly. But then I found out how these darn things work (Linkwitz explained it clearly, as opposed to all other Engineers I talked with about it). At the highest frequencies the CSD graph can be relatively accurate, but below about 1kHZ the time window on the Z axis is often shorter than one period (one full sinewave) of the frequency being analyzed (approx. 1 ms in the case of 1kHZ). So the method of computation is thrown off completely and the results are erroneous. The way this should have been done is to use tone bursts every 6th octave (with Gaussian or equivalent burst envelopes) at the lower frequencies, so there would always be several cycles (required for ringing analysis) of each frequency to be analyzed and displayed.
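The window problem is easy to check with arithmetic: the number of cycles that fit in the window is just frequency times window length. The sketch below also lists 6th octave test frequencies for a tone burst sweep. The 1 ms window and the 100HZ to 800HZ range are made-up example values.

```python
def cycles_in_window(freq_hz, window_s):
    """How many full periods of freq_hz fit inside a time window."""
    return freq_hz * window_s

def sixth_octave_frequencies(start_hz, stop_hz):
    """Frequencies from start_hz up to stop_hz, spaced one sixth of an octave apart."""
    freqs = []
    step = 0
    while True:
        f = start_hz * 2.0 ** (step / 6.0)
        if f > stop_hz * 1.0001:  # small tolerance so the end point is included
            break
        freqs.append(f)
        step += 1
    return freqs

for f in (200.0, 1000.0, 10000.0):
    print(f"{f:.0f} Hz: {cycles_in_window(f, 0.001):.1f} cycles in a 1 ms window")
print([round(f) for f in sixth_octave_frequencies(100.0, 800.0)])
```

With a 1 ms window, 10kHZ gets about ten cycles to work with, 1kHZ gets one, and 200HZ gets a fifth of a cycle, which is nowhere near enough to say anything about ringing.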

Surround Sound

In a dedicated theater room, there can be a good place for each speaker to be, and room for side and rear speakers to be larger and have a good low frequency response which is highly desirable for several reasons. David Griesinger points out in some of his papers that side and rear speakers will be much more effective and enjoyable if they have a good frequency response in the lower midrange and bass frequencies. In a typical living room, it's often much harder to find a place for all these extra speakers, and they will often need to be small speakers that barely make it down to 100HZ. Is it still worth it? Yea kinda. BUT, if you've got five surround speakers running, and you're watching your TV, and the commercials come on, or a Youtube video ends and the automatic next one is 10dB louder, it can be extremely annoying. I've kinda gone full circle on this. It's rare that I actually want to be surrounded by sound. During movies maybe, the rest of the time no.

More Info:

The linkwitzlab.com website is huge and full of good info on everything related to audio reproduction. One of the best websites on audio there is, but may be a bit long winded and overly detailed for most people. Perhaps a bit biased toward OB speakers.

David Griesinger is another expert on anything to do with recording techniques (especially binaural recording), surround sound issues, or reproduction of audio in general. One of the best there is. He's been called the pioneer of electronic reverb synthesis, and was one of the main brains behind Lexicon products for many years. Google his name and read his many papers.

Zaphaudio is another website that I have a lot of respect for.

Roger Russell, formerly of McIntosh, now retired, has a great website that among other things will educate you on line array speakers, which he totally believes in and produces now in his retirement. In the right room, this could be the way to go. Google his name.

diyaudio.com has many interesting discussions on everything audio, and many of the contributors are highly educated professionals, but be careful what you believe. Many contributors are not very well educated, but still love to spew their opinions.

One of the most recognized books on the subject is, The Loudspeaker Design Cookbook, by Vance Dickason.

Apparently Madisound offers passive crossover design help for any drivers you might pick. If they actually measure the impedance of the drivers at the crossover frequencies, that would be the way to go for most builders.

Using published impedance graphs is way better than just using the "nominal" impedance ratings. They may or may not be very accurate though.

Speaker Cables:

I use 16 gauge AC line cord copper strand wire bought at the hardware store for about 50 cents a foot, unless the wire is more than about 12 feet long, then I'll go 14 gauge. The wire itself is highly unlikely to be a significant weak link. Connectors, and how the wires are attached to these connectors can be significant distortion causing "mechanisms" in some cases. If the speaker wire is too thin, two things happen: you lose a small percent of wattage in the cable (usually less than a dB), and the damping factor of the amplifier on the speaker coils is lowered. Damping factor is how tightly controlled the speaker is, by the amp. Theoretically you would think that the highest damping factor possible would be the best, and the highest fidelity, which it arguably technically is. But since the damping formula includes the DC resistance of the voice coil (usually at least 5 ohms), putting an additional 0.1 ohms in series with that makes a very small difference. If you've spent more than about $50 on speaker wires, you got jerked. (Sorry guys) I believe that the power of suggestion is stronger than the difference between regular 16 AWG hardware store wire, and whatever is the world's most expensive speaker cables.

If you compare "high-end" ridiculously expensive cables with the 16 gauge hardware store wires, you might think the expensive cable sounds better, and here's why: The thinner wire will likely have a loss of a fraction of a dB compared to the expensive heavier cable. Not enough to be consciously noticeable, but enough to make the expensive cable a tiny bit louder, which will seem to sound better because of that. The change in damping factor could change the sound a small amount too, but it's anybody's guess if that makes it better or worse, and the change is awfully tiny. High damping factor appears to be more associated with a "clinically clean" dry sort of sound that a lot of people don't like, and the looser lower damping factor is said to give the sound a slightly warmer organic quality, which the tube crowd loves. Bottom line: don't waste too much time and money on cables. Speakers, room acoustics, and quality of source material are better things to work on. Those are very likely the weak links in any Hi-Fi system. Using 99% isopropyl alcohol or "contact cleaner" on any connectors is a good idea. Molecular migration over time can cause non-linearities between contact surfaces.
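The "fraction of a dB" figure is easy to check. The sketch below uses the standard published resistance of copper wire (about 4.0 ohms per 1000 feet for 16 AWG and about 2.5 for 14 AWG at room temperature) and treats the speaker as a plain resistor, which is a simplification. The 12 foot run and 8 ohm load are example values.

```python
import math

# Standard copper wire resistance in ohms per 1000 feet at about 20 degrees C.
OHMS_PER_1000_FT = {16: 4.016, 14: 2.525}

def cable_resistance_ohms(awg, one_way_length_ft):
    """Round-trip resistance: the current goes out on one conductor and back on the other."""
    return OHMS_PER_1000_FT[awg] * (2.0 * one_way_length_ft) / 1000.0

def cable_loss_db(cable_ohms, speaker_ohms):
    """Level lost at the speaker terminals, treating the speaker as a resistor."""
    return 20.0 * math.log10(speaker_ohms / (speaker_ohms + cable_ohms))

r16 = cable_resistance_ohms(16, 12.0)
print(f"12 ft of 16 AWG: {r16:.3f} ohms, loss into 8 ohms: {cable_loss_db(r16, 8.0):.2f} dB")
```

A 12 foot run of 16 gauge comes out just under 0.1 ohms and costs roughly a tenth of a dB into 8 ohms, which lines up with the 0.1 ohm figure used above.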

When preparing speaker cables, it's very wise to saturate the bare wire ends (twisted) with flux and solder, so over large amounts of time the various strands stay in good electrical contact with each other. Most metals oxidize to some extent when they are exposed to oxygen (gold doesn't but it's very soft and gold plating can rub off after several reconnections). I have actually seen a case where the individual strands of copper wire got so oxidized over time, that they had developed a very non-linear electrical connection to each other. The wire was 50 years old and had endured many significant temperature and humidity changes over time. The result was easily audible very bad distortion. When I heard it, I thought the power amp had blown a transistor.

Connectors are the part of speaker cables that are much more likely to be a weakness. I prefer Banana plugs and jacks where you can solder the wire to the connector pins, rather than just the little set-screw arrangement that could come loose over time. Banana connectors have good amounts of surface area doing the connection. Some products I see out there, especially low cost and portable ones, use connectors that have only the tiniest amount of surface area doing the electrical connection. The Pro-Sound market has some fancier, probably better connectors, but I haven't looked into that much. Probably very expensive, and unnecessary for home audio.

Siegfried Linkwitz appears to be one of the best Audio Electronic Engineers I know of in the audio field. His website, linkwitzlab.com, is packed with great info for any speaker builder. His recommendations for speaker wire are as follows:

Check out his website for more info on cables, Rf susceptibility etc.

Building my own speakers used to be how I afforded good quality speakers. Nowadays you can get decent speakers pretty cheap, if you're not too picky.

But researching, designing and building really great speaker systems has been a very fun hobby for me.