
spectral compression explained

how spectral compression works, why it differs from multiband compression, and when per-frequency dynamic control gives you results that broadband and multiband compressors cannot achieve.

the problem with one gain knob

a standard compressor has one gain reduction stage. when a loud snare hit triggers the compressor, the entire signal gets turned down: the vocal, the guitar, the cymbal shimmer, everything. the snare is controlled, but the rest of the mix ducks with it.

multiband compression improves on this by splitting the signal into 3 to 6 bands with crossover filters. each band has its own compressor. a loud snare in the low-mid band triggers gain reduction in that band only, leaving the highs and lows alone.

spectral compression takes this further. if a multiband compressor is a magnifying glass, a spectral compressor is a microscope. instead of 4 bands, it operates on hundreds or thousands of individual frequency regions simultaneously. each region has its own compressor with its own threshold, ratio, and timing. a resonance at 800 Hz triggers gain reduction at 800 Hz. the vocal at 2 kHz, the cymbal at 12 kHz, and the bass at 80 Hz are unaffected.

the difference is not subtle. it changes what compression can do.

three approaches to compression on the same signal. broadband compression (one gain change) affects everything. multiband (4 bands) controls each band independently. spectral (hundreds of bands) controls every frequency region independently.

how spectral compression works

the signal flow has five stages. each stage solves a specific problem.

stage 1: STFT analysis

the input audio is decomposed into individual frequency components using the Short-Time Fourier Transform (STFT). a sliding window (typically 4096 samples, with 75% overlap between adjacent windows) captures short snapshots of the audio. each snapshot is transformed into a complex spectrum: magnitude (how loud each frequency is) and phase (where in the wave cycle each frequency sits).[^1]
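as a rough illustration of this stage, the sketch below frames a signal with a Hann window and converts each frame to magnitude and phase. it uses only the Python standard library, and the names `hann`, `fft`, and `stft` are placeholders chosen for this example, not the API of any particular plugin or library. a production implementation would use an optimized FFT.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def stft(signal, size=4096, overlap=0.75):
    """Return one (magnitudes, phases) pair per analysis frame."""
    hop = int(size * (1 - overlap))
    window = hann(size)
    frames = []
    for start in range(0, len(signal) - size + 1, hop):
        chunk = [signal[start + i] * window[i] for i in range(size)]
        spectrum = fft(chunk)[: size // 2 + 1]  # keep only the unique bins
        frames.append(([abs(c) for c in spectrum],
                       [cmath.phase(c) for c in spectrum]))
    return frames


# Test tone (assumed for this example): a 1 kHz sine at 44.1 kHz.
fs = 44100
tone = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(8192)]
frames = stft(tone)
mags, phases = frames[0]
peak_bin = max(range(len(mags)), key=mags.__getitem__)
```

for the 1 kHz tone, `peak_bin` lands on bin 93, which matches 1000 Hz divided by the bin spacing of roughly 10.77 Hz.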

the FFT window size determines the frequency resolution. a 4096-point FFT at 44.1 kHz gives approximately 10.77 Hz per bin: enough to distinguish a C4 (261.6 Hz) from a C#4 (277.2 Hz). larger windows give finer resolution but add more latency.
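the trade-off between resolution and window length is simple arithmetic. the short sketch below tabulates it for a few common window sizes; the function names are made up for this example, and the window duration is shown only as a rough guide to latency, since real plugins may add or hide some delay.

```python
def bin_width_hz(fft_size, sample_rate=44100):
    """Spacing between adjacent FFT bins in Hz."""
    return sample_rate / fft_size


def window_length_ms(fft_size, sample_rate=44100):
    """Duration of one analysis window in milliseconds."""
    return 1000.0 * fft_size / sample_rate


def unique_bins(fft_size):
    """Number of non-redundant bins for a real input signal."""
    return fft_size // 2 + 1


for size in (1024, 2048, 4096, 8192):
    print(f"{size:5d} points: {bin_width_hz(size):6.2f} Hz per bin, "
          f"{window_length_ms(size):6.1f} ms window, {unique_bins(size)} bins")

# C4 and C#4 are about 15.6 Hz apart, which is wider than one 4096-point bin.
semitone_gap_hz = 277.2 - 261.6
```

doubling the window halves the bin spacing and doubles the window duration, which is why the choice of 4096 samples is a compromise rather than an optimum.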

stage 2: frequency grouping

raw FFT bins are too numerous and too narrow for practical compression. a 4096-point FFT produces 2049 unique bins. compressing each bin independently would require 2049 simultaneous compressors and would produce audible artifacts (metallic “musical noise”) because adjacent bins would compress at different rates.

the solution: group the bins into perceptually meaningful bands. the ERB (equivalent rectangular bandwidth) scale, derived from Glasberg and Moore’s 1990 research on human auditory filters,[^2] groups frequencies into bands that match how your ears actually resolve frequency. approximately 40 ERB bands cover the full audible range (20 Hz to 22 kHz). the bands are narrow in absolute terms where the ear resolves frequency finely (about 35 Hz wide at 100 Hz, about 130 Hz wide at 1 kHz) and widen steadily toward the top of the spectrum (over 2 kHz wide at 20 kHz).

ERB bands match human hearing. the bands stay narrow through the lows and mids and widen toward the top octaves, so compression control is densest where your ears resolve frequency most finely. linear bands (used by some spectral processors) distribute control uniformly in Hz regardless of perception.
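one way to build such a grouping is sketched below, using the Glasberg and Moore formulas for ERB bandwidth and ERB number. the band count of 40, the 20 Hz to 22 kHz range, and the 4096-point FFT come from the text above; the function names and the rule of assigning each bin by its centre frequency are assumptions made for this example, not a description of any specific product.

```python
import math


def erb_bandwidth_hz(f):
    """Equivalent rectangular bandwidth at frequency f (Glasberg and Moore)."""
    return 24.7 * (0.00437 * f + 1.0)


def hz_to_erb_number(f):
    return 21.4 * math.log10(1.0 + 0.00437 * f)


def erb_number_to_hz(e):
    return (10.0 ** (e / 21.4) - 1.0) / 0.00437


def erb_band_edges(f_lo=20.0, f_hi=22000.0, n_bands=40):
    """Band edges in Hz, equally spaced on the ERB-number scale."""
    e_lo, e_hi = hz_to_erb_number(f_lo), hz_to_erb_number(f_hi)
    step = (e_hi - e_lo) / n_bands
    return [erb_number_to_hz(e_lo + i * step) for i in range(n_bands + 1)]


def group_bins(edges, fft_size=4096, sample_rate=44100):
    """Assign each FFT bin to the band that contains its centre frequency."""
    groups = [[] for _ in range(len(edges) - 1)]
    band = 0
    for k in range(fft_size // 2 + 1):
        f = k * sample_rate / fft_size
        if f < edges[0] or f >= edges[-1]:
            continue  # outside the covered range
        while f >= edges[band + 1]:
            band += 1
        groups[band].append(k)
    return groups


edges = erb_band_edges()
groups = group_bins(edges)
widths = [hi - lo for lo, hi in zip(edges, edges[1:])]
```

with these settings the lowest band holds only a handful of bins while the highest holds hundreds, which shows how unevenly a perceptual scale divides a linear FFT grid.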

stage 3: per-band level detection

each band gets its own level detector. the detector measures the signal level in that band over time, producing a smooth envelope that the gain computer can act on.

the detection type matters. RMS detection responds to the average level, making the compression smoother and more transparent. peak detection responds to transients, making it punchier. some spectral compressors use crest factor (the ratio of peak to RMS) to adapt the detection behavior based on the signal content.
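a minimal sketch of a per-band detector follows. it measures RMS or peak level from the FFT magnitudes of one band, then smooths the result over successive STFT frames with separate attack and release times. the class and function names, the time constants, and the choice to smooth in the dB domain are all assumptions made for illustration.

```python
import math


def band_rms_db(magnitudes, bins):
    """Average (RMS) level of one band in dB."""
    power = sum(magnitudes[k] ** 2 for k in bins) / len(bins)
    return 10.0 * math.log10(power + 1e-12)


def band_peak_db(magnitudes, bins):
    """Level of the strongest bin in the band, in dB."""
    return 20.0 * math.log10(max(magnitudes[k] for k in bins) + 1e-12)


class EnvelopeFollower:
    """One-pole smoother with separate attack and release times."""

    def __init__(self, attack_ms, release_ms, frame_rate_hz):
        self.attack = math.exp(-1000.0 / (attack_ms * frame_rate_hz))
        self.release = math.exp(-1000.0 / (release_ms * frame_rate_hz))
        self.level_db = -120.0

    def process(self, level_db):
        # Rising input uses the attack coefficient, falling input the release.
        coeff = self.attack if level_db > self.level_db else self.release
        self.level_db = coeff * self.level_db + (1.0 - coeff) * level_db
        return self.level_db


frame_rate = 44100 / 1024  # 4096-point window, 75% overlap -> hop of 1024
follower = EnvelopeFollower(attack_ms=10.0, release_ms=200.0,
                            frame_rate_hz=frame_rate)
loud_then_quiet = [-10.0] * 5 + [-60.0] * 5
envelope = [follower.process(level) for level in loud_then_quiet]
```

the envelope jumps up to the loud level within a few frames and then decays slowly, which is the behaviour the gain computer in the next stage relies on. one detector like this would run per band.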

stage 4: per-band gain computation

each band’s detected level is compared against a threshold. bands that exceed their threshold receive gain reduction according to a ratio and knee curve, just like a standard compressor. bands below the threshold are left alone (in downward compression) or boosted (in upward compression).
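the gain computation for a single band can be sketched as a static curve. the downward version below uses a common textbook soft-knee formulation, and the upward version is a simple hard-knee variant with a boost limit. the parameter names, the default knee width, and the 12 dB boost cap are assumptions for this example, not the settings of any real compressor.

```python
def downward_gain_db(level_db, threshold_db, ratio, knee_db=6.0):
    """Gain change in dB (zero or negative) for a level above the threshold."""
    over = level_db - threshold_db
    slope = 1.0 / ratio - 1.0  # negative for ratios above 1
    if knee_db > 0 and abs(over) <= knee_db / 2:
        # Quadratic blend between no compression and full ratio.
        return slope * (over + knee_db / 2) ** 2 / (2 * knee_db)
    if over > 0:
        return slope * over
    return 0.0


def upward_gain_db(level_db, threshold_db, ratio, max_boost_db=12.0):
    """Gain change in dB (zero or positive) for a level below the threshold."""
    under = threshold_db - level_db
    if under <= 0:
        return 0.0
    return min(under * (1.0 - 1.0 / ratio), max_boost_db)


# Hypothetical smoothed band levels in dB: kick, snare, cymbals.
band_levels = [-6.0, -10.0, -30.0]
gains = [downward_gain_db(level, threshold_db=-18.0, ratio=4.0)
         for level in band_levels]
```

in this example the kick band is 12 dB over the threshold and is reduced by 9 dB at a 4:1 ratio, while the cymbal band sits below the threshold and receives no gain change.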

this is where spectral compression diverges fundamentally from multiband. in multiband compression, a loud transient in one part of a band triggers gain reduction across the entire band, including frequency content that was not problematic. in spectral compression, the gain reduction is precise to the frequency region that actually exceeded the threshold.

per-band gain reduction applied by a spectral compressor on a drum bus. gain reduction concentrates on the loudest frequency regions (kick at 80-100 Hz, snare at 200-800 Hz) while leaving the cymbals and high-frequency detail untouched.

stage 5: synthesis

the gain-modified magnitudes are recombined with the original phase information and transformed back to the time domain using the inverse STFT (ISTFT). the overlap-add method reconstructs the continuous output signal from the overlapping analysis windows.

a critical detail: phase is preserved from the original signal. only magnitudes are modified. this avoids the phase artifacts that multiband crossover filters introduce.
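the sketch below shows one way the synthesis stage can be written. each bin is multiplied by a real gain, which scales its magnitude and leaves its phase untouched, and the frames are then rebuilt with windowed overlap-add. the small frame size of 64 samples, the function names, and the normalisation by the summed squared window are choices made for this example only.

```python
import cmath
import math


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2], inverse), fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out


def process(signal, bin_gains, size=64, hop=16):
    """STFT -> scale each bin by a real gain -> ISTFT with overlap-add."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / size) for i in range(size)]
    out = [0.0] * len(signal)
    norm = [0.0] * len(signal)
    for start in range(0, len(signal) - size + 1, hop):
        spectrum = fft([signal[start + i] * window[i] for i in range(size)])
        for k in range(size // 2 + 1):
            spectrum[k] *= bin_gains[k]  # real gain: phase is unchanged
            if 0 < k < size // 2:
                # Mirror to the negative frequency so the output stays real.
                spectrum[size - k] = spectrum[k].conjugate()
        frame = [c.real / size for c in fft(spectrum, inverse=True)]
        for i in range(size):
            out[start + i] += frame[i] * window[i]  # synthesis window
            norm[start + i] += window[i] ** 2
    return [o / n if n > 1e-9 else 0.0 for o, n in zip(out, norm)]


signal = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
unity = process(signal, [1.0] * 33)      # no gain change
gains = [1.0] * 33
for k in (7, 8, 9):                      # the tone and its window sidelobes
    gains[k] = 0.5
halved = process(signal, gains)
```

with unity gains the fully overlapped part of the output matches the input to within rounding error, and halving the three bins that carry the windowed tone halves the output without shifting it in time.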

complete spectral compression pipeline from input to output.

why multiband crossovers cause problems

multiband compressors split the signal into bands using crossover filters: typically Linkwitz-Riley or Butterworth designs at 2 to 5 crossover frequencies. these filters are not perfect separators. energy leaks between bands near the crossover points, and the phase response of the filters creates audible artifacts when the bands are recombined.

the result: a “smeared” quality around the crossover frequencies, especially when different bands apply different amounts of gain reduction. if the low band compresses 6 dB while the low-mid band compresses 2 dB, the crossover region experiences an uneven gain transition that was not in the original signal.
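the gain transition can be estimated with a static model. the sketch below assumes a two-band split with a 4th-order Linkwitz-Riley crossover and fixed gain per band, then evaluates the recombined level at a few frequencies. it ignores time-varying gain and any phase compensation a real multiband compressor might apply, and the function names are invented for this example.

```python
import math


def lr4_split(f, fc):
    """Complex low-pass and high-pass responses of an LR4 crossover at fc."""
    s = 1j * f / fc
    den = (s * s + math.sqrt(2.0) * s + 1.0) ** 2
    return 1.0 / den, s ** 4 / den


def recombined_gain_db(f, fc, low_gain_db, high_gain_db):
    """Level of the summed bands at frequency f, relative to the input."""
    low, high = lr4_split(f, fc)
    g_low = 10.0 ** (low_gain_db / 20.0)
    g_high = 10.0 ** (high_gain_db / 20.0)
    return 20.0 * math.log10(abs(g_low * low + g_high * high))


# Low band compressed by 6 dB, low-mid band by 2 dB, crossover at 500 Hz.
for f in (100, 250, 400, 500, 630, 1000, 2000):
    print(f"{f:5d} Hz: {recombined_gain_db(f, 500.0, -6.0, -2.0):6.2f} dB")
```

with equal gains the two bands sum to a flat response. with unequal gains the level slides from one band's gain to the other's across roughly an octave either side of the crossover, so content in that region receives a gain that neither compressor chose.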

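spectral compression avoids crossover artifacts entirely. there are no crossover filters. the signal is decomposed by FFT, which is an invertible transform: with a suitable window and overlap, an unmodified STFT followed by the ISTFT reconstructs the input to within floating-point precision. the artifacts that remain come from the resolution limit of the FFT window size and from how sharply the gain changes between neighbouring bins and frames, not from filter phase.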

the crossover artifact in practice

set up a multiband compressor with a crossover at 500 Hz. apply heavy compression to the low band only. listen to the output and sweep a sine wave from 200 Hz to 800 Hz. you will hear the level change abruptly as the sine passes through the crossover region. this is the artifact. in a spectral compressor, the same test produces a smooth, continuous gain transition because there is no crossover to create a boundary.

when spectral compression matters

spectral compression is not always the right tool. a broadband compressor is simpler, introduces less latency, and works perfectly when the entire signal needs the same dynamic treatment (a vocal that simply needs to sit more consistently in the mix, for example).

spectral compression matters when:

different frequency regions need different compression. a bass-heavy mix where the kick triggers pumping in the cymbals. a vocal where sibilance needs aggressive compression but the body needs gentle treatment. a mix bus where the low end needs glue but the top end needs to breathe.

upward compression is frequency-dependent. bringing up quiet high-frequency detail (room tone, cymbal sustain, vocal air) without also bringing up low-frequency rumble. this is per-band upward compression: each band has its own threshold for lifting quiet content.

you want compression that preserves the spectral balance. broadband compression changes the spectral balance because loud frequencies trigger gain reduction that affects quiet frequencies. spectral compression maintains the spectral balance by compressing each region independently.

before (grey) and after (magenta) spectral downward compression on a drum bus. the loud kick and snare regions are compressed while the cymbals and high-frequency detail are preserved.

key takeaway

spectral compression acts on every frequency region independently. multiband compression acts on 3 to 6 fixed bands. broadband compression acts on the entire signal at once. the more independently each frequency is controlled, the more transparent the compression can be on complex material, but the more latency and CPU it requires. the right approach depends on the material and the goal.

the connection to resonance suppression

if spectral compression sounds familiar, it should. resonance suppressors like soothe 2 and KERN SMOOTH are spectral compressors with a specific goal: suppress resonant peaks rather than control overall dynamics. the signal flow is nearly identical: STFT analysis, per-band level detection, per-band gain reduction, ISTFT synthesis.

the difference is calibration. a resonance suppressor sets its thresholds to detect spectral peaks relative to the local spectral envelope. a spectral compressor sets its thresholds based on absolute level and user-defined parameters like attack, release, ratio, and knee.
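the two calibrations can be contrasted in a few lines. the sketch below compares band levels against a fixed absolute threshold, and against a local spectral envelope estimated as a moving average over neighbouring bands. the envelope estimator, the 3 dB margin, and the example levels are assumptions chosen for illustration; commercial resonance suppressors do not publish their detection methods.

```python
def local_envelope_db(levels_db, radius=3):
    """Moving average across neighbouring bands as a crude spectral envelope."""
    envelope = []
    for i in range(len(levels_db)):
        lo, hi = max(0, i - radius), min(len(levels_db), i + radius + 1)
        envelope.append(sum(levels_db[lo:hi]) / (hi - lo))
    return envelope


def excess_over_absolute(levels_db, threshold_db):
    """Compressor-style detection: dB above a fixed threshold."""
    return [max(0.0, level - threshold_db) for level in levels_db]


def excess_over_envelope(levels_db, margin_db=3.0, radius=3):
    """Suppressor-style detection: dB above the local envelope plus a margin."""
    envelope = local_envelope_db(levels_db, radius)
    return [max(0.0, level - (env + margin_db))
            for level, env in zip(levels_db, envelope)]


# Hypothetical band levels in dB with one resonant band at index 3.
levels = [-20.0, -21.0, -22.0, -10.0, -24.0, -25.0, -26.0, -27.0]
compressor_hits = [i for i, e in enumerate(excess_over_absolute(levels, -23.0)) if e > 0]
suppressor_hits = [i for i, e in enumerate(excess_over_envelope(levels)) if e > 0]
```

on this example the absolute threshold flags every band that is simply loud (indices 0 to 3), while the envelope-relative detector flags only the band that sticks out from its neighbours (index 3).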

this connection means the engineering behind spectral resonance suppression translates directly to spectral dynamic compression. the same ERB frequency grouping, the same temporal smoothing, and the same phase-preservation techniques apply to both tasks.

note

if you have used a resonance suppressor, you already understand the core of spectral compression. the difference is what the gain computer is optimizing for: peak suppression (resonance) vs dynamic range control (compression).

frequently asked questions

what is spectral compression?

spectral compression decomposes audio into individual frequency components using FFT analysis, then applies independent compression to each frequency band. unlike broadband compression (one gain change for the entire signal) or multiband compression (3-6 fixed crossover bands), spectral compression can control every frequency region independently, with hundreds or thousands of separate compressor instances running simultaneously.

what is the difference between spectral compression and multiband compression?

multiband compression splits the signal into 3-6 bands using crossover filters. each band gets its own compressor, but within each band, the compression is broadband. spectral compression decomposes the signal into hundreds or thousands of frequency bins using FFT, then compresses each bin or perceptual group of bins independently. the key difference: multiband crossovers introduce phase artifacts at the crossover frequencies. spectral processing avoids crossover artifacts entirely by working in the frequency domain.

when should I use spectral compression instead of a normal compressor?

use spectral compression when a broadband compressor creates unwanted side effects: a loud snare triggering gain reduction that ducks the vocal, a bass note pumping the high-frequency air, or a resonance in one band masking detail in another. spectral compression acts only on the frequencies that exceed their individual thresholds, leaving everything else untouched.

does spectral compression add latency?

yes. the FFT analysis required for spectral decomposition introduces latency proportional to the FFT window size. a 4096-point FFT at 44.1 kHz adds approximately 93 ms of latency. this makes spectral compression unsuitable for live monitoring during recording, but it is transparent in a mixing context where DAW latency compensation handles it automatically.

what plugins use spectral compression?

FabFilter Pro-C 3 offers broadband and multiband compression (not true spectral). Sonible smart:comp 3 uses AI-driven spectral analysis. Harrison MPC uses spectral compression. the technique is also used in resonance suppressors like soothe 2 and KERN SMOOTH, which are effectively spectral compressors with thresholds calibrated to suppress resonances rather than control dynamics.

references

a note from the developer

building PUSH was the moment i realized that the architecture behind SMOOTH could do more than suppress resonances. the same FFT pipeline, the same ERB bands, the same per-bin gain smoothing. the difference was the intent: instead of reducing peaks that stick out, amplify content that falls behind. upward spectral compression.

the connection between resonance suppression and spectral compression was not obvious to me at first. it took building both before i saw the pattern: they are the same operation with the sign flipped. SMOOTH turns down what is too loud at each frequency. PUSH turns up what is too quiet. same engine, opposite directions. (sometimes the best ideas are just old ideas rotated 180 degrees.)

if spectral compression works differently in your workflow than i described, or if you have a use case i should know about, jonas@kernaudio.io.
