low-latency mixing: when 93ms matters and when it doesn't
why spectral plugins have latency, how much latency musicians can actually hear, how plugin delay compensation works, and when to use KERN LIVE mode for tracking.
the tracking problem
you are tracking a vocal. you want to hear compression in the headphones while the singer performs. you load your compressor. 93 milliseconds of delay. the singer hears themselves a beat behind. you bypass it. this is the latency problem.
here is what most latency discussions get wrong: they treat latency as one problem. it is two. during mixing, latency is invisible. your DAW compensates for it automatically, shifting every track so they line up perfectly on playback. you could have 200ms of plugin latency on the vocal bus and the mix would still be phase-aligned to the sample. the DAW handles it. you never notice.
during tracking, latency is a disaster. the performer sings into the microphone. the signal travels through the interface, into the DAW, through every plugin on the channel, and back out to the headphones. every millisecond of that round trip is a millisecond the performer hears between the physical act of singing and the sound arriving in their ears. at 93ms, they hear themselves nearly a tenth of a second late. try clapping your hands while hearing the clap 93ms after the impact. it breaks your timing.
this is why spectral plugins live on the mix bus and not on the tracking chain. not because they sound wrong at low latency. because 93ms of monitoring delay makes it physically difficult to perform.
key takeaway
93ms is a mixing feature. it is a tracking bug. the same latency that gives your spectral plugin better frequency resolution makes it unusable for the performer hearing themselves through headphones. your DAW fixes it for mixing. nobody fixes it for tracking.
why spectral plugins have latency
a standard EQ or compressor works sample by sample. audio comes in, gets processed, goes out. the delay is a handful of samples at most. your DAW barely notices.
spectral plugins work differently. they need to see a chunk of audio before they can do anything, because they need to decompose that chunk into individual frequencies. the tool they use is the Short-Time Fourier Transform, which converts a window of audio samples into a frequency spectrum. the plugin reads the spectrum, decides which frequencies need processing, applies gain changes per frequency band, and converts back to audio.
the window has to fill before the transform can run. a 4096-sample window at 44.1 kHz takes 92.9 milliseconds to fill. that is the latency. it is not a bug, and it is not inefficiency. it is physics. the plugin literally cannot know what frequencies are present until enough audio has arrived to measure them.
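a minimal sketch of why the window size sets a floor on latency. the `BlockCollector` class and `samples_until_first_window` helper are hypothetical names made up for illustration, and the sketch uses non-overlapping blocks for simplicity. real STFT engines typically overlap their windows, so treat this as a picture of the "window must fill first" constraint, not of any plugin's actual buffering.

```python
class BlockCollector:
    """collects incoming samples and releases them one full window at a time."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.buffer = []

    def push(self, sample):
        """add one sample; return a full window once enough have arrived, else None."""
        self.buffer.append(sample)
        if len(self.buffer) == self.window_size:
            window, self.buffer = self.buffer, []
            return window
        return None


def samples_until_first_window(window_size):
    """count how many samples must arrive before the first transform can run."""
    collector = BlockCollector(window_size)
    count = 0
    while True:
        count += 1
        if collector.push(0.0) is not None:
            return count


waited = samples_until_first_window(4096)
print(waited, "samples =", round(waited / 44100 * 1000, 1), "ms at 44.1 kHz")
```

nothing can leave the collector until the 4096th sample has gone in, no matter how fast the CPU is.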
the window size determines frequency resolution. a 4096-sample window at 44.1 kHz gives you bins roughly 10.8 Hz wide. that is fine enough to distinguish a resonance at 3 kHz from one at 3.05 kHz. shrink the window to 1024 samples and latency drops to 23.2ms, but the bins widen to 43.1 Hz. the plugin can still see broad spectral shapes but loses the precision to separate nearby resonances. it is not a slider you can move for free.
(think of it like a camera exposure. longer exposure captures more light and more detail. shorter exposure freezes motion but the image is noisier. you pick the exposure for the shot.)
the latency math
latency in milliseconds = window size in samples / sample rate * 1000. at 44.1 kHz: 4096 / 44100 * 1000 = 92.9ms. at 48 kHz: 4096 / 48000 * 1000 = 85.3ms. at 96 kHz: 4096 / 96000 * 1000 = 42.7ms. higher sample rates reduce the wall-clock latency of the same window size, but the bins widen by the same factor, because bin width in Hz = sample rate / window size. at 96 kHz a 4096-sample window gives bins about 23.4 Hz wide. KERN’s standard mode uses a 4096-sample window. LIVE mode uses 1024 samples.
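the arithmetic above is easy to check yourself. this is a small sketch with two helper functions whose names are my own. it only covers the time a window takes to fill and the bin spacing (sample rate divided by window size), and ignores anything else a real plugin or interface adds.

```python
def window_latency_ms(window_size, sample_rate):
    """time for one analysis window to fill, in milliseconds."""
    return window_size / sample_rate * 1000.0


def bin_width_hz(window_size, sample_rate):
    """spacing between adjacent FFT bins, in Hz."""
    return sample_rate / window_size


for sample_rate in (44100, 48000, 96000):
    for window_size in (4096, 1024):
        print(
            f"{sample_rate} Hz, {window_size} samples: "
            f"{window_latency_ms(window_size, sample_rate):.1f} ms, "
            f"{bin_width_hz(window_size, sample_rate):.1f} Hz per bin"
        )
```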
how much latency can you actually hear
the question is not whether 93ms is perceptible. it is. the question is where the threshold sits, and the answer depends on who is listening and what they are doing.
Lester and Boley measured monitoring latency with trained musicians in a studio tracking scenario.[^1] their finding: 10ms of steady latency was not rated significantly different from 0ms by most performers. the session felt normal. timing stayed tight. the threshold for noticeable degradation sat somewhere between 10ms and 20ms for the majority of subjects.
McPherson, Jack, and Moro tested digital instrument latency across multiple performer types.[^2] their key insight was that jitter matters more than magnitude. 10ms of steady latency was tolerable. 10ms plus or minus 3ms of jitter was not. the inconsistency broke the performer’s sense of connection to the instrument more than the delay itself did. this matters for plugin chains where buffer sizes fluctuate under CPU load.
Jack, Mehrabi, and McPherson extended the work to perceived instrument quality.[^3] percussionists detected latency at 3-5ms. keyboard players tolerated more. the instrument type and the performer’s training shaped the threshold. but across all subjects, latency below 10-12ms was rated as acceptable for performance.
where does 93ms land? clearly above the threshold. every performer in every study would notice it. 12ms, however, sits inside the range that most musicians rate as usable. not zero. not invisible. but close enough that the performer can track without fighting their own timing.
plugin delay compensation
every modern DAW includes plugin delay compensation. Ableton Live, Logic Pro, Pro Tools, Reaper, Cubase, Studio One. the implementation details differ but the principle is the same: the DAW reads the reported latency of every plugin on every track, finds the highest latency in the session, and delays all other tracks by the difference. the result is sample-accurate alignment on playback.
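the principle fits in a few lines. this is a sketch of the idea only, not any DAW's actual implementation: `pdc_delays` is a hypothetical function, the track names and latency figures are made up, and real hosts also handle buses, sends, and sidechains.

```python
def pdc_delays(track_latencies):
    """map each track to the extra delay (in samples) needed to line up with the slowest track.

    track_latencies: dict of track name -> total reported plugin latency in samples.
    """
    if not track_latencies:
        return {}
    longest = max(track_latencies.values())
    return {name: longest - latency for name, latency in track_latencies.items()}


# hypothetical session: a spectral plugin on the vocal, a small lookahead on the bass
session = {"vocal": 4096, "drums": 0, "bass": 64}
delays = pdc_delays(session)

# after compensation every track carries the same total delay
totals = {name: session[name] + delays[name] for name in session}
print(delays)
print(totals)
```

the vocal gets no extra delay because it is already the slowest path. everything else waits for it.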
this works perfectly for mixing. you can stack spectral plugins across your entire session and the mix stays phase-coherent. the DAW adds the delay automatically. you do not need to think about it.
the problem is monitoring. when the singer is performing, the signal travels from the microphone through the interface, into the DAW, through every plugin on the channel strip, and out to the headphones. PDC aligns the recorded audio after the fact. it does not reduce the round-trip monitoring delay the performer hears in real time.
heads up
PDC does not fix monitoring latency. the singer still hears the delay. this is why “just let the DAW compensate” is not an answer for tracking. it is an answer for mixing only.
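to see why compensation cannot help here, add up the monitoring path. a rough sketch under loud assumptions: one buffer of delay on the way in, one on the way out, and converter and driver overhead ignored. the function name and the 128-sample buffer are illustrative, not measurements of any interface.

```python
def monitoring_latency_ms(buffer_size, sample_rate, plugin_latencies):
    """estimate what the performer hears: input buffer + plugins in series + output buffer.

    assumes exactly one buffer of delay each way; real interfaces add converter
    and driver latency on top of this.
    """
    total_samples = 2 * buffer_size + sum(plugin_latencies)
    return total_samples / sample_rate * 1000.0


# hypothetical setup: 128-sample buffer at 44.1 kHz
dry = monitoring_latency_ms(128, 44100, [])
with_spectral = monitoring_latency_ms(128, 44100, [4096])
print(f"no plugins: {dry:.1f} ms")
print(f"4096-sample spectral plugin: {with_spectral:.1f} ms")
```

the plugin latencies add to the round trip. there is no other track to delay, because the reference is the singer's own voice in the room.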
some engineers solve this with direct monitoring: routing the microphone signal straight from the interface to the headphones, bypassing the DAW entirely. this gives near-zero latency but means the performer hears the dry signal with no plugin processing. if you wanted them to hear compression or de-essing while singing, direct monitoring does not help.
the latency landscape in 2026
not all spectral plugins carry the same delay. architecture choices, window sizes, and lookahead buffers vary. here is where the category sits today:
| plugin | approach | low-latency mode | standard latency |
|---|---|---|---|
| Soothe 3 | per-bin spectral | 0ms (zero-latency) | ~52ms |
| Pro-C 3 | broadband + lookahead | 0ms (no lookahead) | 0-20ms |
| DSEQ3 | spectral de-esser | ~8ms (eco) | 8-557ms |
| smart:comp 3 | spectral + AI | 0ms | 0ms |
| broadband comps | sample-by-sample | n/a | ~0ms |
| KERN standard | 4096 STFT | n/a | ~93ms |
| KERN LIVE | 1024 STFT | ~12ms | n/a |
KERN is not the lowest-latency option. Soothe 3 and smart:comp 3 offer true zero-latency spectral processing. Pro-C 3 runs at zero when lookahead is disabled. if latency is the single most important constraint in your workflow, those are strong choices.
KERN LIVE at 12ms is a trade-off. it is low enough for tracking. it is high enough to preserve most of the spectral resolution that makes the standard mode precise. it is not zero, and i will not frame it as zero. (honesty is cheaper than returns.)
LIVE mode: how it works
toggle LIVE in the plugin UI. the latency drops from ~93ms to ~12ms. the spectral display updates to reflect the new resolution. the DAW re-reads the reported latency and adjusts PDC. that is it.
what changes under the hood: the FFT window shrinks from 4096 samples to 1024. the frequency bins widen from ~11 Hz to ~43 Hz. the LIVE engine omits some of the processing stages that depend on fine spectral resolution, including spectral reassignment, the cepstral smoothing floor, and the transient flash detector. the gain smoother still runs. the ERB filterbank still runs. the core processing still happens per-band. but the input to each band is coarser.
the trade-off is real but subtle on most material. on a vocal with a narrow 3 kHz resonance, the standard mode can see and suppress a 20 Hz-wide peak. LIVE mode sees the same region as a broader bump and applies a wider cut. the result is slightly less surgical. on a drum bus or a full mix, the difference is smaller because the spectral content is already broad.
right-click the spectrum display and you get an option worth knowing: HQ analysis mode. this runs the full 4096-sample FFT for the visualizer only, while the audio path stays on the fast 1024-sample LIVE engine. you see the full-resolution spectrum while processing at 12ms. useful for spotting narrow problems you want to address after toggling back to standard mode for the mix.
CPU gating ensures only one engine runs at a time. when you toggle between LIVE and standard, a 50ms crossfade blends the two paths so there is no click or dropout. the inactive engine goes dormant.
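a crossfade like this can be sketched in a few lines. this is an illustration, not KERN's code: the linear fade curve is an assumption (the actual curve shape is not stated here), the `crossfade` function is hypothetical, and it assumes the two paths are already time-aligned.

```python
def crossfade(old_path, new_path, sample_rate, fade_ms=50.0):
    """blend from old_path to new_path with a linear ramp lasting fade_ms."""
    fade_len = int(round(sample_rate * fade_ms / 1000.0))
    length = min(len(old_path), len(new_path))
    out = []
    for i in range(length):
        # t ramps 0 -> 1 over the fade, then holds at 1 (new path only)
        t = min(i / fade_len, 1.0) if fade_len > 0 else 1.0
        out.append((1.0 - t) * old_path[i] + t * new_path[i])
    return out


# toy signals: old engine outputs 1.0, new engine outputs 0.0, so the result traces the fade curve
sample_rate = 44100
old_engine = [1.0] * 4410
new_engine = [0.0] * 4410
blended = crossfade(old_engine, new_engine, sample_rate)
print(blended[0], blended[2205], blended[-1])
```

at 44.1 kHz a 50ms fade is 2205 samples. the output never jumps between engines, which is what avoids the click.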
LIVE mode is available in KERN SMOOTH v1.4.0, KERN WARM v1.5.0, and KERN PUSH v1.1.0. free update for existing license holders.
frequently asked questions
how much latency can musicians actually hear?
research shows 10ms steady latency is not significantly different from 0ms for most musicians. percussionists are more sensitive at 3-5ms. 93ms is clearly perceptible but your DAW compensates for it during mixing. the problem is only during tracking, when the performer hears the delay in their headphones.
why do spectral plugins have more latency than EQs or compressors?
spectral plugins use an FFT window to analyze frequency content. the window must fill with audio samples before processing starts. a 4096-sample window at 44.1 kHz takes ~93ms to fill. smaller windows reduce latency but also reduce frequency resolution. standard EQs and compressors work sample-by-sample with no windowing.
what is plugin delay compensation and when does it fail?
your DAW delays all other tracks to match the highest-latency plugin, keeping everything in sync for playback and recording. it works perfectly for mixing. it fails for the performer during tracking because they still hear the delayed signal in their headphones while singing or playing.
does KERN have a low-latency mode?
yes. SMOOTH v1.4.0, WARM v1.5.0, and PUSH v1.1.0 all have an opt-in LIVE mode that drops latency from ~93ms to ~12ms. toggle LIVE in the plugin UI for tracking. toggle it off for mixing to get full spectral resolution.
should i mix with low-latency mode on or off?
off. standard mode uses the full 4096-sample FFT for maximum spectral resolution. LIVE mode trades some resolution for speed. the difference is subtle on most material but matters for critical mixing decisions. use LIVE for tracking, standard for mixing.
references
a note from the developer
a live mixing engineer named Keith wrote to ask if KERN would ever get a low-latency mode. he runs front-of-house for mid-size venues and wanted spectral compression on the vocal bus during the show, not just in the studio after the fact. at 93ms, that was not happening. his monitors would have been nearly a tenth of a second behind the singer’s mouth.
the same week, oeksound shipped Soothe 3 with zero-latency mode. i went back to the DSP literature.[^4] the question was whether a dual-window architecture could give KERN a usable tracking path without gutting the spectral resolution that makes the standard mode precise. the answer was yes, with trade-offs. 1024 samples instead of 4096. coarser bins. some processing stages omitted. but 12ms of plugin latency instead of 93.
12ms is not zero. i will not pretend it competes on that axis. Soothe 3 at 0ms is the better latency number. but 12ms is fast enough for Keith to run at a show without fighting the monitoring delay. that was the bar. the research said 10-12ms is where most musicians stop noticing.[^3] so that is where LIVE mode sits.
if you track through spectral plugins and have a take on where the threshold actually sits for your instrument, send it. jonas@kernaudio.io. the best guides come from engineers who have tested the limits i have only read about.
try it yourself
KERN PUSH: three compression characters across 40 spectral bands. $29, no iLok, no subscription.