Building MixLab Analyzer — BS.1770-4 LUFS in the browser
A working engineer’s walkthrough of how MixLab Analyzer works in the browser — K-weighting biquads, FFT, true peak, stereo correlation, and the design choices behind keeping it all client-side.
MixLab Analyzer started as a demo to prove a point: most “AI mastering” tools are a black box, and you don’t need a black box to give creators a sober read on their mix. You need a meter, a spectrum, a stereo correlation, and the discipline to write the result in plain language.
This is the working engineer’s walkthrough of how the current version is built. The goal isn’t exhaustive DSP theory — there are textbooks for that — but to show the design choices and the trade-offs we made to keep the whole thing client-side, fast, and honest.
The pipeline, at a glance
user file
↓ decodeAudioData (WebAudio)
AudioBuffer (Float32 PCM, sample rate, channels)
↓
├─ peak / RMS / crest → levels
├─ K-weighting (BS.1770 pre + RLB biquads) → LUFS / LRA
├─ true peak (4× cubic interpolation) → dBTP
├─ Mid/Side decomposition → width
├─ Pearson correlation → mono safety
├─ FFT (radix-2, Hann-windowed) → spectrum
├─ band integrals → tonal balance
└─ heuristics on band ratios → harshness / muddiness
↓
AnalysisResult
↓
plain-language feedback
There’s no model in this graph. There’s no API call. There’s a few hundred lines of TypeScript reading the Float32 arrays your browser already decoded.
Step 1 — Decoding without uploads
The single biggest UX win for an audio analyser is “your file never leaves the browser”. WebAudio gives you this for free:
const ctx = new AudioContext();
const arrayBuffer = await file.arrayBuffer();
const buffer = await ctx.decodeAudioData(arrayBuffer.slice(0));
decodeAudioData handles WAV, MP3, M4A, OGG, FLAC, and Opus in most modern browsers, though the exact codec list still varies by browser and platform. Once it returns, you have an AudioBuffer with numberOfChannels, sampleRate, and a Float32Array per channel. From here, everything is JavaScript arithmetic.
The .slice(0) matters: decodeAudioData detaches the ArrayBuffer it receives, so any later read of the original fails. Slicing hands the decoder a copy and keeps yours intact.
Step 2 — BS.1770-4 K-weighting
The thing most “loudness meters” get wrong is they report RMS in dBFS. That’s peak/RMS metering, not loudness. Loudness needs perceptual weighting.
ITU-R BS.1770-4 (which EBU R128 builds on) specifies a two-stage filter cascade:
- A pre-filter — a high-shelf at ~1681 Hz that approximates the head-related transfer function.
- An RLB filter — a high-pass at ~38 Hz that approximates the ear’s low-frequency rolloff.
Both are simple biquads. For 48 kHz, the spec publishes the coefficients directly. For other sample rates, you re-derive using an RBJ-style high-shelf and high-pass with the same prototype frequencies and Q values:
function preFilterCoeffs(sr: number): Biquad {
if (sr === 48000) {
return { b0: 1.53512485958697, b1: -2.69169618940638, b2: 1.19839281085285,
a1: -1.69065929318241, a2: 0.73248077421585 };
}
// RBJ high-shelf derivation as substitute
const f0 = 1681.974450955533;
const G = 3.999843853973347; // dB
const Q = 0.7071752369554196;
// ... compute b0..b2, a1..a2
}
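One derivation that circulates in open-source BS.1770 meters builds both stages with a bilinear transform from the prototype values above. The sketch below follows that form, so treat it as a stand-in for the elided RBJ-style routine rather than a copy of it. The function names, the exact RLB constants, and the 0.4996667741545416 shelf exponent are assumptions that do not appear in the article.

```typescript
// Biquad in normalised form (a0 = 1), using the same field names as the article.
interface Biquad { b0: number; b1: number; b2: number; a1: number; a2: number; }

// Stage 1: high-shelf pre-filter at an arbitrary sample rate.
// Assumption: the Vb exponent comes from open-source meters, not from the article.
function preFilterCoeffsAnyRate(sr: number): Biquad {
  const f0 = 1681.974450955533;
  const G = 3.999843853973347; // shelf gain in dB
  const Q = 0.7071752369554196;
  const K = Math.tan(Math.PI * f0 / sr);
  const Vh = Math.pow(10, G / 20);
  const Vb = Math.pow(Vh, 0.4996667741545416);
  const a0 = 1 + K / Q + K * K;
  return {
    b0: (Vh + Vb * K / Q + K * K) / a0,
    b1: 2 * (K * K - Vh) / a0,
    b2: (Vh - Vb * K / Q + K * K) / a0,
    a1: 2 * (K * K - 1) / a0,
    a2: (1 - K / Q + K * K) / a0,
  };
}

// Stage 2: RLB high-pass at an arbitrary sample rate.
// Assumption: f0 and Q are the values commonly paired with the constants above.
function rlbFilterCoeffsAnyRate(sr: number): Biquad {
  const f0 = 38.13547087602444;
  const Q = 0.5003270373238773;
  const K = Math.tan(Math.PI * f0 / sr);
  const a0 = 1 + K / Q + K * K;
  // The numerator stays at 1, -2, 1, matching the published 48 kHz table.
  return { b0: 1, b1: -2, b2: 1, a1: 2 * (K * K - 1) / a0, a2: (1 - K / Q + K * K) / a0 };
}
```

A quick sanity check for any derivation like this is to evaluate it at 48 kHz and compare against the published coefficients; if they disagree beyond rounding, the derivation is wrong for every other rate too.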
Applying a biquad is a 5-multiply per sample direct-form-I:
function applyBiquad(data: Float32Array, c: Biquad): Float32Array {
const out = new Float32Array(data.length);
let x1 = 0, x2 = 0, y1 = 0, y2 = 0;
for (let i = 0; i < data.length; i++) {
const x0 = data[i];
const y0 = c.b0*x0 + c.b1*x1 + c.b2*x2 - c.a1*y1 - c.a2*y2;
out[i] = y0;
x2 = x1; x1 = x0; y2 = y1; y1 = y0;
}
return out;
}
That’s the entire K-weighting filter: the pre-filter followed by the RLB filter, both run through applyBiquad. Process left and right separately (or just left for mono).
Step 3 — Gating and integration
LUFS isn’t just mean energy of the K-weighted signal. The spec demands gating:
- Block the K-weighted signal into 400 ms windows with 75% overlap (100 ms hop).
- Compute mean square per block.
- Compute block loudness: L_k = -0.691 + 10·log10(mean_square).
- Drop any block with L_k < -70 LUFS (absolute gate).
- Compute the mean of the remaining blocks’ mean-squares and derive the relative threshold: L_rel = -0.691 + 10·log10(mean) - 10.
- Drop any block with L_k < L_rel (relative gate).
- Compute the mean of the surviving blocks’ mean-squares and convert it with the same formula, -0.691 + 10·log10(mean). That’s the integrated loudness.
The -0.691 offset isn’t arbitrary: it cancels the K-weighting filter’s gain at 1 kHz (roughly +0.69 dB), so a full-scale 997 Hz sine in one channel reads -3.01 LUFS. For stereo, you sum the L and R mean-squares before taking the log.
The whole gating procedure fits in about 30 lines of TypeScript, and that is the entire LUFS metering codebase. No library, no model.
// per block
const ms = (sumL + sumR) / blockSize;
blockMeanSquares.push(ms);
blockLoudness.push(-0.691 + 10 * Math.log10(ms + 1e-12));
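The two gates described above can be sketched as one function over the per-block mean squares. This is a minimal illustration, not MixLab's actual code: the name integratedLoudness is hypothetical, and returning -Infinity for an all-silent file is an assumption.

```typescript
// Integrated loudness from per-block mean squares (channels already summed).
// Assumption: returns -Infinity when every block falls below the absolute gate.
function integratedLoudness(blockMeanSquares: number[]): number {
  const loudnessOf = (ms: number): number => -0.691 + 10 * Math.log10(ms + 1e-12);
  const meanOf = (xs: number[]): number => xs.reduce((s, x) => s + x, 0) / xs.length;

  // Absolute gate: drop blocks quieter than -70 LUFS.
  const absGated = blockMeanSquares.filter((ms) => loudnessOf(ms) >= -70);
  if (absGated.length === 0) return -Infinity;

  // Relative gate: 10 LU below the loudness of the absolute-gated mean.
  const relThreshold = loudnessOf(meanOf(absGated)) - 10;
  const relGated = absGated.filter((ms) => loudnessOf(ms) >= relThreshold);

  // The loudest block always sits at or above the mean, so relGated is never empty.
  return loudnessOf(meanOf(relGated));
}
```

Note that averaging happens on mean squares, not on the dB values. Averaging block loudness in dB would bias the result toward quiet passages.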
For Loudness Range (LRA, per EBU Tech 3342), you take the short-term loudness series (3-second sliding window), gate it at -70 LUFS absolute, gate again at 20 LU below the power-averaged loudness of the values that survived, sort the remaining values, and take the 95th percentile minus the 10th. That’s your LRA.
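As a rough sketch of that LRA procedure, assuming the short-term loudness values are already available in LUFS: the function name loudnessRange and the nearest-rank percentile rule are assumptions here, and the relative gate is taken 20 LU below the power-domain mean of the values that pass the absolute gate.

```typescript
// Loudness Range in LU from a series of short-term loudness values (LUFS).
// Assumption: percentiles use a nearest-rank index into the sorted values.
function loudnessRange(shortTermLufs: number[]): number {
  // Absolute gate at -70 LUFS.
  const absGated = shortTermLufs.filter((l) => l >= -70);
  if (absGated.length === 0) return 0;

  // Relative gate: average in the power domain, convert back, subtract 20 LU.
  const meanPower = absGated.reduce((s, l) => s + Math.pow(10, l / 10), 0) / absGated.length;
  const relThreshold = 10 * Math.log10(meanPower) - 20;

  const sorted = absGated.filter((l) => l >= relThreshold).sort((a, b) => a - b);
  const percentile = (p: number): number => sorted[Math.round(p * (sorted.length - 1))];

  return percentile(0.95) - percentile(0.10);
}
```

The explicit comparator in sort matters: without it, JavaScript sorts numbers as strings and the percentiles come out wrong.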
Step 4 — True peak via 4× upsampling
Sample-domain peak misses intersample peaks — the actual analog peak after reconstruction can exceed any individual sample. Streaming codecs amplify the problem.
The standard mitigation is to oversample 4× before measuring peaks. For MixLab’s preview-quality estimate, we use Catmull-Rom cubic interpolation:
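// y0..y3 are four consecutive samples; the curve runs from y1 (t = 0) to y2 (t = 1).
// The polynomial coefficients depend only on the samples, so compute them once.
const a0 = -0.5 * y0 + 1.5 * y1 - 1.5 * y2 + 0.5 * y3;
const a1 = y0 - 2.5 * y1 + 2 * y2 - 0.5 * y3;
const a2 = -0.5 * y0 + 0.5 * y2;
const a3 = y1;
for (let k = 0; k < 4; k++) {
  const t = k / 4;
  const v = ((a0 * t + a1) * t + a2) * t + a3; // Horner evaluation of the cubic
  if (Math.abs(v) > peak) peak = Math.abs(v);
}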
This is faster than a windowed sinc and gets within ~0.3 dB of a certified meter on typical material. For mastering-grade compliance, you’d swap in a polyphase FIR, but for “is this loud enough to alarm me?” — cubic is fine.
We limit true peak measurement to the first 10 seconds of the file to keep latency reasonable on long files.
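To show how the interpolation loop and the 10-second cap fit together, here is a hedged sketch of a complete estimator. The name estimateTruePeakDb, the maxSeconds parameter, and the choice to clamp indices at the buffer edges are assumptions, not details from MixLab's source.

```typescript
// Estimated true peak in dBTP using 4x Catmull-Rom interpolation.
// Assumption: out-of-range neighbours are clamped to the nearest valid sample.
function estimateTruePeakDb(data: Float32Array, sampleRate: number, maxSeconds = 10): number {
  const n = Math.min(data.length, Math.floor(sampleRate * maxSeconds));
  const at = (i: number): number => data[Math.min(Math.max(i, 0), n - 1)];
  let peak = 0;

  for (let i = 0; i < n; i++) {
    // Four neighbouring samples; the curve segment runs from y1 to y2.
    const y0 = at(i - 1), y1 = at(i), y2 = at(i + 1), y3 = at(i + 2);
    const a0 = -0.5 * y0 + 1.5 * y1 - 1.5 * y2 + 0.5 * y3;
    const a1 = y0 - 2.5 * y1 + 2 * y2 - 0.5 * y3;
    const a2 = -0.5 * y0 + 0.5 * y2;
    const a3 = y1;
    for (let k = 0; k < 4; k++) {
      const t = k / 4;
      const v = Math.abs(((a0 * t + a1) * t + a2) * t + a3);
      if (v > peak) peak = v;
    }
  }
  // The small offset keeps digital silence from producing -Infinity.
  return 20 * Math.log10(peak + 1e-12);
}
```

For the four samples 0, 1, 1, 0 the curve between the two middle samples bulges to 1.125 at the midpoint, which is the kind of intersample overshoot a sample-domain peak reading misses.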
Step 5 — Mid/Side and stereo width
This part is almost too simple:
for (let i = 0; i < n; i++) {
mid[i] = (left[i] + right[i]) * 0.5;
side[i] = (left[i] - right[i]) * 0.5;
}
rms(side) / rms(mid) gives you a width ratio. Scale and clamp into 0..1 for display. We also compute Pearson correlation between the raw left and right channels:
const correlation = (sumProduct - n * meanL * meanR) / Math.sqrt(denL * denR);
Correlation below -0.05 is the “your mix will lose elements in mono” signal. It’s what catches the phase-flipped channel and the over-widened image.
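Both stereo measurements can be gathered in a single pass over the samples. The sketch below is one way to do it, not necessarily MixLab's: the names StereoStats and stereoStats are hypothetical, and returning a correlation of 0 for silent or constant channels is an assumption made to avoid dividing by zero.

```typescript
interface StereoStats { widthRatio: number; correlation: number; }

// Side/mid RMS ratio and Pearson correlation between left and right.
// Assumption: correlation is reported as 0 when either channel has no variance.
function stereoStats(left: Float32Array, right: Float32Array): StereoStats {
  const n = Math.min(left.length, right.length);
  if (n === 0) return { widthRatio: 0, correlation: 0 };

  let sumL = 0, sumR = 0, sumLL = 0, sumRR = 0, sumLR = 0, sumMid2 = 0, sumSide2 = 0;
  for (let i = 0; i < n; i++) {
    const l = left[i], r = right[i];
    const mid = (l + r) * 0.5, side = (l - r) * 0.5;
    sumL += l; sumR += r;
    sumLL += l * l; sumRR += r * r; sumLR += l * r;
    sumMid2 += mid * mid; sumSide2 += side * side;
  }

  const meanL = sumL / n, meanR = sumR / n;
  const denL = sumLL - n * meanL * meanL;
  const denR = sumRR - n * meanR * meanR;
  const den = Math.sqrt(denL * denR);
  const correlation = den > 0 ? (sumLR - n * meanL * meanR) / den : 0;

  // rms(side) / rms(mid); the 1/n factors cancel inside the square root.
  const widthRatio = sumMid2 > 0 ? Math.sqrt(sumSide2 / sumMid2) : (sumSide2 > 0 ? Infinity : 0);
  return { widthRatio, correlation };
}
```

A phase-flipped channel shows up as a correlation of -1 and an unbounded width ratio, because the mid signal cancels completely. That is why the display value needs the scale-and-clamp step.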
Step 6 — Spectrum analysis
For tonal balance, we need a magnitude spectrum. Standard recipe:
- Window 4096 samples with a Hann window.
- Run a radix-2 Cooley-Tukey FFT.
- Take |X[k]| for the lower half.
- Advance by half a frame (50% overlap) and repeat.
- Average across frames.
The FFT itself is ~30 lines of vanilla JS. We don’t bother with a library:
function fft(real: Float32Array, imag: Float32Array): void {
const n = real.length;
// bit reversal permutation
// butterfly stages
}
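The two stubbed stages can be filled in several ways. What follows is a common in-place iterative layout, offered as a sketch under assumptions rather than as MixLab's code: the name fftInPlace and the power-of-two check are additions, and the twiddle factors are computed on the fly for clarity instead of being cached.

```typescript
// In-place radix-2 Cooley-Tukey FFT. Both arrays must share a power-of-two length.
function fftInPlace(real: Float32Array, imag: Float32Array): void {
  const n = real.length;
  if (n !== imag.length || n < 1 || (n & (n - 1)) !== 0) {
    throw new Error("fftInPlace: length must be a power of two");
  }

  // Bit-reversal permutation: move each sample to its bit-reversed index.
  for (let i = 1, j = 0; i < n; i++) {
    let bit = n >> 1;
    for (; j & bit; bit >>= 1) j ^= bit;
    j ^= bit;
    if (i < j) {
      const tr = real[i]; real[i] = real[j]; real[j] = tr;
      const ti = imag[i]; imag[i] = imag[j]; imag[j] = ti;
    }
  }

  // Butterfly stages: merge sub-transforms of length 2, 4, 8, ... up to n.
  for (let len = 2; len <= n; len <<= 1) {
    const half = len >> 1;
    const step = -2 * Math.PI / len;
    for (let k = 0; k < half; k++) {
      const wr = Math.cos(step * k), wi = Math.sin(step * k);
      for (let start = 0; start < n; start += len) {
        const a = start + k, b = a + half;
        const tr = real[b] * wr - imag[b] * wi;
        const ti = real[b] * wi + imag[b] * wr;
        real[b] = real[a] - tr; imag[b] = imag[a] - ti;
        real[a] += tr; imag[a] += ti;
      }
    }
  }
}
```

For a real input frame, fill real with the Hann-windowed samples, leave imag at zero, and read the magnitude as Math.hypot(real[k], imag[k]) for k below n / 2.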
For a 4096-point FFT on 10 seconds of 48 kHz audio, you process ~234 frames. On a modern machine this is well under 50 ms.
From the averaged magnitude spectrum we derive:
- Spectral centroid: the weighted average frequency by magnitude. Maps to perceived brightness.
- Spectral rolloff (85%): the frequency below which 85% of total energy sits. A second take on brightness.
- Spectral flatness: geometric mean over arithmetic mean. Close to 1 = noise-like, close to 0 = tonal.
- Band energies: integrated magnitude in 20–60 Hz, 60–200 Hz, 200–500 Hz, 500 Hz–2 kHz, 2–4 kHz, 4–8 kHz, and 8–16 kHz.
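The first three descriptors in the list above can be sketched from a single averaged magnitude spectrum. The names SpectralFeatures and spectralFeatures are hypothetical, and two choices are assumptions because the article does not pin them down: rolloff and flatness are computed on power (magnitude squared), while the centroid is weighted by magnitude.

```typescript
interface SpectralFeatures { centroidHz: number; rolloffHz: number; flatness: number; }

// mag holds |X[k]| for the lower half of an fftSize-point transform.
function spectralFeatures(mag: Float32Array, sampleRate: number, fftSize: number): SpectralFeatures {
  const binHz = sampleRate / fftSize;
  let sumMag = 0, sumFreqMag = 0, sumPower = 0, sumLogPower = 0;
  for (let k = 0; k < mag.length; k++) {
    const p = mag[k] * mag[k];
    sumMag += mag[k];
    sumFreqMag += k * binHz * mag[k];
    sumPower += p;
    sumLogPower += Math.log(p + 1e-20); // offset avoids log(0) on empty bins
  }
  if (mag.length === 0 || sumPower <= 0) return { centroidHz: 0, rolloffHz: 0, flatness: 0 };

  // Centroid: magnitude-weighted mean frequency.
  const centroidHz = sumFreqMag / sumMag;

  // Rolloff: first bin where cumulative power reaches 85% of the total.
  let acc = 0, rolloffBin = mag.length - 1;
  for (let k = 0; k < mag.length; k++) {
    acc += mag[k] * mag[k];
    if (acc >= 0.85 * sumPower) { rolloffBin = k; break; }
  }

  // Flatness: geometric mean over arithmetic mean of the power spectrum.
  const flatness = Math.exp(sumLogPower / mag.length) / (sumPower / mag.length);
  return { centroidHz, rolloffHz: rolloffBin * binHz, flatness };
}
```

Computing the geometric mean through a sum of logarithms avoids the underflow you would hit by multiplying thousands of small magnitudes together.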
From the band energies, the harshness score is presence_band / mean(neighbour_bands). Anything significantly above 1 means the 2–4 kHz band is hotter than its neighbours — the bite that fatigues listeners.
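A minimal sketch of that ratio follows, with two assumptions stated up front. First, the neighbour bands are taken to be 500 Hz–2 kHz and 4–8 kHz. Second, each band is averaged per bin rather than summed, so that a flat spectrum scores exactly 1 even though the bands differ in width; the article's "integrated magnitude" may not be normalised this way. The names bandMean and harshnessScore are hypothetical.

```typescript
// Mean magnitude of the bins whose centre frequency lies in [loHz, hiHz).
function bandMean(mag: Float32Array, binHz: number, loHz: number, hiHz: number): number {
  const lo = Math.max(0, Math.ceil(loHz / binHz));
  const hi = Math.min(mag.length, Math.ceil(hiHz / binHz));
  if (hi <= lo) return 0;
  let sum = 0;
  for (let k = lo; k < hi; k++) sum += mag[k];
  return sum / (hi - lo);
}

// Presence band level relative to the average of its two neighbours.
// Assumption: returns 0 when the neighbour bands are empty or silent.
function harshnessScore(mag: Float32Array, binHz: number): number {
  const presence = bandMean(mag, binHz, 2000, 4000);
  const below = bandMean(mag, binHz, 500, 2000);
  const above = bandMean(mag, binHz, 4000, 8000);
  const neighbours = (below + above) / 2;
  return neighbours > 0 ? presence / neighbours : 0;
}
```

With a 4096-point FFT at 48 kHz, binHz is 48000 / 4096, about 11.7 Hz, so the presence band covers roughly 170 bins.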
Step 7 — The plain-language read
This is where most analysis tools stop. You get numbers and you’re supposed to know what to do with them.
MixLab’s feedback layer is dumb in the best way — a switch on each metric range with a hand-written paragraph for each band:
if (r.integratedLufs > -10) {
cards.push({
level: 'warn',
title: 'Loud — likely over-limited',
body: `${r.integratedLufs.toFixed(1)} LUFS sits well above streaming targets.`,
});
}
The threshold on the right side of > is in the same units as the reading on screen. The body is the kind of feedback a working engineer would type into a Slack DM. There’s no model, no template engine, no AI: just a few dozen carefully written branches.
What we deliberately didn’t do
- No “enhance” button. The analyser names problems. It doesn’t fix them. That’s a different product.
- No model in the loop. Everything is signal-level. The plain-language layer is dictionary lookups, not generation.
- No upload. The whole reason this works for creator privacy is that no audio ever crosses the network.
- No “score”. A single number across all metrics would be wrong every time. Per-metric reads are honest.
What’s next
The current pipeline runs on the main thread. For long files (10+ minutes), that visibly hitches the UI during the LUFS pass. The next iteration moves the heavy work into an AudioWorklet so the browser stays responsive throughout. Same algorithms, different runtime.
For mastering-grade compliance (broadcast submissions, festival deliverables), we’ll add a certified-meter mode using a polyphase upsampler and the verified coefficients at all standard sample rates. For now, MixLab is the right tool for “is this mix in the right shape?” — and the wrong tool for “is this technically R128-compliant for BBC Radio submission?”