Project79068 Nayuki Minase


Subtractive Cancellation of Audio

Adding sounds is easy. It happens all the time, and it requires no special equipment: it can be done live in the air.

But subtracting is another story. It requires amplitudes (intensity) and phases (timing) to be closely matched; any small deviation drastically reduces the effectiveness of the subtraction.

Theory

Here’s one way of modelling the problem: Take two waveforms, x and y. Let them be vectors of the same dimension, or sequences of the same length. Let’s assume that they’re time-aligned.

What we want to do is get z = ax − y (where a is an amplification variable) such that z has the least magnitude (equivalent to having the least RMS signal level, or average power). It is possible to find a in linear time (proportional to the vectors’ size) using a bit of calculus or linear algebra (your choice =) ).
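To spell out the calculus route: the squared magnitude of z is the sum of (a·x[i] − y[i])² over all samples, and setting its derivative with respect to a to zero gives a = (x·y) / (x·x), a ratio of two dot products. Below is a minimal Python sketch of that formula, assuming the waveforms are plain lists of floats; the names best_gain and subtract are made up for illustration and are not from the original program.

```python
def best_gain(x, y):
    """Return the gain a that minimizes sum((a*x[i] - y[i])**2)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    xx = sum(xi * xi for xi in x)              # dot product x.x
    xy = sum(xi * yi for xi, yi in zip(x, y))  # dot product x.y
    if xx == 0.0:
        return 0.0  # x is silent, so every gain gives the same residual
    return xy / xx


def subtract(x, y):
    """Return (a, z) where z = a*x - y has the least possible magnitude."""
    a = best_gain(x, y)
    z = [a * xi - yi for xi, yi in zip(x, y)]
    return a, z
```

Both dot products take a single pass over the data, which is where the linear running time comes from.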

Considerations

Japanese songs are typically distributed with normal and karaoke versions. These are ideal subjects for subtraction.

If you need to find the time alignment, first do a coarse alignment by hand (preferably using the spectrogram view). Then just try a number of possible nearby alignments by brute force and pick the best one.
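The brute-force search could look roughly like the Python sketch below. It is only one possible approach, under some assumptions: waveforms are plain lists of floats, an offset d means that x[i] lines up with y[i + d], and each candidate is scored by the fraction of energy left over after subtracting with the best gain (which works out to 1 minus the squared normalized correlation). The names mismatch and best_offset, and the parameters coarse and radius, are hypothetical.

```python
def mismatch(x, y, offset):
    """Fraction of y's energy left after subtracting the best-gain copy of x.

    x[i] is paired with y[i + offset] over the overlapping region.
    Returns 0.0 for a perfect match and 1.0 for no correlation at all.
    """
    start = max(0, -offset)
    stop = min(len(x), len(y) - offset)
    xx = xy = yy = 0.0
    for i in range(start, stop):
        xi, yi = x[i], y[i + offset]
        xx += xi * xi
        xy += xi * yi
        yy += yi * yi
    if xx == 0.0 or yy == 0.0:
        return 1.0  # silent or empty overlap: treat as no match
    return 1.0 - (xy * xy) / (xx * yy)


def best_offset(x, y, coarse, radius):
    """Try every offset within radius of the hand-picked coarse offset."""
    candidates = range(coarse - radius, coarse + radius + 1)
    return min(candidates, key=lambda d: mismatch(x, y, d))
```

The score is only meaningful when the overlap is long compared with the search radius; a very short overlap can look like a perfect match by accident.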

Sometimes the time alignment between two similar-sounding songs drifts. In this case, use a small block size and search some nearby alignments at each block (and print the offset for diagnostic purposes).
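As a rough illustration of the per-block search, here is a Python sketch that re-centers the search on the offset found for the previous block, so it can follow a slow drift. It assumes plain lists of floats and the convention that x[i] lines up with y[i + offset]; the names block_mismatch and track_offsets and their parameters are invented for this example.

```python
def block_mismatch(xb, y, pos):
    """1 minus the squared normalized correlation of xb and y[pos:pos+len(xb)]."""
    yb = y[pos:pos + len(xb)]
    xx = sum(v * v for v in xb)
    yy = sum(v * v for v in yb)
    xy = sum(a * b for a, b in zip(xb, yb))
    if xx == 0.0 or yy == 0.0:
        return 1.0  # a silent block carries no alignment information
    return 1.0 - (xy * xy) / (xx * yy)


def track_offsets(x, y, block_size, radius, start_offset=0):
    """Return the best offset for each full block of x."""
    offsets = []
    offset = start_offset
    for start in range(0, len(x) - block_size + 1, block_size):
        xb = x[start:start + block_size]
        # Keep only candidates whose block lies entirely inside y.
        candidates = [d for d in range(offset - radius, offset + radius + 1)
                      if start + d >= 0 and start + d + block_size <= len(y)]
        if candidates:
            offset = min(candidates,
                         key=lambda d: block_mismatch(xb, y, start + d))
        print("block at sample", start, "offset", offset)  # diagnostic output
        offsets.append(offset)
    return offsets
```

If the printed offsets jump around instead of changing gradually, the block size is probably too small for the material or the search radius is too large.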

The amplification is often not constant over a whole song, mostly due to dynamic limiting. Try to use small block sizes, such as 1 second’s worth of audio.
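A blockwise version of the subtraction might be sketched in Python as follows. It assumes time-aligned waveforms stored as plain lists of floats, and computes a separate least-squares gain a = (x·y) / (x·x) for each block; the name subtract_blockwise is hypothetical. For 1-second blocks of 44.1 kHz audio, block_size would be 44100.

```python
def subtract_blockwise(x, y, block_size):
    """Compute z = a*x - y with a separate best gain a for every block.

    Returns (gains, z). The inputs are truncated to the shorter length.
    """
    n = min(len(x), len(y))
    gains = []
    z = []
    for start in range(0, n, block_size):
        stop = min(start + block_size, n)
        xb = x[start:stop]
        yb = y[start:stop]
        xx = sum(v * v for v in xb)
        xy = sum(a * b for a, b in zip(xb, yb))
        gain = xy / xx if xx != 0.0 else 0.0  # silent block: gain is arbitrary
        gains.append(gain)
        z.extend(gain * a - b for a, b in zip(xb, yb))
    return gains, z
```

Because each block gets its own gain, the output may have small steps at block boundaries; interpolating the gains between blocks is one possible refinement, not something described in the original text.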

Samples

Links

Last modified: 2007-05-19-Sat
Created: 2007-04-23-Mon