Using math to blend musical notes seamlessly
In music, “portamento” is a term that’s been used for hundreds of years, referring to the effect of gliding a note at one pitch into a note of a lower or higher pitch. But only instruments that can continuously vary in pitch—such as the human voice, string instruments, and trombones—can pull off the effect.
Now an MIT student has invented an algorithm that produces a portamento effect between any two audio signals in real time. In experiments, the algorithm seamlessly merged various audio clips, such as a piano note gliding into a human voice, and one song blending into another. His paper describing the algorithm won the “best student paper” award at the recent International Conference on Digital Audio Effects.
The algorithm relies on “optimal transport,” a geometry-based framework that determines the most efficient ways to move objects—or data points—between multiple origin and destination configurations. Formulated in the 1700s, the framework has been applied to supply chains, fluid dynamics, image alignment, 3-D modeling, computer graphics, and more.
In work that originated in a class project, Trevor Henderson, now a graduate student in computer science, applied optimal transport to interpolating audio signals—or blending one signal into another. The algorithm first breaks the audio signals into brief segments. Then, it finds the optimal way to move the pitches in each segment to pitches in the other signal, to produce the smooth glide of the portamento effect. The algorithm also includes specialized techniques to maintain the fidelity of the audio signal as it transitions.
“Optimal transport is used here to determine how to map pitches in one sound to the pitches in the other,” says Henderson, a classically trained organist who performs electronic music and has been a DJ on WMBR 88.1, MIT’s radio station. “If it’s transforming one chord into a chord with a different harmony, or with more notes, for instance, the notes will split from the first chord and find a position to seamlessly glide to in the other chord.”
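To make the idea concrete, here is a minimal sketch of one-dimensional optimal transport between two sets of pitches. It is an illustration under stated assumptions, not Henderson's implementation. Each sound is assumed to be a short list of (frequency, loudness share) pairs whose shares sum to 1, and the function names `transport_plan_1d` and `glide` are hypothetical. On a line, pairing mass in sorted order gives the cheapest plan for any convex cost of moving, which is why a simple two-pointer sweep is enough here.

```python
def transport_plan_1d(src, dst):
    """Match two pitch distributions in sorted order.

    src, dst: lists of (frequency_hz, mass), sorted by frequency,
    each with masses summing to the same total (assumed, not checked).
    Returns a list of (src_freq, dst_freq, mass_moved).
    """
    plan = []
    i = j = 0
    src_left = src[0][1]  # mass still to ship from the current source pitch
    dst_left = dst[0][1]  # mass still needed at the current destination pitch
    eps = 1e-12
    while i < len(src) and j < len(dst):
        moved = min(src_left, dst_left)
        if moved > eps:
            plan.append((src[i][0], dst[j][0], moved))
        src_left -= moved
        dst_left -= moved
        if src_left <= eps:  # source pitch exhausted, advance
            i += 1
            if i < len(src):
                src_left = src[i][1]
        if dst_left <= eps:  # destination pitch filled, advance
            j += 1
            if j < len(dst):
                dst_left = dst[j][1]
    return plan


def glide(plan, t):
    """Place each piece of mass a fraction t (0..1) of the way to its target.

    Linear interpolation in hertz is an assumption made for simplicity.
    """
    return [((1 - t) * x + t * y, m) for x, y, m in plan]


# A two-note chord turning into a four-note chord (made-up frequencies).
two_notes = [(220.0, 0.5), (330.0, 0.5)]
four_notes = [(200.0, 0.25), (250.0, 0.25), (300.0, 0.25), (400.0, 0.25)]

plan = transport_plan_1d(two_notes, four_notes)
for t in (0.0, 0.5, 1.0):
    print(t, glide(plan, t))
```

In this toy case each of the two starting notes splits into two halves, and each half slides toward its own note in the second chord, which mirrors the splitting behavior Henderson describes.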
According to Henderson, this is one of the first techniques to apply optimal transport to transforming audio signals. He has already used the algorithm to build equipment that seamlessly transitions between songs on his radio show. DJs could also use the equipment to transition between tracks during live performances. Other musicians might use it to blend instruments and voice on stage or in the studio.
Henderson’s co-author on the paper is Justin Solomon, an X-Consortium Career Development Assistant Professor in the Department of Electrical Engineering and Computer Science. Solomon—who also plays cello and piano—leads the Geometric Data Processing Group in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and is a member of the Center for Computational Engineering.
Henderson took Solomon’s class, 6.838 (Shape Analysis), which tasks students with applying geometric tools like optimal transport to real-world applications. Student projects usually focus on 3-D shapes from virtual reality or computer graphics. So Henderson’s project came as a surprise to Solomon. “Trevor saw an abstract connection between geometry and moving frequencies around in audio signals to create a portamento effect,” Solomon says. “He was in and out of my office all semester with DJ equipment. It wasn’t what I expected to see, but it was pretty entertaining.”