Adam Arsenault & Spencer Liberto
Autophony is the perception of one's own self-generated sounds, such as hearing one's own voice or blood circulation. For our project, we wanted to reduce autophony. Specifically, our work sought to remove the sound of one's own voice through the use of delays and wave inverters, mixing the inverted waveform of a person's voice with ambient sound, in the hope of cancelling out the voice while retaining all other sound.
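The idea rests on destructive interference: adding a signal to its own inverse yields silence, so subtracting a clean copy of the voice from the room signal should leave only the ambience. A toy sketch in Python, with synthetic tones standing in for real audio (all values here are illustrative, not from our actual setup):

```python
import math

RATE = 8000  # samples per second (illustrative)
N = 1000

# Synthetic stand-ins: a "voice" tone and an unrelated "ambient" tone.
voice = [math.sin(2 * math.pi * 220 * n / RATE) for n in range(N)]
ambient = [0.3 * math.sin(2 * math.pi * 95 * n / RATE) for n in range(N)]

# The room microphone hears both the voice and the ambience.
room_mic = [v + a for v, a in zip(voice, ambient)]

# Mix in the inverted voice: the voice cancels, the ambience remains.
mixed = [r - v for r, v in zip(room_mic, voice)]

# In this perfectly aligned case the residual is only floating-point noise.
residual = max(abs(m - a) for m, a in zip(mixed, ambient))
```

In practice the inverted signal and the room signal are never this perfectly aligned, which is where our troubles began.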
This idea came about during the research process, when I (Adam) was in the sound/machine group. I had been thinking about the ways we interact with ambient noise through technology, and I became interested in noise-cancelling headphones. I wanted to invert their purpose: instead of reducing ambient noise and amplifying one's own voice, we wanted to create a machine which downplays the user's voice. Too often technology is focused on making us feel important, and both Spencer and I thought it would be healthy for the ego if we subverted that trend.
We originally intended to use one contact microphone and one normal microphone. The contact mic was going to be attached to the speaker's throat to capture their voice and their voice only, while the other microphone was going to be placed at a distance from the user to capture their voice along with the sound of the room.
We would use audio-processing tools designed for the GNU/Linux operating system, including the JACK Audio Connection Kit and Audacity. We intended to pipe both of these inputs into JACK, a “virtual plugboard” application, which would let us route the inputs into Audacity, a simple audio editing program both Spencer and I were familiar with.
Our first struggles arose when the contact mic didn’t pick up the speaker’s voice with enough clarity to be useful. On top of this, the USB ADC we were using introduced an unacceptable amount of lag. We agreed that we couldn’t use the contact microphone for our project.
In addition, JACK was originally very difficult to work with. JACK had to coexist with another audio daemon, PulseAudio, which was already running on the machine. PulseAudio is notoriously difficult to stop: it did not respond to shutdown commands from systemd, and was instead being restarted by gdm, the display manager the machine depended on.
Eventually, we found a way for JACK and PulseAudio to coexist, and the setup gave us delay times of fractions of a millisecond. However, it was at this point that we realized that Audacity didn't play well with JACK, and wasn't even designed for realtime transformation of audio; it was unsuitable for the task at hand. We decided to use Ardour, a much more powerful digital audio workstation application. Unfortunately, we had spent so much time getting JACK working and evaluating the feasibility of different microphones that we did not have time to learn Ardour.
We migrated to Apple’s OS X and Max/MSP. These new tools were much more powerful and better suited to the task at hand, but were neither libre nor gratis. We also decided to use a traditional microphone to capture the user's voice instead of a contact microphone. We found that even though the traditional microphone was not in physical contact with the user, it picked up a much clearer signal than the contact microphone.
Even with these improvements to our tools and methods, we weren't able to mix the signals with the appropriate amount of delay and modulation to make the waveforms line up in a way that would actually cancel out the speaker's voice. When mixed together, the inverted voice signal and the room signal sounded identical to the room signal alone.
With that experience in mind, we decided to change course: our project's new goal was to remove the voice of a different speaker using the same method. Our hope was that the interplay between the sound waves of a person's voice and their own head was what had made proper cancellation impossible, and that we could therefore remove someone else's voice instead.
This did not work.
Our first step was to record sounds from both mics simultaneously in Audacity, the program we had originally intended to use for live audio processing. Our intent was to compare the resulting waveforms from each mic: by looking at how long it took the beginning of a clap to reach one microphone versus the other, we could calculate the difference in time between the same sound arriving at each microphone.
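The measurement can be sketched as follows. This is a simplified illustration rather than our actual Audacity workflow: a hypothetical `onset_index` helper finds the first loud sample in each recording, and the difference between the two onsets gives the inter-mic delay in samples.

```python
def onset_index(samples, threshold=0.5):
    """Index of the first sample whose magnitude exceeds the threshold."""
    for i, s in enumerate(samples):
        if abs(s) > threshold:
            return i
    return None

# Synthetic recordings of the same clap, reaching the far mic 7 samples late.
clap = [0.0] * 20 + [0.9, -0.8, 0.6] + [0.0] * 20
near_mic = clap
far_mic = [0.0] * 7 + clap[:-7]

delay_samples = onset_index(far_mic) - onset_index(near_mic)
print(delay_samples)          # 7
print(delay_samples / 44100)  # the offset in seconds at a 44.1 kHz sample rate
```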
We then delayed the input of one microphone to account for this offset, and inverted its signal to create the destructive interference we had hoped for. Sadly, this did not work either. Again, our timing couldn't be made precise enough to create destructive interference, so we merely got a sound very similar to the ambient microphone's input.
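The delay-and-invert step, and why it is so fragile, can be illustrated with another synthetic sketch (an assumed 8 kHz rate and a pure 1 kHz tone standing in for the voice): cancelling with exactly the right delay removes the tone completely, while an error of even one sample leaves a large residual.

```python
import math

RATE = 8000
N = 400
voice = [math.sin(2 * math.pi * 1000 * n / RATE) for n in range(N)]

def delay_and_invert(samples, delay):
    """Shift the signal right by `delay` samples and flip its polarity."""
    return [0.0] * delay + [-s for s in samples[:len(samples) - delay]]

# The room microphone hears the voice arriving `true_delay` samples late.
true_delay = 5
room_mic = [0.0] * true_delay + voice[:N - true_delay]

def peak_residual(estimated_delay):
    cancel = delay_and_invert(voice, estimated_delay)
    return max(abs(r + c) for r, c in zip(room_mic, cancel))

print(peak_residual(5))  # exact delay: the tone cancels completely
print(peak_residual(4))  # one sample off: much of the tone survives
```

For real, broadband voice recorded in a reflective room, the alignment problem is far harder than for this single tone, which is consistent with the near-total lack of cancellation we observed.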
Images associated with our project: