An Echoic Chamber

My final project was an extension of my earlier experiments with creating augmented audio spaces. Originally, I had hoped to create a space with odd echoic properties, where certain frequencies were echoed but others were not. I decided this wouldn’t be compelling (or wouldn’t work, which in truth was my biggest concern). I kept the idea of interesting echoic properties, though, and, inspired by a recent viewing of The Hunger Games, decided to mimic the mockingjays of the first movie: birds which, genetically modified to be good at picking up tunes, would repeat a tune and spread it through the forest.

Conceptually, this is more of an interactive experience than a static piece. It inherently involves participation. Having this interactive element is something I had wanted from the beginning of the semester, when Spencer and I made our failed attempt to cancel out someone’s voice. In a roundabout way, my final project ended up amplifying a person’s voice instead. I thought it was a pleasant change.

I used Pure Data to play forest sounds and to measure the pitch and loudness of incoming audio. If something high-pitched and loud was heard, it was assumed to be a whistle, which would trigger a recording. The recording was then played back through a variety of filters at different speeds and pitches to mimic a forest of birds copying the sound.
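My patch itself is graphical, but the detection logic can be sketched in Python. Everything here is an illustrative stand-in for the Pd objects, not a translation of the patch: the thresholds, the sample rate, and the zero-crossing pitch estimate are all my own made-up choices.

```python
import math

SAMPLE_RATE = 44100

def rms(samples):
    """Root-mean-square loudness of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def zero_crossing_pitch(samples, sample_rate=SAMPLE_RATE):
    """Crude pitch estimate: a tone crosses zero twice per cycle."""
    crossings = sum(1 for a, b in zip(samples, samples[1:])
                    if (a >= 0) != (b >= 0))
    return crossings * sample_rate / (2 * len(samples))

def is_whistle(samples, min_pitch=1000.0, min_loudness=0.1):
    """Treat anything both high-pitched and loud as a whistle."""
    return (zero_crossing_pitch(samples) > min_pitch
            and rms(samples) > min_loudness)

def varispeed(samples, factor):
    """Naive resampling: playing back at `factor` speed shifts the
    pitch by the same factor, like the bird-chorus playback."""
    return [samples[int(i * factor)]
            for i in range(int(len(samples) / factor))]
```

Feeding a detected whistle through `varispeed` at several different factors, each on its own delay, gives the scattered-flock effect.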

WordPress won’t let me post my PD file, so here’s a picture of it.
Screenshot from 2014-12-19 15:23:01


I Am Sitting in Cyberspace

For my Objectified, Electrified Sound project, I took inspiration from Alvin Lucier’s “I Am Sitting in a Room,” in which he successively re-recorded the sound of his voice being played into a room. My group, Noise, dealt heavily with the sounds you often don’t think you want – for Lucier, a recording of his voice should have his voice in it, not the resonant frequencies of the room.
With the transition from analog to digital recording of music, I wanted to bring this piece into the 21st century by putting a modern twist on the same idea. Instead of recording my voice being played into a room and allowing the room’s acoustics to make my voice incomprehensible, I took a digital recording of my voice and repeatedly re-encoded it with lossy audio codecs. After enough iterations, each audio file no longer contained a recording of my voice, but instead the inherent artifacts of the compression algorithm.

This process would have been impossible to do by hand, so I automated it. I wrote simple bash scripts to convert the audio and save each file with a name corresponding to the current iteration of the loop.
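I don't have the original scripts here, but the loop is simple enough to sketch (re-created in Python rather than bash, with ffmpeg standing in for the converter; the codec, bitrate, and filenames are illustrative, not my originals):

```python
import shutil
import subprocess

def iteration_name(base, i, ext="mp3"):
    """Name each generation after the iteration that produced it."""
    return f"{base}_{i:04d}.{ext}"

def reencode(src, dst):
    """One pass through a lossy codec via ffmpeg."""
    subprocess.run(
        ["ffmpeg", "-y", "-loglevel", "error", "-i", src,
         "-codec:a", "libmp3lame", "-b:a", "128k", dst],
        check=True)

if __name__ == "__main__" and shutil.which("ffmpeg"):
    # voice_0000.mp3 is the original recording, padded with trailing
    # silence so the converter eats the padding, not the voice.
    src = "voice_0000.mp3"
    for i in range(1, 1001):
        dst = iteration_name("voice", i)
        reencode(src, dst)
        src = dst  # each generation becomes the next input
```

The key detail is the last line of the loop: each iteration's output is the next iteration's input, so the codec's artifacts compound.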

The main difficulty, which stopped me from making any progress for about a week, was a strange issue with the conversion software I was using. For whatever reason, a fraction of a second of audio was chopped off the end of the file each time it was converted, leaving the audio file empty by the 50th iteration. My solution was less than elegant – I tacked a minute of silence onto the end of the original recording and let the conversion software eat away at that.

Overall, I found this to be a very depressing experience. The loss in quality, though not consciously perceptible for the first few conversions, became painfully obvious after only a few more iterations. I don’t like that this degradation in our listening experience has become commonplace.

Because this was an iterative process, I have about a thousand audio files. I will be playing a selection of them tonight.

Sonic Cinematic: Lack Track

This project involved removing the non-diegetic audio from a non-diegetic-heavy clip of the popular sitcom “The Big Bang Theory.” By splicing out the unnatural audio, I hoped to create a decidedly less natural interaction than the one I started with. We are inundated with laugh tracks in popular television, and they are rarely representative of the real world. However, since a scene is created with the laugh track in mind, excising the offending audio does little to make it more genuine. This underlying disingenuousness is what I was striving to bring out.

I spent most of my time looking up ways to remove this audio. Most audio production experts agree that if the laugh track is mixed into both channels, there is little you can do. My longest-running attempt at isolating and removing the laugh track involved splitting the left and right tracks into their own mono tracks, inverting one, and recombining them. This left only the audio which wasn’t present in both the left and right tracks of the recording. Traditionally, laugh tracks are not mixed equally into both tracks, so it was possible to isolate the laugh track this way. My hope was that inverting the isolated laugh track and combining it with the original audio would remove the laughter, but it did not. However, I got a nice 30 seconds of canned laughter out of it, so I was happy. I then tried applying various band-pass filters to the audio to remove the laughter, but this didn’t work either.
I ended up taking the tedious and inexact route and removed the laughter by hand. This resulted in total silence during these portions of the audio.
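The channel-splitting trick I tried can be sketched in a few lines of Python (made-up sample lists stand in for real stereo audio exported from the DAW):

```python
def difference_signal(left, right):
    """Invert one channel and sum: anything mixed identically into
    both channels (the dialogue) cancels, leaving only what differs
    between them (here, the laugh track)."""
    return [l - r for l, r in zip(left, right)]

def subtract(mix, isolated):
    """The step that failed in practice: subtracting the isolated
    laughter from a mono downmix only works if the isolated copy
    matches the downmix in level and phase."""
    return [m - s for m, s in zip(mix, isolated)]
```

With synthetic signals the first step works perfectly, which is why I got my clean 30 seconds of canned laughter; on the real mix, the levels and phase never lined up well enough for the second step.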

Selectively removing audio in Audacity:

Screenshot from 2014-10-13 17:19:46

Research into removing laughs

Screenshot from 2014-10-13 09:43:07

Screenshot from 2014-10-13 09:42:23

An earlier attempt in Ardour

Screenshot from 2014-10-09 18:58:54

I came across a program for Windows, a DirectShow filter, which claimed to be able to do what I wanted, and if I had more time, I’d love to try it. I would also want to try opening the audio in SPEAR to remove the laughter selectively that way, and to add some static to the completely silent portions of the audio to make them sound more realistic.

HEAR (HEAR Enables Autophony Reduction)


Adam Arsenault & Spencer Liberto

Artist Statement:

Autophony is the hearing of one’s own self-generated sounds, such as one’s own voice or blood circulation. For our project, we wanted to reduce autophony. Specifically, our work sought to remove the sound of one’s own voice through the use of delays and wave inverters, mixing the inverted waveform of a person’s voice with ambient sound, hopefully cancelling out the voice while retaining all other sound.
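The core idea, destructive interference in the ideal case, is just sample-wise inversion and mixing. A toy Python sketch (not our actual patch):

```python
def invert(signal):
    """Flip the polarity of every sample."""
    return [-s for s in signal]

def mix(a, b):
    """Sum two signals sample by sample."""
    return [x + y for x, y in zip(a, b)]
```

In the ideal case, mixing the inverted voice into a voice-plus-room signal leaves only the room. In practice, any delay or level mismatch between the two copies of the voice breaks the cancellation, which is exactly what we spent the project fighting.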

This idea came about during the research process, when I (Adam) was in the sound/machine group. I had been thinking about the ways we interact with ambient noise through technology, and became interested in noise-cancelling headphones – specifically, in inverting their purpose. Instead of reducing ambient noise and amplifying one’s own voice, we wanted to create a machine which downplays the user’s voice. Too often technology is focused on making us feel important, and both Spencer and I thought it would be healthy for the ego to subvert that trend.

Iteration 1:

We originally intended to use one contact microphone and one normal microphone. The contact mic was going to be attached to the speaker’s throat to capture their voice and their voice only, while the other microphone was to be placed at a distance from the user to capture both their voice and the sound of the room.

We would use audio-processing tools designed for the GNU/Linux operating system, including the JACK Audio Connection Kit and Audacity. We intended to pipe both of these inputs into JACK, a “virtual plugboard” application, which would let us route them into Audacity, a simple audio editor both Spencer and I were familiar with.

Our first struggles arose when the contact mic didn’t pick up the speaker’s voice with enough clarity to be useful. On top of this, the USB ADC we were using introduced an unacceptable amount of lag. We agreed that we couldn’t use the contact microphone for our project.

In addition, JACK was originally very difficult to work with. JACK conflicted with another audio daemon, PulseAudio, already running on the machine we were using. PulseAudio is notoriously difficult to stop: it did not respond to shutdown commands from systemd, and was instead being restarted by gdm, a process needed for the display on the machine we were using.

Eventually, we found a way for JACK and PulseAudio to coexist, which gave us delay times of fractions of a millisecond. However, it was at this point that we realized Audacity didn’t play well with JACK, and wasn’t even designed for realtime transformation of audio; it was unsuitable for the task at hand. We decided to use Ardour, a much more powerful digital audio workstation. Unfortunately, we had spent so much time getting JACK working and figuring out the feasibility of different microphones that we did not have time to learn Ardour.

Iteration 2:

We migrated to Apple’s OS X and Max/MSP. These new tools were much more powerful and suited to the task at hand, but were neither libre nor gratis. We also decided to use a traditional microphone to capture the user’s voice instead of a contact microphone. We found that even though the traditional microphone was not in physical contact with the user, it picked up a much clearer signal than the contact microphone.

Even with these improvements to our tools and methods, we weren’t able to mix the signals with the appropriate amount of delay and modulation to have the waveforms match up in a way that would actually cancel out the speaker’s voice. When mixed together, the inverted voice signal and the room signal sounded identical to the room signal alone.

Iteration 3:

With that experience in mind, we decided to change course. Our project’s new goal was to remove the voice of a different speaker using the same method. Our hope was that the interplay between the sound waves of a person’s voice and their own head was making it impossible to properly cancel, and we posited that we could remove someone else’s voice instead.

This did not work.

Our first step was to record sounds from both mics simultaneously in Audacity, the program we had originally intended to use for live audio processing. Our intent was to look at the resulting waveforms from each mic and measure the difference in arrival time between them. By looking at how long the beginning of a clap took to hit one microphone versus the other, we could calculate the delay between the same sound reaching each microphone.

We then delayed the input of one microphone to account for this offset, and inverted its signal to create the destructive interference we had hoped for. Sadly, this did not work either. Again, our timing couldn’t be made precise enough to create destructive interference, so we merely got a sound very similar to the ambient microphone’s input.
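The clap-based alignment can be sketched in Python (toy numbers; a real recording would be lists of samples exported from Audacity, and the threshold is an illustrative guess):

```python
def onset_index(samples, threshold=0.5):
    """First sample whose magnitude clears the threshold: a crude
    stand-in for eyeballing the start of a clap in the waveform."""
    for i, s in enumerate(samples):
        if abs(s) > threshold:
            return i
    return None

def delay_in_samples(near, far, threshold=0.5):
    """How many samples later the clap arrives at the far mic."""
    return onset_index(far, threshold) - onset_index(near, threshold)

def align_and_invert(near, delay):
    """Pad the near signal so it lines up with the far one, then
    flip its polarity for destructive interference."""
    return [-s for s in [0.0] * delay + near]
```

With these synthetic signals the cancellation is perfect, because the delay is an exact whole number of samples; with our real recordings, sub-sample offsets and level differences between the mics left the voice audibly intact.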


Images associated with our project:

Screen Shot 2014-09-21 at 9.18.24 PM
Screen Shot 2014-09-21 at 10.12.18 PM
Screen Shot 2014-09-22 at 3.40.38 PM
Screen Shot 2014-09-22 at 3.46.49 PM
Screen Shot 2014-09-22 at 4.05.45 PM
Screen Shot 2014-09-22 at 4.05.55 PM
Screen Shot 2014-09-22 at 4.35.08 PM