Humanities & Arts Requirement Project

Audio Analysis

By: Mitchell A. Hunt

Humanities and Arts Course Sequence: HU 3900-C05
MU 1611 - Fundamentals of Music I - A 2011
MU 2611 - Fundamentals of Music II - B 2011
MU 2722 - History of American Pop Music - C 2012
PY 1731 - Intro to Philosophy & Religion - D 2012
MU 3611 - Computer Techniques in Music - A 2012
MU 2730 - Jazz Theory - B 2012

Presented to: Professor Vincent J. Manzo
Department of Humanities & Arts
Term C 2013, HU 3900-C05

Submitted in Partial Fulfillment of the Humanities & Arts Requirement
Worcester Polytechnic Institute
Worcester, Massachusetts

Introduction

As you may have noticed when using iTunes, Windows Media Player, or another music player of your choice, most offer an option to run a "visualizer" while you listen to your music. Whether the song is rap, R&B, blues, jazz, country, or hip-hop, you will generally see lights or shapes moving across the screen with varying intensity, color, shape, speed, and pulse, and these visual effects generally match, to some degree, the mood or personality of the song. Beyond iTunes and Windows Media Player, companies and products such as Morphyre (3D music visualisation software and hardware) and Vovoid Media Technologies offer visualizers of their own, as do many others. I was also able to find companies working toward music-analysis software: MelodicMatch and Shazam Entertainment Limited, for example, are both building tools that find and compare patterns in music.

Prior to this Inquiry Seminar, I was introduced to music technology through Worcester Polytechnic Institute's course MU 3611, Computer Techniques in Music, taught by Dr. V.J. Manzo. That course provided basic skills in, and an understanding of, Ableton Live and Max 6.0, as well as how music technology can be used to produce, enhance, and aid how music is performed and experienced. My project is an attempt to create my own visualizer and to understand how much work goes into analyzing audio of any sort with Max 6.0. I chose this project because it ventured outside the realm of the MU 3611 course while still using the same program and some of the techniques I had previously been taught. In Computer Techniques in Music we concentrated solely on MIDI data rather than audio files, so this project required me to teach myself many MSP functions (the part of Max that deals with audio rather than MIDI data) and to work in the opposite direction from most of the other projects.

Once I had a project idea, I was excited at first, but when I went to the lab to begin, I quickly realized how little I knew about MSP, even with my brief Max experience. I would have consulted Max/MSP/Jitter for Music: A Practical Guide to Developing Interactive Music Systems for Education and More by Dr. V.J. Manzo; however, because the library did not place the book on reserve as I had been told it would, I was left to rely on the tutorials and guides within Max 6.0 itself to get my project off the ground. I spent the first week of the project simply working out how to bring audio in through the microphone and other inputs, and how to send audio out through the speakers.
This was easily accomplished using the ezdac~ and ezadc~ objects (shown in Figure 1) and, after some research, the sfplay~ object. With an "open" message sent to sfplay~ (shown in Figure 2), the user can select any audio file for the patch to analyze, provided it is in AIFF rather than MP3 format.

Figure 1. Figure 2.

Once I understood the basics of routing audio into and out of the program, I needed to learn how to manipulate it in ways that would let me pull out information useful for building a visualizer. While reading through the examples, how-to guides, and tutorials, I noticed two objects, filtergraph~ and biquad~, used to keep the frequencies you want, remove the ones you do not, or create high- and low-pass filters. As soon as I noticed this, I pulled these objects into my program to create a "beat analyzer."

The basic idea of this process is to filter out all audio except that of the bass drum, or "driving" instrument. When the remaining frequencies pass a certain amplitude/decibel limit, bangs are sent to a counter, which counts until the amplitude falls back below the limit and then resets to its initial level. With this function in place, a select function takes the counter data and, each time counting begins, outputs a bang to the tempo-tracking mechanism. Each bang is run through the cpuclock function to get a timestamp of when it occurred. Eight of these timestamps are stored at any given time, and each is compared to the one before it to find the time in milliseconds between bangs. This data is then converted into an average beats per minute over the previous eight beats.

Averaging over the previous beats provides more stability if a beat is missed, or double-counted because the amplitude in the frequency band wobbles back and forth across the threshold. The averaging also makes transitions in tempo appear more gradual, which will be especially beneficial if this analyzer is developed into the future patch described in a later section. However, if a large error does occur, the beats-per-minute analyzer will be inaccurate for at least the next eight beats.

If the beat analyzer were developed further, I can think of two things it would do differently from how it currently works. First, at present the bass drum, or the frequency at which to look for the pulse of a song, is left to the user to find and define. In the future, the analyzer could find this frequency on its own by comparing different bands throughout the composition and tracking whichever one has the most consistent and regular pulse, or pattern of pulses. One way to do this would be a function that measures how much the time between beats varies in each frequency band and stores those measurements in memory; the analyzer would then display the beats per minute for the band with the least variation, or the most regular pattern. This memory would be temporary and could change over the course of a song, because the tempo is not always static; like the melody, harmonies, and rhythm, it can be dynamic. It would also be necessary to settle on how many recent beats to store, because too large a number would keep the tracker from staying on the correct frequency band, whereas too small a number might switch bands too sporadically.
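Because the patch itself is a graph of Max objects, its logic is hard to quote directly; the following is a minimal sketch in Python of the tempo-tracking idea described above, assuming a beat timestamp (in milliseconds, as from cpuclock) arrives each time the filtered band crosses the amplitude threshold. The eight-timestamp window follows the description above; the class, function names, and threshold handling are illustrative assumptions rather than code from the actual patch.

    from collections import deque

    class TempoTracker:
        """Rolling beats-per-minute estimate from the eight most recent beat timestamps."""

        def __init__(self, window=8):
            self.timestamps = deque(maxlen=window)  # most recent beat times, in ms

        def beat(self, time_ms):
            """Register one detected beat (the patch's 'bang') and return the current BPM."""
            self.timestamps.append(time_ms)
            if len(self.timestamps) < 2:
                return None
            ts = list(self.timestamps)
            # Differences between consecutive timestamps give the inter-beat intervals in ms.
            intervals = [ts[i + 1] - ts[i] for i in range(len(ts) - 1)]
            avg_ms = sum(intervals) / len(intervals)
            return 60000.0 / avg_ms  # milliseconds per beat converted to beats per minute

    def track(amplitudes, times_ms, threshold):
        """Fire one beat each time the filtered band rises above the threshold,
        and re-arm only after it falls back below it (the counter/reset idea)."""
        tracker = TempoTracker()
        above = False
        bpm = None
        for amp, t in zip(amplitudes, times_ms):
            if amp >= threshold and not above:
                above = True
                bpm = tracker.beat(t)
            elif amp < threshold:
                above = False
        return bpm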
The next function needed to create an audio visualizer is some sort of pitch-tracking system. Depending on what notes are used, one can determine the key of a song, as well as the chord progression within it. Even to an untrained ear, a minor chord sounds different from a dominant seventh, major, or half-diminished chord, to name a few, and the chords that have come before tell us the function of any new chord that is played, which feeds into the overall "mood" of the song.

To create such an analyzer, this project needed the help of more filter graphs and biquad filters. With 37 of them set up next to one another, each running through its own gain stage and slider to measure amplitude, it is possible to display a graph covering every note from C3 (130.81 Hz) to C6 (1046.5 Hz). The program includes a function that watches a set of the filters around 230 Hz to determine when there is a peak and when there is not, and a select function then sends the corresponding MIDI note to a slider when a peak occurs. This method can in fact tell you when you have a peak, but often frequencies that are not actually being played will peak in the background for one reason or another and produce false MIDI notes.

Figure 3. Figure 4.

One way to develop the pitch-tracking function further would be to cover a wider span of frequencies, including the frequencies that fall between the MIDI notes. To get rid of the "false peaks," I would like to develop the patch to evaluate each note as an average of the frequencies within some tolerance of its main frequency, removing random peaks caused by an audio player or recording device malfunction. I would also like to create a function that takes each peak and determines how much larger it is than the surrounding frequencies; it may even be beneficial to compare each note not only to its immediate neighbors but to their neighbors as well. If working, this system would give the user the ability to see what key a song is written in.

Harmonic analysis, used perhaps most often by jazz musicians and composers, tells us that once we know which notes are being played in the chords, we can determine the key of the song and the function of each chord, whether tonic, dominant, or subdominant. In a visualizer, you would expect the more aggressive chords to have faster transitions and brighter or bolder colors, whereas restful chords would have more gradual transitions and lighter, calmer colors. If the function could also determine how much noise, or how many notes and instruments, are present in a song, the visualizer might include more randomness and faster movement of sorts.
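Again, since the patch is graphical, the following is only a rough Python sketch of the band layout and the neighbor-comparison idea described above. The 37 band centers follow from the standard MIDI note-to-frequency relation (A4 = MIDI note 69 = 440 Hz) cited in the references; the 6 dB margin and the function names are assumptions for illustration, not values taken from the patch.

    def midi_to_hz(note):
        """Equal-tempered frequency of a MIDI note number (A4 = 69 = 440 Hz)."""
        return 440.0 * 2.0 ** ((note - 69) / 12.0)

    # 37 band centres, one per semitone from C3 (MIDI 48, ~130.81 Hz)
    # to C6 (MIDI 84, ~1046.5 Hz), matching the bank of filtergraph~/biquad~ filters.
    BAND_NOTES = list(range(48, 85))
    BAND_HZ = [midi_to_hz(n) for n in BAND_NOTES]

    def detect_pitches(band_levels_db, margin_db=6.0):
        """Return the MIDI notes whose band level exceeds both immediate neighbours
        by at least margin_db, one way to reject the 'false peaks' mentioned above.
        band_levels_db holds 37 values, one per band, in decibels."""
        notes = []
        for i in range(1, len(band_levels_db) - 1):
            level = band_levels_db[i]
            if (level - band_levels_db[i - 1] >= margin_db and
                    level - band_levels_db[i + 1] >= margin_db):
                notes.append(BAND_NOTES[i])
        return notes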
While developing the beat analyzer, a new application came to mind for applying the patch to real life in two ways. This is by no means a fact, but I would guess that 90% of college students my age have some sort of mobile device that can play music, most likely one of the many Apple products or a smart-phone of some variety. It often bothers me when a song I am listening to does not match the pace at which I walk, whether down my dorm hallway or across campus. It is possible to use a sensor in or on your shoe that works like a pedometer and outputs a signal whenever you take a step with one foot. You could take this data and find half the time between signals in order to find the tempo of your gait. If you then played your music through a program able to change its tempo, you could play back the audio so that it matches the speed of your walk. Taking this even further, the same technology could be applied to running on the street, on a treadmill, or around a track: the program could play back music at a slightly faster tempo than the athlete's running pace, causing the runner to unconsciously work harder during a workout or race.

When I came into this Humanities Seminar looking to work with Max 6.0, I was very unsure of what to expect. Shortly in, we were given the task of proposing a project, and it was difficult for me to settle on one. This project was an idea that let me work more with Max 6.0, but in a way completely new to me, through MSP, which deals with digital audio data, as opposed to the straight MIDI data I had been taught and was comfortable with. Over the course of the term I independently taught myself the MSP language, functions, and syntax needed to perform operations on audio data, something I had very little experience with. At almost every step, I needed to stop, look up new functions, and come up with my own ways to accomplish specific tasks. I did not reach my original destination of an audio visualizer; however, I was able to create a few functions that are stepping stones toward that goal.

References

Music Analysis Software. MelodicMatch, n.d. Web. 28 Feb. 2013.
Smith, Stuart. "Tensions and Chord Function." Jazz Theory. 4th rev. ed. Lowell: University of Massachusetts, 2004. 57-58. Print.
MIDI Note Number and Frequency Table. Tonalsoft Inc., 2005. Web. 28 Feb. 2013.
Manzo, Vincent J. Max/MSP/Jitter for Music: A Practical Guide to Developing Interactive Music Systems for Education and More. New York: Oxford UP, 2011. Print.
Virtual Sound. Ed. Alessandro Cipriani and Maurizio Giri. ConTempoNet, 2010. Web. 28 Feb. 2013.
Cycling '74. N.p., n.d. Web. 28 Feb. 2013.
"Audio File Playback." CodaSign Learning, 15 Dec. 2011. Web. 28 Feb. 2013.