‘Oljud’, meaning ‘noise’ in Swedish, is a collaboration between Terry Clark and Gustaf Svenungsson. The aim was to create an immersive, interactive audiovisual installation using a 3D camera and a digital audio workstation to manipulate sounds, with the audio in turn affecting the visuals.
We initially believed our audience would be someone interested in music technology or digital art, between their early teens and 30 years of age, and that this would be an installation piece anyone could try out. However, as the project progressed we found that there is some setup time required and a learning curve to the gestures/interactions, so the target audience changed to someone who would use this as part of their live setup and would spend time programming certain elements of their music for this specific set of interactions/gestures. Ideally they would be an Ableton user comfortable with exploring, programming and setting up.
We started by learning what technologies were available to us, particularly wanting to use the Xbox Kinect to capture skeleton information and MIDI to transmit information to Logic, though we knew we would need to gather more information before we proceeded. Our first few weeks were productive as we researched the different technologies, libraries and processes we would need to adopt in order to produce the final piece. We found that the Xbox Kinect offered the motion-capture elements we needed, including a point cloud and human body skeleton tracking. Additionally, we found that other projects had used Ableton in conjunction with Open Sound Control (OSC), which provides the ability to communicate over a wireless network. This enabled us to send skeleton information and other triggers between two computers on the same network, meaning we could distribute the workload and overall processing power. This set the foundation of the installation, and we moved further into what data we could collect and how we wanted to display it.
The work was originally split into two halves: Terry worked on the visual part of the installation and Gustaf on the audio, as we both had previous experience in these areas and felt we would naturally learn from each other as the project progressed. Splitting the work proved to be a huge benefit, as we were able to rapidly produce prototypes, create a soundtrack and refactor the master code as we went along. Although this was the divide, we were constantly reviewing and altering each other's work, as it gave an outsider's perspective on the way we both wrote code, and as we moved through the project we became more accustomed to understanding where our particular bugs were coming from. The flowchart below describes the class structure of the project and shows both computer and user interactions.
In order to set up our project you will need the following equipment and software/libraries installed:
- 2 Laptops
- Processing 2 for Computer 1 code (due to SimpleOpenNI)
- Processing 3 for Computer 2 code
- Xbox Kinect V1 – 1414
- Ableton Live
- LiveOsc (An Ableton Hack)
- oscP5 (a Processing library for sending and receiving OSC messages)
- SimpleOpenNI (a Processing library)
- Minim (a Processing library)
- Soundflower (for routing audio into Processing)
The initial concept involved MIDI messages being sent to Logic. After some successful prototyping and discussion in class we came across OSC, which for our purposes was better to use. This led to abandoning Logic in favour of Ableton Live: while Logic is a great-sounding studio DAW, it drained a lot of resources from the computer, and its more traditional, linear workflow proved cumbersome. Ableton, on the other hand, proved more useful on account of being faster and having a built-in workflow of organising clips within scenes (i.e. a clip is a music part and a scene is a music section of a piece). This made it easier to abstract the structure of the messages. OSC also allowed us to be flexible about how much workload we put on each computer.
Since capturing the Kinect data and running the music software would be the two most processor-intensive tasks, we decided to split these between the two computers: one running the Kinect, one running the DAW. Beyond that, OSC allowed us to distribute the remaining code as needed. For example, if one computer is running beat detection on the audio, the result of the analysis can easily be sent via an OSC message to the second computer.
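As a minimal sketch of that split (the class name, the `/beat` address and the rising-edge logic are illustrative assumptions, not code from the project), the analysis machine only needs to relay a tiny flag per detected beat rather than any audio data:

```java
// Hypothetical sketch: the analysis computer detects a beat and relays only
// a small "/beat" message to the second machine, so heavy audio analysis and
// heavy rendering never share one CPU.
public class BeatRelay {
    // Track the previous frame so we emit only on a rising edge:
    // one beat produces exactly one message, not one per frame.
    private boolean wasBeat = false;

    // Returns the message to send this frame, or null if nothing to send.
    String onFrame(boolean isBeat) {
        String msg = (isBeat && !wasBeat) ? "/beat 1" : null;
        wasBeat = isBeat;
        return msg;
    }
}
```

In the real setup the string would be wrapped in an oscP5 `OscMessage` and sent to the rendering machine's network address.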
Writing OSC messages proved a lot more intuitive, since we could mix and match messages that had been pre-defined by the LiveOSC API with our own custom messages. Because our schedule demanded staying in sync and having a prototype up, Gustaf ended up writing a first draft of the particle system with two important extra features.
Making our own mini-projects to present our ideas allowed us both to understand the direction we wanted to go in. Terry created a Kinect visual, starting by experimenting with the SimpleOpenNI library and checking out YouTube videos, blogs and books for example code, in order to learn how others tracked skeleton information from the Kinect; this formed the basis of our project.
Some of the visual references we found on YouTube are listed below:
However, the video below captured our attention, and we decided to try to recreate its particle system visually while also creating an interactive musical piece.
The most important factors for the visuals were that:
- It needed to be scalable. We needed as much performance as we could get, since we knew we would be pushing Processing, so we tried a number of ways to reduce the number of particles being drawn, such as:
- if (frameRate < x) only create every 4th particle
- if (particles.size() > x) delete the last particle of the ArrayList (i.e. the oldest)
- if (millis() % 2 == 0) allow new particles to be created
We also found that reducing the depth range at which particles were drawn, and making their lifespan decrease faster, allowed for a higher framerate. Because we wanted to attach the particle system to a point cloud, the code was modified so that the origin point of any given particle was defined by an array of PVectors; whenever asked, our ArrayList of particles would create one particle at each vector.
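The three throttling rules above can be sketched in plain Java as follows. The class name, the cap of 2000 particles and the 30 fps threshold are assumptions for illustration; the project's actual constants are not recorded here.

```java
import java.util.Deque;

// Hypothetical sketch of the three throttling rules described above.
public class ParticleThrottle {
    static final int MAX_PARTICLES = 2000;   // assumed cap, not the project's value
    static final float MIN_FRAMERATE = 30f;  // assumed threshold

    // Rule 1: when the framerate drops, only spawn every 4th particle.
    static boolean shouldSpawn(float frameRate, int spawnIndex) {
        if (frameRate < MIN_FRAMERATE) return spawnIndex % 4 == 0;
        return true;
    }

    // Rule 2: when over the cap, drop the oldest particles first.
    static void enforceCap(Deque<float[]> particles) {
        while (particles.size() > MAX_PARTICLES) particles.removeFirst();
    }

    // Rule 3: only allow creation on even milliseconds, halving the spawn rate.
    static boolean creationWindowOpen(long millis) {
        return millis % 2 == 0;
    }
}
```

Each rule is cheap to evaluate per frame, which matters when the check itself runs thousands of times per draw call.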
The gestures and interactions came about three quarters of the way through the project, once we completely understood how the Kinect represented the hand, elbow and shoulder vectors. It was then a matter of finding the distance between these joints, which activated certain functionality such as playing, changing the section, and entering ‘Beat’ mode.
We explored the possibility of having hand gestures instead of body gestures, but found that the close proximity the Kinect needs to correctly analyse the hand shape was too great a compromise. Furthermore, the extra processing power required meant that other parts of the visuals would lag.
Thus we opted for a more obvious gesture selection as highlighted below:
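A distance-based gesture of this kind reduces to comparing two tracked joint positions against a threshold. Below is a minimal sketch; the 100 mm threshold and the "hand touching shoulder" pairing are assumptions for illustration, not the project's actual tuning.

```java
// Hypothetical sketch: trigger a gesture when two tracked joints come
// within a threshold distance, as with the hand/elbow/shoulder vectors.
public class GestureDetector {
    static final float TRIGGER_DIST = 100f; // assumed threshold (Kinect units are mm)

    // Euclidean distance between two 3D joint positions {x, y, z}.
    static float dist(float[] a, float[] b) {
        float dx = a[0] - b[0], dy = a[1] - b[1], dz = a[2] - b[2];
        return (float) Math.sqrt(dx * dx + dy * dy + dz * dz);
    }

    // e.g. a hand brought to the opposite shoulder could switch the section.
    static boolean isTouching(float[] hand, float[] shoulder) {
        return dist(hand, shoulder) < TRIGGER_DIST;
    }
}
```

In Processing the same check is one line with `PVector.dist()`; the version here spells it out so the trigger condition is explicit.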
Throughout the project we needed to overcome a variety of challenges. For example, when trying to implement hand gestures we found that the user needed to be in close proximity to the Kinect in order for it to capture individual finger movements. This in turn caused lag within the point-cloud particle system by overloading the graphics card.
Our user testing also gave us a deeper insight into our product, showing that there was a learning curve when people tried to interact with it. This led us to change our target audience, and we felt that the UI was perhaps not a necessary component for an artist performing live. Mapping the vector information was another challenge: we needed to test the maximum x, y and z distances the Kinect could track, and we later decided to map these relative to the middle of the person's body in order to allow freer movement from the user.
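Re-mapping relative to the body centre can be sketched as below. The 600 mm assumed reach and the method names are illustrative; the key idea from the text is that the control range travels with the user instead of being fixed in room coordinates.

```java
// Hypothetical sketch: re-map a joint position relative to the torso centre
// so the usable range follows the user around the tracking space.
public class JointMapper {
    static final float REACH = 600f; // assumed max comfortable arm reach in mm

    // Processing-style map(): linearly rescale v from [inMin, inMax] to [outMin, outMax].
    static float map(float v, float inMin, float inMax, float outMin, float outMax) {
        return outMin + (v - inMin) * (outMax - outMin) / (inMax - inMin);
    }

    // Map hand x relative to torso x into a 0..1 control value,
    // clamping so positions beyond arm's reach saturate instead of overflowing.
    static float handToControl(float handX, float torsoX) {
        float rel = Math.max(-REACH, Math.min(REACH, handX - torsoX));
        return map(rel, -REACH, REACH, 0f, 1f);
    }
}
```

With this scheme the hand at the torso always reads 0.5 regardless of where the user stands, which is what makes the movement feel free.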
Another slight issue, which we believe to be a fundamental problem with 3D tracking, was that the Kinect kept dropping the user. This made it difficult for the user to feel fully immersed, as their attention would shift to trying to reconnect.
While the particle system was straightforward to set up, we found that since it was being mapped to quite a few points, not overloading the computer required some tinkering (we ended up solving this by skipping points of the point cloud). Making the particle system look appealing also required a lot of work: it needed to look busy and detailed (lots of particles being drawn) while remaining visually coherent rather than a mess.
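"Skipping points" amounts to iterating the cloud with a stride, keeping one point in every N as a particle origin. A minimal sketch (class and method names are assumptions; the project's stride value is not recorded here):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of thinning the point cloud: keep only every
// `stride`-th point as a particle origin, trading detail for framerate.
public class CloudThinner {
    static List<float[]> thin(List<float[]> cloud, int stride) {
        List<float[]> kept = new ArrayList<>();
        for (int i = 0; i < cloud.size(); i += stride) kept.add(cloud.get(i));
        return kept;
    }
}
```

A stride of 4 cuts both the particle count and the per-frame vector updates to a quarter, which is usually a far bigger win than micro-optimising the draw code.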
The audio challenges were twofold: technical and “artistic”.
Figuring out what type of music the user would interact with went through numerous phases. A person who is not musically trained wouldn't be able to just pick up and have fun with a theremin-like setup mapping x, y coordinates to pitch and volume. It turned out that giving the user any direct control over pitch demanded that they understood the music piece as a whole; once again, this did not fit with our aim of being intuitive, fun and immersive for people regardless of skill level. The second challenge, once we had decided to let the user switch between sections and manipulate parameters within Ableton Live, was to have a piece of music which it made sense to play around with. For example, when a musician brings their effects and pedals to perform at a concert, they won't constantly sweep the parameters of those effects between minimum and maximum; there is a very specific range of sounds that makes sense for different sections of different songs. We found that trying to recreate that experience made the most sense.
The first iterations of the project had MIDI and Logic in mind. Logic, while a great-sounding piece of software, required far too much computing power to pull off what we wanted. We opted not to use MIDI because anything more complex than noteOn/noteOff would have required reference tables. Using the LiveOSC API, with its easy-to-read documentation, meant we could write code that itself read meaningfully.
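To illustrate what "reads meaningfully" looks like, here is a sketch of assembling a LiveOSC-style message. The address path and argument order below are assumptions from memory, not verbatim from the LiveOSC documentation, and the class is a stand-in for oscP5's `OscMessage`:

```java
// Hypothetical sketch of a LiveOSC-style message: a readable address path
// plus integer arguments, instead of opaque MIDI byte pairs.
public class LiveMessage {
    final String address;
    final int[] args;

    LiveMessage(String address, int... args) {
        this.address = address;
        this.args = args;
    }

    // e.g. "fire the clip in track 2, scene 0" — self-describing at the call site.
    static LiveMessage playClip(int track, int clip) {
        return new LiveMessage("/live/play/clipslot", track, clip);
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder(address);
        for (int a : args) sb.append(' ').append(a);
        return sb.toString();
    }
}
```

Compare `playClip(2, 0)` with a raw MIDI note-on plus a lookup table mapping pitches to clips: the OSC version carries its meaning in the address itself.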
The difficulty then was understanding what values the different parameters took: some accepted floats from 0 to 1, others integers between 0 and 127, while other, more rhythmically oriented parameters wanted a subdivision such as 1/4, 1/16, 1/32, 1/64, etc.
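One way to tame those mismatched ranges is to keep every gesture-derived control as a normalised 0..1 value and convert at the last moment. This sketch is an assumption about how such an adapter could look, not the project's code; the ranges mirror the three kinds described above.

```java
// Hypothetical sketch: adapt one normalised 0..1 control value to the three
// parameter ranges we ran into (floats 0-1, ints 0-127, rhythmic subdivisions).
public class ParamScaler {
    // Assumed set of allowed rhythmic values, coarsest to finest.
    static final float[] SUBDIVISIONS = {1f/4, 1f/16, 1f/32, 1f/64};

    static float toUnitFloat(float v) { return clamp(v); }

    // MIDI-style integer range.
    static int toMidiRange(float v) { return Math.round(clamp(v) * 127f); }

    // Snap to the nearest allowed subdivision bucket.
    static float toSubdivision(float v) {
        int idx = Math.min(SUBDIVISIONS.length - 1,
                           (int) (clamp(v) * SUBDIVISIONS.length));
        return SUBDIVISIONS[idx];
    }

    static float clamp(float v) { return Math.max(0f, Math.min(1f, v)); }
}
```

Clamping first means an out-of-range joint reading saturates a parameter instead of sending Ableton an illegal value.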
Having one person test the program while another sat behind the screens proved very useful, since the user may feel that something is unresponsive while you can clearly see the parameters moving up and down on your end.
We feel the project ended up as a mixed success. While we set out to craft a tool that allows “anyone” to have a meaningful interaction with music, it became clear to us that “anyone” doesn't actually exist, and that people have radically different expectations of what they can do and how they can interact with a piece of technology such as the Kinect.
After testing it with different people we found that people fundamentally had two different reactions:
- People who saw it as an audiovisual piece with which you could recreate the feeling of “dropping the bass” or “the dub-step breakdown”. These were normally the audience who could already relate to our aesthetics.
- Those that didn’t necessarily connect to our musical or visual choices directly but saw it as more of a high concept tool for purposes such as music therapy.
Because of our music backgrounds we were more interested in the piece being purely sonically oriented. This meant we decided to focus on people who already had knowledge of electronic music and who already interact with music in some capacity. It is not intended for “experts” or professionals, but for “hobbyists”. We think that, aesthetically and musically, we successfully completed what we set out to do: to build an interactive experience using technology we had no previous experience with, while furthering our knowledge of vectors and data transfer.
However, we feel we could have done a much better job of making the piece easier to use in terms of calibration and gestures. The biggest issue with the software is that, in its current state, it is complicated to use: you need to be told what the gestures are, and even then they require too much training to internalise. We should have spent more time fine-tuning the music and the gestures: combining them, removing some, and making sure they flowed better from section to section. We also feel that our Kinect connection is currently too unreliable, as it frequently loses tracking of the user; a different library, a different programming environment, and possibly the Kinect v2 might help us fix this. Finally, we wanted to make something more interesting with the visuals. We did set out to make a point cloud and draw a particle system on top of it, but we feel the visuals would need to be even more dynamic to hold the user's attention. This could have been done by adding more interactions with the FFT, such as further physics and colour manipulation.
- Kinect v1 Skeleton
- OSC & NetP5
- ParticleSystem – part of a previous project, further altered through advice in class
- Making Things See: 3D Vision with Kinect, Processing, Arduino and MakerBot