Monkey Animation Project Part 1: Labeling

This Friday, we (BlackCatBonifide and I) release our Code Monkey video: 600 photos of a stuffed monkey traveling across the world at 12 frames per second, set to Jonathan Coulton’s song “Code Monkey”. This week I will post a few essays describing the steps leading up to the final product.

This project has been almost two years in the making, all starting with this forum comment from Colleenky:

OK, this might be a crazy idea. What if I mailed a stuffed monkey to one of you, and you mailed it to another JoCo fan, and so on, until it finally reached JoCo one year from now on JoCo Day 2008? We could set up a Picasa site for pictures of the monkey in every location.

Over the course of the project, Mr. Monkey visited JoCo fans all over the U.S., including Bon and myself in Pittsburgh, and even ventured across the pond to visit fans in Sweden and the U.K. The fans were good enough to photograph Mr. Monkey’s travels and upload the pictures to a shared Picasa album. Almost immediately, I had the idea to put together an animation where Mr. Monkey stays in one place against a flickering background, à la “Paris Hilton’s Face Never Changes”. I pretty quickly realized, though, that most of the pictures being uploaded did not feature Mr. Monkey in the same pose; his distance and viewing angle from the camera varied pretty drastically. This made the problem more difficult, but also much more interesting. Instead of just making a simple gimmick animation where Mr. Monkey stays in one place, I got to create a complex gimmick animation where Mr. Monkey is semi-smoothly animated monkeying his way around the screen!

I wasn’t sure exactly where I was going with this when I started the project. It was exploratory coding, with the vague intention of creating some sort of animation. Many shortcuts were taken. Since I typically only had a couple hours at a time to work on it, I made it a point to take the shortest path to making something cool happen. Given a choice between a dirty hack I could pound out in one session, and doing something “right” across multiple sessions, I did the quick hack every time, and only returned to the more difficult path when and if the hack turned out not to be good enough.

Step 1: Labeling

The first step was to figure out how to record Mr. Monkey’s position, orientation, and scale in each picture. In accordance with my philosophy of doing the simplest, dumbest thing that might work, my first attempt was to manually use GIMP to measure Mr. Monkey’s (x, y) position and size in pixels, guesstimate his yaw, pitch, and roll, and annotate all of these as a prefix to the filename of each picture. This approach turned out to be a bit too simple to work. I did get as far as adding a prefix to all of the pictures specifying Mr. Monkey’s pitch, sorting by filename, and using that as a test animation in which Mr. Monkey smoothly turned his head but jumped around all over the screen. It was a start, but it quickly became evident that this labeling approach was too slow and inaccurate to be practical. The project got put on hold for a while at this point.
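As a rough illustration of that filename-prefix test, here is a minimal sketch in Python (not the tooling actually used for the project). The naming scheme `p<pitch>_<original name>`, the `PREFIX` pattern, and the functions `pitch_of` and `order_frames` are all assumptions made up for this example; the post does not say what the real prefixes looked like.

```python
import re

# Hypothetical naming scheme: "p<pitch in degrees>_<original name>",
# e.g. "p-15_beach.jpg". The real prefix format is not given in the post.
PREFIX = re.compile(r"^p(-?\d+)_(.+)$")


def pitch_of(filename):
    """Return the pitch (in degrees) encoded in the filename prefix."""
    match = PREFIX.match(filename)
    if match is None:
        raise ValueError("no pitch prefix in %r" % filename)
    return int(match.group(1))


def order_frames(filenames):
    """Order photos by labeled pitch so the head appears to turn smoothly."""
    return sorted(filenames, key=pitch_of)


frames = order_frames(["p10_london.jpg", "p-5_pittsburgh.jpg", "p2_sweden.jpg"])
```

Note that this sorts on the parsed number rather than on the raw filename. A plain filename sort only gives the same order if the prefixes are zero-padded and all share the same sign.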

Some weeks or months later, I stumbled across Processing. Processing is a programming environment meant to make graphics programming accessible to non-programmers. It is highly simplified, yet surprisingly capable. While I am a computer engineer by trade, I wasn’t quite motivated enough to learn full-fledged 3D graphics and GUI programming for the purposes of this side project. Processing turned out to be the shortest path to making something cool happen. While I did bump up against some of its limitations, it was capable enough to do what I needed to do, and simple enough to keep the cool-stuff-happening to pain ratio above my screw-this-project threshold.

If I remember correctly, I managed to make the first version of the labeling application and do a first pass at all the labeling in a couple of weeknights and a weekend. The application shows a very simple mock-up of Mr. Monkey’s head. For each picture, I just drag the mock-up head over Mr. Monkey’s real head in the photo, then scale and rotate it until the two overlap exactly. Then I hit save, and the path to that picture, along with Mr. Monkey’s coordinates within the picture, gets appended to an index file.
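The save step described above could look roughly like the following. This is a sketch in Python rather than Processing, and it is not the application’s actual code: the CSV layout, the `Label` fields (position, scale, and three rotation angles), and the functions `append_label` and `read_labels` are assumptions chosen for illustration.

```python
import csv
from dataclasses import dataclass, astuple


@dataclass
class Label:
    """One labeled photo. The field set is an assumption, not the real format."""
    path: str      # path to the photo
    x: float       # head position in the photo, in pixels
    y: float
    scale: float   # size of the mock-up head relative to its default size
    yaw: float     # rotation angles of the mock-up head, in degrees
    pitch: float
    roll: float


def append_label(index_path, label):
    """Append one record to the index file, creating the file if needed."""
    with open(index_path, "a", newline="") as f:
        csv.writer(f).writerow(astuple(label))


def read_labels(index_path):
    """Load every record back from the index file, in the order saved."""
    with open(index_path, newline="") as f:
        return [Label(row[0], *map(float, row[1:])) for row in csv.reader(f)]
```

Appending one line per photo means a labeling session can be stopped and resumed at any point without losing earlier work, which suits a project done a couple of hours at a time.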

A lot of people ask me why I didn’t do something more sophisticated for this step. Again, my philosophy was to take the shortest path towards the immediate goal. Yes, it would be cool to use some sort of computer-vision technique to make at least a first pass over the labels automatically. However, for the number of pictures that I needed to label (about 600), the time spent building such a system would have far exceeded the time it would have saved me, especially since the results would almost certainly need to be manually double-checked and tweaked anyway.

Another option, which I did consider more carefully, was to crowd-source the work. In fact, that is one reason I chose Processing: the resulting application can be compiled as a Java applet, which people could then run from a web page without installing any software. From there, I could probably have rounded up some help among the JoCo forum members and other friends. If I really wanted to crank up production, maybe I could have paid people to do it on Mechanical Turk. Once I got started doing the actual labeling, though, I realized it was only going to take me half a day to do all of the labels myself. The extra time needed to make the labeling program nice enough for people-who-aren’t-me to use, round up those people, divide the work, get everyone to label in a consistent way and/or double-check the results, and merge the results just wasn’t worth it. That said, if I were going to scale up this project to animations using a lot more photos, this is one of the first features I would consider adding.

That’s all for today. Next post: using the labels to render mock-Mr. Monkey using photographed Mr. Monkey.

  1. November 23, 2009 at 4:05 pm

    That’s just entirely too clever. Being a graphics programmer my first instinct would have been to use some computer vision techniques to try to do it automatically. Simple really is better though.

  2. sporksmith
    November 24, 2009 at 12:12 am

    Thanks! Yep, I might try doing some computer-vision-assist in the future, but I wanted to get the end-to-end animation process working and see whether the high-level concept was worthwhile first.

