archive for March, 2007

ApeJet Game: Clicker

I’ve put together this little game that I’m calling Clicker.  The rules and the goal are simple, but it should keep you busy for a while.  I think there’s another game that works the same way, but I can’t remember what it’s called.  If anyone recognizes it, let me know.  As always, comments and suggestions are welcome.

Update 12 Mar. 2008 – I finally (nearly a year later) figured out what my game is similar to!  It’s called Lights Off.  I still think I remember seeing a hand-held, electronic, toy version before.  If anyone knows what it’s called, let me know.

Conversive Chat – Beta

Conversive (where I work) recently released a new beta product: Conversive Chat.  It’s a hosted AJAX web chat and FAQ management system.  I’m pretty happy with the features that we’ve been able to put together in such a short time.  Since it’s hosted, you don’t have to install anything on your web server; you just add a link from your site.  The AJAX means that you and your users don’t need to download or install anything (not even Java or Flash).  The really cool part is the FAQ integration.  You can build FAQs manually, or you can build them directly from your chat conversations.  The integration works both ways: if you’re chatting with someone and they ask a frequently asked question, the system will automatically suggest the answer for you.

Since this is still in beta, we’re really interested in getting people to try it out and give us feedback.  If you run a site, please check it out and consider using it.  If you do use it, let us know what you think.

Netflix Prize: Sample Features

As promised, here are some sample features that my program detected. Read each one as left column vs. right column. See if you can come up with a human name for each feature.

Feature 1
Left column:
Dragon Ball Z: Majin Buu
Dragon Ball: Fortune Teller Baba Saga
ECW: Wrestlepalooza ’97
Dragon Ball Z: Imperfect Cell Saga
Battle Athletes Victory: Vol. 2: Doubt and Conflict
Right column:
Dune: Extended Edition
Mobsters and Mormons
Decasia: The State of Decay
Hockey Mom
Ah! My Goddess

Feature 2
Left column:
Lost in Translation
Eternal Sunshine of the Spotless Mind
The Royal Tenenbaums
Dogville
Punch-Drunk Love
Right column:
Pearl Harbor
The Wedding Planner
Armageddon
Coyote Ugly
Maid in Manhattan

Feature 3
Left column:
Friends: Season 5
The Best of Friends: Season 3
The Best of Friends: Vol. 2
The Best of Friends: Season 4
Friends: Season 4
Right column:
Freddy Got Fingered
Wake Up, Ron Burgundy
House of 1,000 Corpses
The Saddest Music in the World
Club Dread

Feature 4
Left column:
Fox and His Friends
Sports Illustrated Swimsuit Edition: 2002
Dragon Ball Z: Vol. 17: Super Saiyan
Michael Moore Hates America
Celsius 41.11
Right column:
Queer as Folk: Season 2
The Hours
Fahrenheit 9/11
Bowling for Columbine
Queer as Folk: Season 1

Feature 5
Left column:
D.E.B.S.
Birth
Tempo
In the Cut
Alexander: Director’s Cut
Right column:
Tommy Boy
Ace Ventura: Pet Detective
Rocky
National Lampoon’s Vacation
Caddyshack

Feature 6
Left column:
Crash
Vanilla Sky
The Jacket
The Village
Love Actually
Right column:
Death Wish 5: The Face of Death
Beethoven’s 2nd
Air Bud: Golden Receiver
Air Bud: World Pup
Pokemon: The Movie 2000

Feature 7
Left column:
Madonna: The Video Collection 1993-1999
Moulin Rouge: Bonus Material
Pirates of the Caribbean: The Curse of the Black Pearl: Bonus Material
Anchorman: The Legend of Ron Burgundy
Dodgeball: A True Underdog Story
Right column:
2001: A Space Odyssey
The Deer Hunter
The French Connection
Apocalypse Now
Star Trek: Voyager: Season 5

Feature 8
Left column:
Showgirls
Very Bad Things
Indecent Proposal
Fatal Attraction
Jo Jo Dancer, Your Life is Calling
Right column:
Stargate SG-1: Season 4
Stargate SG-1: Season 3
Lord of the Rings: The Fellowship of the Ring
Stargate SG-1: Season 6
Stargate SG-1: Season 2

Feature 9
Left column:
Joe Versus the Volcano
Men in Black
Earth Girls Are Easy
Tank Girl
Galaxy Quest
Right column:
Dragon Ball Z: Broly: The Legendary Super Saiyan
Dragon Ball Z: Super Android 13
Friends: Season 7
The Best of Friends: Vol. 4
Curb Your Enthusiasm: Season 1

Feature 10
Left column:
The Passion of the Christ
The Office: Series 1
Elf
Napoleon Dynamite
Barbershop
Right column:
Queer as Folk: Season 2
Queer as Folk: Season 1
Queer as Folk: Season 3
Buffy the Vampire Slayer: Season 1
The Best of Friends: Season 2

Netflix Prize: Update 2

About two weeks ago, I started working again on the Netflix Prize (see: description). I finally have a working program and have submitted my predictions. I’m pretty happy with the results. I scored an RMSE of 0.9158, which puts me in 211th place.  The score would qualify me for the Progress Prize if there weren’t 210 people ahead of me.  I have plenty of ideas to improve my score, but I’m not sure how much of an effect they will have.
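For anyone unfamiliar with the metric: RMSE is the root mean squared error between the predicted ratings and the actual ones, so lower is better. Here is a minimal Python sketch of the calculation. The function name and the sample numbers are made up for illustration; they are not Netflix data and not my actual program.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up example ratings (assumption: purely illustrative values).
example_predictions = [3.5, 4.0, 2.0, 5.0]
example_actuals = [4, 4, 1, 5]
example_score = rmse(example_predictions, example_actuals)
print(f"RMSE: {example_score:.4f}")
```

Because the errors are squared before averaging, one badly wrong prediction hurts the score more than several slightly wrong ones.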

I have to give major credit to Simon Funk for the excellent description of his algorithm (which I basically copied).  It’s very different from my early attempts last fall.  I don’t even remember how the old system worked (I think it was a kludged-together k-nearest-neighbor approach), but the new system is loosely based on singular value decomposition.  More on that later.

Another big change is that instead of working in C#, a language that I am very experienced with, I wanted to try to learn something new.  I started with Python, but quickly ran into some memory management problems (I’m sure this is due to my lack of knowledge, and not Python’s lack of features).  From Python, I jumped to D, a language I’ve grown to like.  It has many of the nice high-level features that I’ve grown accustomed to from Java and C#, but it’s also fairly lightweight and good with low-level stuff.

If you are curious how the system works, keep reading.  But first, a recap of what we’re trying to do.  The goal is to create a program that can predict how a given user will rate a given movie.  Netflix depends on this kind of system to be able to suggest movies that people will actually want to watch, and to steer them away from movies that they aren’t likely to enjoy.

Simon’s algorithm does this by sifting through historical ratings data (over 100 million rows of the form movie id, customer id, rating) and measuring how strongly each movie and each user corresponds to certain features.  For example, if there is a feature for violence, some movies will score high on it and some will score low.  By the same token, some users really like violent movies, and some hate them.  You can combine this information to make a reasonable guess about how a particular user will rate a particular movie (high violence + love of violence = high rating).
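To make the "high violence + love of violence = high rating" idea concrete, here is a small Python sketch. It assumes each movie and each user is described by one number per feature, and that a prediction is a baseline plus the sum of the pairwise products, clipped to the 1 to 5 star range. The baseline of 3.0, the feature values, and the function name are all made up for illustration.

```python
def predict_rating(baseline, movie_features, user_features,
                   lowest=1.0, highest=5.0):
    """Guess a rating from per-feature movie traits and user tastes.

    Each product is large and positive when the movie has a lot of a
    feature and the user likes that feature, and negative when the user
    dislikes it.
    """
    if len(movie_features) != len(user_features):
        raise ValueError("movie and user need the same number of features")
    score = baseline + sum(m * u for m, u in zip(movie_features, user_features))
    # Keep the guess inside the star range.
    return max(lowest, min(highest, score))


# One hypothetical feature: "violence".
violent_movie = [1.5]    # scores high on violence
violence_fan = [1.0]     # likes violent movies
violence_hater = [-1.0]  # dislikes them

fan_rating = predict_rating(3.0, violent_movie, violence_fan)
hater_rating = predict_rating(3.0, violent_movie, violence_hater)
print(fan_rating, hater_rating)
```

With more features, the same sum just runs over more products, one per feature.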

The cool part is that the algorithm is able to figure out what the strongest features are purely based on the ratings data.  You don’t need to tell it the genre, how the MPAA rated it, how long it runs, when it came out, if Tom Hanks stars in it, or anything else.  You just have to tell it the ratings.  If the feature is important, the system will figure it out on its own.  The predictions I submitted were based on 60 features.  I’ll try to remember to post something about what a few of those features are, but for now: sleep.
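For the curious, here is roughly what "figuring it out from the ratings alone" can look like. This is a simplified Python sketch of gradient-descent training in the spirit of Simon Funk's description, not his exact procedure and not my D program. The toy ratings, the learning rate, the regularization value, the starting value of 0.1, and all the names are assumptions chosen for illustration.

```python
import math


def predict(user_features, movie_features):
    """Predicted rating: sum of (user taste * movie trait) over all features."""
    return sum(u * m for u, m in zip(user_features, movie_features))


def training_rmse(ratings, user_f, movie_f):
    """Root mean squared error of the current features on the given ratings."""
    total = sum((r - predict(user_f[u], movie_f[m])) ** 2 for m, u, r in ratings)
    return math.sqrt(total / len(ratings))


def train_features(ratings, n_users, n_movies, n_features=2,
                   epochs=200, rate=0.01, reg=0.02, init=0.1):
    """Learn per-user and per-movie feature values from ratings alone.

    ratings is a list of (movie id, user id, rating) rows. Features are
    trained one at a time; each step nudges the user and movie values in
    the direction that shrinks the prediction error for that row.
    """
    user_f = [[init] * n_features for _ in range(n_users)]
    movie_f = [[init] * n_features for _ in range(n_movies)]
    for f in range(n_features):
        for _ in range(epochs):
            for movie, user, rating in ratings:
                err = rating - predict(user_f[user], movie_f[movie])
                u, m = user_f[user][f], movie_f[movie][f]
                # Gradient step with a small penalty that keeps values modest.
                user_f[user][f] += rate * (err * m - reg * u)
                movie_f[movie][f] += rate * (err * u - reg * m)
    return user_f, movie_f


# Made-up toy data: (movie id, user id, rating). No genres, no actors.
TOY_RATINGS = [
    (0, 0, 5), (1, 0, 4), (2, 0, 1),
    (0, 1, 4), (1, 1, 5), (3, 1, 1),
    (2, 2, 5), (3, 2, 4), (0, 2, 1),
    (2, 3, 4), (3, 3, 5), (1, 3, 2),
]

users, movies = train_features(TOY_RATINGS, n_users=4, n_movies=4)
print("training RMSE:", training_rmse(TOY_RATINGS, users, movies))
```

Nothing in the input says what a feature means; the values only take on a meaning (if any) after you look at which movies end up at each extreme.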