Netflix Prize

Netflix announced a contest yesterday called the Netflix Prize. The challenge is to create a system that, based on past data, can accurately predict how a customer will rate a movie. Netflix already has a system for this, called Cinematch, which it uses to make movie recommendations. Better recommendations mean happier customers, and happier customers mean more money for Netflix. Oh, and the prize part: if you can out-predict Cinematch by 10%, you win $1,000,000 (assuming nobody else does even better).

For the contest, Netflix is supplying over 100 million ratings covering 17,770 movies and nearly half a million customers. For infovores, it’s a gold mine. I’m going to go ahead and try my hand at the prize. I have a reasonably good idea of how to go about creating a ratings prediction system. I highly doubt I’ll be able to come up with anything better than what the Netflix engineers have put together, but it’ll be fun to try.

So far, I’ve pulled all of the data into a nice little (Ha!) MySQL database. My next steps will be to create the tools I’ll need to build and test a very basic prediction system. In the meantime, here are some fun little facts:
* Average movie rating: 3.23
* Average number of ratings per movie: 5655
* Average of customers’ average ratings: 3.67
* Average number of ratings per customer: 209
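
For anyone who wants to reproduce numbers like these, here is a rough sketch in Python. It is not the query I ran against MySQL. It assumes the ratings are available as plain `(customer_id, movie_id, rating)` tuples, and it assumes both rating averages are averages of per-movie and per-customer averages. The function name `rating_summary` and the tiny `sample` list are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def rating_summary(ratings):
    """Summarize an iterable of (customer_id, movie_id, rating) tuples.

    Hypothetical helper: the tuple layout is an assumption, not the
    format of the contest files.
    """
    by_movie = defaultdict(list)
    by_customer = defaultdict(list)
    for customer_id, movie_id, rating in ratings:
        by_movie[movie_id].append(rating)
        by_customer[customer_id].append(rating)

    return {
        # Average of each movie's own average rating.
        "avg_movie_rating": mean(mean(r) for r in by_movie.values()),
        # Total ratings divided by the number of movies.
        "avg_ratings_per_movie": mean(len(r) for r in by_movie.values()),
        # Average of each customer's own average rating.
        "avg_customer_avg_rating": mean(mean(r) for r in by_customer.values()),
        # Total ratings divided by the number of customers.
        "avg_ratings_per_customer": mean(len(r) for r in by_customer.values()),
    }


# Made-up toy data: 3 customers, 3 movies, 5 ratings.
sample = [
    (1, 10, 4), (1, 11, 2),
    (2, 10, 5),
    (3, 11, 3), (3, 12, 1),
]
summary = rating_summary(sample)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

With 100 million rows, holding every rating in Python lists would be wasteful. Keeping a running sum and count per movie and per customer, or letting the database do the grouping, would be the more practical route.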