About two weeks ago, I started working again on the Netflix Prize (see: description). Finally, I have a working program and submitted my predictions. I’m pretty happy with the results. I scored a RMSE of 0.9158, which puts me in 211th place. The score would qualify me for the Progress Prize if there weren’t 210 people ahead of me. I have plenty of ideas to improve my score, but I’m not sure how much of an effect they will have.
I have to give major credit to Simon Funk for the excellent description of his algorithm (which I basically copied). It’s very different from my early attempts last Fall. I don’t even remember how the old system worked (I think it was kludged together k-nearest neighbor), but the new system is loosely based on singular value decomposition. More on that later.
Another big change is that instead of working in C#, a language that I am very experienced with, I wanted to try to learn something new. I started with Python, but quickly ran into some memory management problems (I’m sure this is due to my lack of knowledge, and not Python’s lack of features). From Python, I jumped to D — a language I’ve grown to like. It has many of the nice high-level features that I’ve grown accustom to from Java and C#, but it’s also fairly light weight and good with low level stuff.
If you are curious how the system works, keep reading. But first, a recap of what we’re trying to do. The goal is to create a program that can predict how a given user will rate a given movie. Netflix depends on this kind of system to be able to suggest movies that people will actually want to watch, and to steer them away from movies that they aren’t likely to enjoy.
Simon’s algorithm does this by sifting through historical ratings data (over 100 million rows of the form movie id, customer id, rating) and measuring how movies and users correspond with certain features. For example, if there is a feature for violence, some movies will be rated high and some will rate low. By the same token, some users really like violent movies, and some hate them. You can combine this information to make a reasonable guess about how a particular user will rate a particular movie (high violence + love of violence = high rating).
The cool part is that the algorithm is able to figure out what the strongest features are purely based on the ratings data. You don’t need to tell it the genre, how the MPAA rated it, how long it runs, when it came out, if Tom Hanks stars in it, or anything else. You just have to tell it the ratings. If the feature is important, the system will figure it out on its own. The predictions I submitted were based on 60 features. I’ll try to remember to post something about what a few of those features are, but for now: sleep.