
Portfolio Assignment 6

OK. I am posting this description after we have spent a week or more working on the assignment. It was loosely defined, with two tasks:

Cluster a bunch of movies

Hierarchically cluster at least 1,000 movies. You can pick whatever attributes you can find: genre, keywords, or anything else. You can use IMDb or any other source. After you cluster, I want you to evaluate the clusters. Do they make sense to people?
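As a rough illustration of what hierarchical clustering on movie attributes could look like, here is a minimal pure-Python sketch. Everything in it is an assumption for illustration: the toy titles and genre sets are made up, Jaccard distance on attribute sets and average linkage are just one reasonable choice, and the function names are hypothetical. This naive version is far too slow for 1,000 movies; for a data set of that size a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` would be the practical option.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two attribute sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster_distance(members_1, members_2, movies):
    """Average linkage: mean pairwise distance between the two member lists."""
    total = sum(
        jaccard_distance(movies[m1], movies[m2])
        for m1 in members_1
        for m2 in members_2
    )
    return total / (len(members_1) * len(members_2))


def agglomerate(movies):
    """Repeatedly merge the two closest clusters until one remains.

    Returns the merge tree: leaves are titles, internal nodes are
    (left, right) tuples.
    """
    # Each entry is (tree, list of member titles).
    clusters = [(title, [title]) for title in sorted(movies)]
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: cluster_distance(
                clusters[p[0]][1], clusters[p[1]][1], movies
            ),
        )
        (tree_i, members_i), (tree_j, members_j) = clusters[i], clusters[j]
        merged = ((tree_i, tree_j), members_i + members_j)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


# Made-up toy data (hypothetical titles and genres, not from any real source).
toy_movies = {
    "Toy Title A": {"action", "sci-fi"},
    "Toy Title B": {"action", "sci-fi", "thriller"},
    "Toy Title C": {"romance", "comedy"},
    "Toy Title D": {"romance", "drama"},
}

tree = agglomerate(toy_movies)
print(tree)
```

Printing the tree and reading it from the leaves upward is one simple way to start asking whether the groupings make sense to people.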

Degrees of separation

Now that we have clusters, we have some measure of distance. Movies that are in the same cluster have zero degrees of separation. Movies that share a parent cluster have one degree of separation; movies that share a grandparent cluster have two, and so on. Do these measures make sense?
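The degrees-of-separation rule can be sketched in a few lines of Python. This is only one possible reading of the assignment, under stated assumptions: each movie belongs to exactly one lowest-level cluster (the hypothetical `cluster_of` mapping), each cluster has at most one parent (the hypothetical `parent` mapping), and when the two movies sit at different depths the larger number of steps up to the first shared cluster is used. The cluster names and movie labels are made up.

```python
def ancestors(cluster, parent):
    """Return [cluster, parent, grandparent, ...] up to the root."""
    chain = [cluster]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain


def degrees_of_separation(movie_a, movie_b, cluster_of, parent):
    """Count the levels climbed to reach the first cluster containing both.

    0 = same cluster, 1 = shared parent, 2 = shared grandparent, ...
    Assumption: if the movies are at different depths, take the larger
    of the two step counts. Returns None if no shared cluster exists.
    """
    chain_a = ancestors(cluster_of[movie_a], parent)
    chain_b = ancestors(cluster_of[movie_b], parent)
    steps_in_b = {cluster: steps for steps, cluster in enumerate(chain_b)}
    for steps_a, cluster in enumerate(chain_a):
        if cluster in steps_in_b:
            return max(steps_a, steps_in_b[cluster])
    return None


# Made-up toy hierarchy (hypothetical cluster names).
cluster_of = {
    "A": "space-action",
    "B": "space-action",
    "C": "heist-action",
    "D": "romcom",
}
parent = {
    "space-action": "action",
    "heist-action": "action",
    "action": "all",
    "romcom": "comedy",
    "comedy": "all",
}

print(degrees_of_separation("A", "B", cluster_of, parent))  # 0: same cluster
print(degrees_of_separation("A", "C", cluster_of, parent))  # 1: shared parent
print(degrees_of_separation("A", "D", cluster_of, parent))  # 2: shared grandparent
```

Spot-checking pairs of movies you know well against the number this returns is a quick way to judge whether the measure makes sense.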

About

A hands-on introductory course on data mining and information retrieval.
