Syllabus

Data Mining and Information Retrieval

General Information

Cross-listed as MSCS 570U, CIST 471U, CPSC 470U

Meeting times and location: Thursday 6:00-8:45 pm, CGPS North 210

Credits: 3

Course website: http://www.zacharski.org/classes/2009/spring/cs470u/index.php

Instructor

Ron Zacharski
Trinkle B20
raz_AT_umw.edu
Google Talk: ron.zacharski

Office hours

Monday: 11-12; 2-3:30
Tuesday: 11-12:30
Wednesday: 11-12:15
as well as before and after class

Required Materials

Data Mining: Practical Machine Learning Tools and Techniques, Second Edition (Morgan Kaufmann Series in Data Management Systems) by Ian H. Witten and Eibe Frank. 2005.

Programming Collective Intelligence: Building Smart Web 2.0 Applications by Toby Segaran. 2007.

You will also need access to a computer with the open source data mining toolkit Weka and the programming language Python. I am also hoping that enough people will bring laptops to class so that we can work on problems in teams during class.

Teams

During the first day of class, all students will be assigned to permanent teams. Throughout the course, teams will both take team tests and participate in joint activities. Team performance will be one component of your final grade.

Readiness Assessment Tests

There will be approximately six short multiple-choice Readiness Assessment Tests (RATs) given during the course. Each test will be taken individually and by team. These tests are typically closed book. Unless otherwise specified I will allow 1 page of notes.

Here's the scoop. I want people to read the assigned readings before class and I am using these tests (as well as directly calling on individuals) as an incentive to do the readings. My intention is that if you read through the material and took a few notes you should be able to do quite well on the test. If I ask a tricky question or a trivial question, your team can appeal it.

Portfolio

Throughout the course there will be approximately 12 to 15 assignments. These range in complexity from programming in Python to working with the Weka toolkit in analyzing datasets. Many of these are intended to be done in groups of 2-5 people. When work is done in a group it is intended that each person in that group understands the work and can explain it to the class.

The portfolio will take the form of a blog. The blog entries will provide a brief statement of what you have done as well as explain your results. Typically, blog entries will contain links to relevant code and output.

In-Class Participation

Part of your grade is based on in-class participation. Sometimes I may ask a question to the entire class and you can volunteer to answer. At other times I will randomly call on you directly.

Final

Grading criteria and grading weights

Grading is on a straight scale. The grade reflects your achievement regardless of the performance of other students in the class. There is no curve. 93 and above is an A+; 90-92 an A; 80-89 a B; 70-79 a C; 65-69 a D; and 64 and below an F. If everyone in the class does poorly on a particular item (test or assignment), we will identify the problem and determine a remedy.

The grades will be determined by scores in three areas: individual performance, group performance, and group participation (as determined by peer evaluation). The percentage of the grade that is based on each area will be determined by representatives of each student team during the first class. The procedure is as follows:

  1. Each team sets preliminary weights by filling in the blanks in the table below and selects a representative for their group.
  2. Team representatives will meet at the front of the room and develop a consensus about the grade weights for the entire class.

Grade Categories                     Weight within area   Total weight
1. Individual Performance                                 55%
   1.1 Individual Tests (min. 10%)   __%
   1.2 Portfolio (min. 20%)          __%
   1.3 In-Class Participation        __%
   1.4 Final                         __%
                                     100%

2. Team Performance                                       35%
   2.1 Group Tests (min. 20%)        __%
   2.2 In-Class Labs (min. 20%)      __%
                                     100%

3. Team Participation (peer eval)                         10%
                                                          100%
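
To make the arithmetic concrete, here is a minimal Python sketch of how a course score could be computed from the table. The three area weights (55%, 35%, 10%) come from the table; the within-area weights and the sample scores are hypothetical placeholders, since the class sets the real weights on the first day. The letter-grade function follows the straight scale described above, under the assumptions that a score of exactly 70 counts as a C and that scores are not rounded before the thresholds are applied.

```python
# Area weights are taken from the grade table.
AREA_WEIGHTS = {"individual": 0.55, "team": 0.35, "peer_eval": 0.10}

# HYPOTHETICAL within-area weights (the class chooses the real ones).
# They respect the stated minimums and sum to 100% within each area.
WITHIN_AREA_WEIGHTS = {
    "individual": {"tests": 0.20, "portfolio": 0.40,
                   "participation": 0.10, "final": 0.30},
    "team": {"group_tests": 0.50, "labs": 0.50},
    "peer_eval": {"peer_eval": 1.00},
}


def area_score(scores, weights):
    """Weighted average (0-100) of the item scores within one area."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights within an area must sum to 100%")
    return sum(scores[item] * weight for item, weight in weights.items())


def course_score(scores):
    """Combine the three area scores using the area weights."""
    return sum(
        AREA_WEIGHTS[area] * area_score(scores[area], WITHIN_AREA_WEIGHTS[area])
        for area in AREA_WEIGHTS
    )


def letter_grade(score):
    """Map a 0-100 score to a letter (assumes 70 is a C, no rounding)."""
    for cutoff, letter in [(93, "A+"), (90, "A"), (80, "B"), (70, "C"), (65, "D")]:
        if score >= cutoff:
            return letter
    return "F"


# HYPOTHETICAL scores for one student, each on a 0-100 scale.
example_scores = {
    "individual": {"tests": 80, "portfolio": 90,
                   "participation": 100, "final": 70},
    "team": {"group_tests": 90, "labs": 80},
    "peer_eval": {"peer_eval": 100},
}

total = course_score(example_scores)
print(f"{total:.1f} {letter_grade(total)}")  # prints "85.4 B"
```

With these made-up numbers the individual area averages 83 and the team area 85, so the course score is 0.55 × 83 + 0.35 × 85 + 0.10 × 100 = 85.4, which is a B.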

Academic Integrity

I assume you are an ethical student and a person with integrity. I expect that you will follow the university honor code (see http://rosemary.umw.edu/CSHonorCode.html). Please use common sense and ask yourself what a person with integrity would do. To help you, I would like to make three comments related to this:

Plagiarism

Plagiarism means presenting some other person's work as your own. This can mean using some other person's words without acknowledging their source, or using some other person's ideas. Copying another student's work (homework or exam) is also plagiarism. Plagiarism will result in an automatic zero for that submission.

Collusion

Collusion is unauthorized collaboration that produces work which is then presented as work completed independently by the student. Collusion includes participating in group discussions that develop solutions which everyone copies. Penalties for plagiarism and collusion include receiving a failing grade for the course.

Classroom behavior

I ask that you respect the other people in the class. I recognize that your life circumstances may require you to receive cell phone calls during class. If this is the case, please set your cell phone on vibrate and discreetly leave the class to accept calls. During tests, if you walk out of the classroom, or consult or display your cell phone, I will assume you are done with the test and collect your grading sheet.

Privacy and Confidentiality

I recognize that students deserve as much privacy as possible. I will not share your work (tests or assignments) with others without your permission.

Accommodations for students with special needs

Any student with a documented disability may receive a special accommodation to complete any requirements of this course. If you have a disability or believe you have one, you may wish to self-identify. You may do so by providing documentation to the Office of Disability Services, located in Room 203 of George Washington Hall (Phone: Voice 540-654-1266, Fax: 540-654-1163). Appropriate accommodations may then be provided for you. If you have a condition that may affect your ability to exit the premises in an emergency or that may cause an emergency during class, you are encouraged to discuss this in confidence with me and/or anyone at the Office of Disability Services. This office can also answer any questions you have about the Americans with Disabilities Act (ADA).

About

A hands-on introductory course on data mining and information retrieval.
