Announcements:
- To create a group for the final project go
to: http://kodiak.cs.cornell.edu/kddcup/cs578/register.html
- To submit predictions for the final project
go to: http://kodiak.cs.cornell.edu/kddcup/cs578/submission.html
- To view results on the 1st 100 test cases
go to: http://kodiak.cs.cornell.edu/cgi-bin/cs578/newtable.pl
- Final Project predictions are due Tue Dec 12 at 11:59PM, and
write-ups are due Thu Dec 14 at 10:00am. You
can download the data for the final project here. Good luck on the
project!
- NOTE: We're moving the prelim to Upson B17
so that there is more room for you. Prelim starts at 6pm.
- By popular demand, the 2nd prelim is Thursday night, Nov
30, and will start and end early: 6:00pm-7:45pm in Upson B17. Please don't
be late. Laptop computers or other potentially wireless devices may
not be used during the exam.
- The take-home midterm was handed out on Thu Nov 9
and is due promptly at 2:55PM Thu Nov 16. Late midterms will not be
accepted. Please be sure to sign your midterm stating that you adhered
to Cornell's Academic Integrity Policy. The text for the midterm is here
if you want to use it when preparing your answers.
- As announced in class, HW3 is now due Tue Nov 7 at
the beginning of class.
- The take-home mid-term exam probably will be handed
out Thu Nov 9.
- HW3 now available (see below). HW3 is due in 2
weeks on Thu Nov 2 at the beginning of class.
- 2006-10-13: Explanation of experimental design principles posted under Lecture Notes.
- HW2 now available (see below). HW2 is due in 3
weeks on Tue Oct 17 at the beginning of class.
- 2006-09-13: Art's Monday office hours have changed. Hopefully this is the last change...
- 2006-09-06: Files from UNIX introduction session posted under Lecture Notes.
- 2006-09-06: Tips for installing IND / unixstat on cygwin posted under Assignments.
- HW1 now available (see below). HW1 is due in 2 1/2
weeks on Tue Sep 19 at the beginning of class.
Welcome!!!
Time and Place:
2:55 PM to 4:10 PM
Tuesdays & Thursdays
205 Thurston Hall
Role | Name | Email (@cs.cornell.edu) | Office Hours | Office
Instructor | Rich Caruana | caruana | Tue 4:30-5:00; Wed 10:30-11:30 | Upson 4157
Teaching Assistant | Art Munson | art | Mon 10:00-11:00am; Fri 1:30-2:30 | Upson 5156
Teaching Assistant | Yisong Yue | yyue | Tue 4:30-5:30; Wed 11:00-12:00 | Upson 4154
Teaching Assistant | Alex Niculescu-Mizil | alexn | Thu 10:30-11:30 | Upson 5154
Administrative Assistant | Melissa Totman | mtotman | M-F 9:00-4:00 | Upson 4147
Course Description:
This implementation-oriented course presents a broad introduction to
current algorithms and approaches in machine learning, knowledge
discovery, and data mining and their application to real-world learning
and decision-making tasks. The course will also cover empirical methods
for comparing learning algorithms, for understanding and explaining
their differences, for exploring the conditions under which each is
most appropriate, and for figuring out how to get the best possible
performance out of them on real problems.
Textbooks:
Machine Learning by Tom Mitchell
Optional references:
The Elements of Statistical Learning: Data Mining, Inference, and Prediction by T. Hastie, R. Tibshirani, and J. Friedman
Pattern Classification, 2nd edition, by Richard Duda, Peter Hart, & David Stork
Pattern Recognition and Machine Learning by Christopher Bishop
Grading policies:
- 20% Midterm (take home)
- 20% Final (open book in class)
- 30% Assignments (individual)
- 30% Final Project (group project comparing various learning methods)
- Bonus points for class participation
- Homeworks, the take-home mid-term, and the final exam must be your own work. For
homework, it is OK to talk with other students about the assignment,
ask each other questions, and in general learn from each other. But the
homework you hand in must be your own work. If other students gave you
significant help with your homework you should briefly acknowledge them
in what you hand in.
Academic integrity policy
Lecture Notes:
- Intro Lecture (CS578.06_INTRO_lecture.4up.pdf)
- Decision Tree Lecture (and t-test mini-lecture) (CS578.06_DT_lecture.ppt.pdf)
- UNIX Introduction files:
- Performance Measures Lecture (performance_measures.pdf)
- Experimental Design Information
- KNN Lecture (CS578_knn_lecture.pdf)
- Feature Selection / Missing Value Lecture (CS578_featsel_missing_lecture.pdf)
- Bagging, Boosting, Random Forests, and Ensemble Learning (CS578.bagging.boosting.lecture.pdf)
- SVM lecture (long notes) (short notes you are responsible for)
- Clustering lecture (responsible up to slide 34 "Mean Point
Happiness" for Prelim 2) (cs578_clustering_lecture.pdf)
Assignments:
Homework 3
Download HW3 here: 578.hw3.2006.tar.gz
Homework 2
Perf code for calculating ROC performances: http://kodiak.cs.cornell.edu/kddcup/software.html
Download HW2 here: cs578.hw2.tar.gz
Homework 1
Download HW1 here: cs578.hw1.tar
IND decision tree code for MacOS: ind.macos10.3.tar
UNIXSTAT utility code for MacOS: unixstat.macos10.3.tar
Tips for installing IND / unixstat on Cygwin