Information
Where to get all official information
We plan to use lecture and this homepage as the main distribution points of information, and crucial time-sensitive announcements will be made either in lecture or via email. (Hence, we won't use CMS for announcements, and you don't absolutely need to look at Piazza, although we recommend signing up for and monitoring the course Piazza page, whose link is given above.)
Course time and location
Tuesdays and Thursdays, 10:10-11:25am, Hollister B14
Final project ("competition") due date
The registrar-determined due date is Monday May 11th at 4:30 pm.
Getting help or talking to someone
Feel free to ask questions on Piazza, where multiple people can participate and benefit! But we also have a great course staff to assist you. Office hours are listed below. (None held during official Cornell breaks unless otherwise noted.)
If you have a question for a professor, then it's best to send a single email with both of us in the To: line (LJL2 and KS999), so that we can keep track of who's asked and answered what (and how).
Name | Contact info (**=@cs.cornell.edu; *=@cornell.edu) | Office Hours | Languages |
---|---|---|---|
Prof. Lillian Lee | llee**, 607-255-8119, 419 Gates Hall | Appointments by easy auto-sign up here | Just starting out with numpy and matplotlib in Python; just starting out in R; don't laugh, but my favorite languages and tools are awk and gnuplot. I hope I've demonstrated in lecture that it's quite easy to pick up what you need in other languages (sometimes in as little as one (long) night, but don't wait until the last minute if at all possible!), as long as you understand the core concepts of the class and carefully read the documentation. |
Prof. Karthik Sridharan | sridharan**, 424 Gates Hall | Tuesdays 2-3:30pm | Matlab |
Administrative Assistant Megan Gatch | mlg34* | | |
TA Mevlana Gemici | mevlana** | Sundays 2-4pm, Upson 328B, bay A | Python and Matlab, including numpy, scipy, scikit-learn, matplotlib and other commonly used ML and visualization libraries. I regularly use Theano for GPU computation as well. |
TA Jack Hessel | jhessel** | Mondays 4-6pm, Upson 360 bay D, EXCEPT Monday May 4th, when they will be 7-9pm in Gates 122 (a bigger room!) | Python, numpy, scipy, R (though it's not my favorite) and I also think Theano is pretty neat. |
TA Vikram Rao Sudarshan | vikram** | Fridays 10:30-11:30am, Gates G17, EXCEPT Friday May 8th, when they are moved to Jack Hessel's office hours on Monday May 4th | Familiar with Python (numpy, scipy and a bit of matplotlib), R (ggplot2 is easy to use and draws beautiful plots) and the GNU toolchain (awk, sed etc.). Have used Matlab a bit. |
TA Gaurav Aggarwal | ga286* | Wednesdays 12-1pm, Gates G19 and 6:30-7:30pm, Gates G13, EXCEPT Wednesday May 6th, when they are moved to Jack Hessel's office hours on Monday May 4th | Python, C++ |
TA Youenn Paris | yp323* | Fridays 2:30-3:30, Gates G13; Saturdays 4-5pm, Upson 328B, Bay A | Mostly Python, and also Java |
TA Xihao Zhang | xz458* | Thursdays 7-8pm, Gates G13 | MATLAB and C++ |
Text
There is no required textbook, given the coverage of the class. We expect to post recommended readings when appropriate.
Coursework
Currently, we are planning roughly three to five assignments (some combination of pencil-and-paper and programming, in whatever language you wish to use) and two major programming projects (which we'll be running as “competitions” for fun, but your grades will be determined by proficiency, not by how you rank!), including a final project. No exams are planned.
Related courses offered in Spring 2015
- CS 4300 Language and Information (note new syllabus)
- CS 4740/5740 Natural Language Processing
- CS 4850: Mathematical Foundations for the Information Age (very focused on data analysis, and thus quite appropriate)
- ORIE 4740 Statistical data mining I (syllabus for a previous running here)
- STSCI 4780 Bayesian Data Analysis: Principles and Practice.
Here is a list of other machine learning courses at Cornell.
Academic Integrity
We distinguish between “merely” violating the rules for a given assignment and violating the principles of academic integrity. Academic and scientific integrity compels one to properly attribute to others any work, ideas, or phrasing that one did not create oneself. To do otherwise is fraud.
We emphasize certain points here. The way to avoid violating academic integrity is to always document any portions of work you submit that are due to or influenced by other sources, even if those sources weren't permitted by the rules. The worst-case outcome for merely breaking the rules is a grade penalty; the worst-case outcome in the fraud scenario is academic-integrity hearing procedures (on top of grade penalties).
A general rule of thumb is to acknowledge the work and contributions and ideas and words and wordings of others. Do not copy or slightly reword portions of papers, Wikipedia articles, textbooks, other students' work, something you heard from a talk or a conversation, or anything else, really, without acknowledging your sources. See http://www.cs.cornell.edu/courses/cs6742/2011sp/handouts/ack-others.pdf
(We make an exception for sources that can be taken for granted in the instructional setting, namely, the course materials. To minimize documentation effort, we also do not expect you to credit the course staff for ideas you get from them, although it's nice to do so anyway.)
For more information on Cornell's policies, see http://www.theuniversityfaculty.cornell.edu/AcadInteg/
I take violations of the Code of Academic Integrity and the principles behind it very seriously, and have assigned failing grades for such violations in the past.