General Information | Course Description | Textbooks | Grading | Ed Discussions | Homework | Course Conduct | Accommodations
General Information
Instructor: Bobby Kleinberg, Gates Hall 317, emailOffice Hours: Mondays 2:30–3:30pm; Wednesdays 10–11am
Zoom link — https://cornell.zoom.us/j/99763995306?pwd=N2F0Yksyd1htRFVhT1ZSK2tiYnRNQT09).
Lectures: MWF 9:05am–9:55amZoom link — https://cornell.zoom.us/j/92674518945?pwd=QWl3VlZCblhpL1J0RnJkNDhldU9YZz09.
Instructions for phone access here.
Lectures will be given live during this time and you are encouraged to attend. Lectures will also be recorded and available for asynchronous viewing. There will be links to the video+audio and audio-only for each lecture on the schedule page, along with any notes or handouts associated with that lecture. The raw recordings will be available soon after the lecture, to be replaced by edited and closed-captioned versions as soon as they can be processed.
Zoom etiquette If you attend the live lecture, please enable your camera. You may ask questions at any time during the lecture. Type your question in the chat window. A staff member will be monitoring the chat window and will either respond with an answer in the chat or will interrupt the instructor at an appropriate moment.
External links
- Ed Discussions: https://edstem.org/us/courses/4487/discussion/
- Homework Submission: CMS (homework submission)
Course Description
Probability and high-dimensional geometry have become valuable tools in the analysis of algorithms. This course will explore the mathematics that lies behind designing and analyzing algorithms for massive, high-dimensional, and often random, data. Topics to be covered include dimensionality reduction for high-dimensional data; random walks and Markov Chain Monte Carlo algorithms; randomized algorithms and data structures for hashing, data sketching, and data stream processing; random graphs; the probabilistic method; and algorithms for detecting sparse or low-rank structures in data.
Textbooks
The textbook for the course is Foundations of Data Science
by Avrim Blum, John Hopcroft, and Ravi Kannan.
Although this book was designed for this course, there will be topics covered in lecture that are not in the text and there will be topics in the text that are not covered in lecture. You are responsible for topics covered in lecture and for any assigned reading in the text.
This course is enrolled in the Instant Access Program, providing digital textbooks or publisher courseware directly in Canvas. You have access to your digital course materials directly in Canvas once the course opens. You have a trial period to decide whether to continue accessing the digital course materials or opt-out.
Instant Access is an opt-out program for required course materials. You may have the option to opt-in for optional course materials. You can access required course materials without charge until the opt-out deadline of February 14, 2022. After this date, you will be bursar billed for continued use of the eBook unless you have opted-out.
To Opt-Out of required course materials: If you do not wish to have access to the eBook or courseware through Instant Access, you may opt-out by clicking the Instant Access - VitalSource tool in the Canvas course's navigation panel, then OPT-OUT. If you opt-out, you will lose access to the digital course materials and it may be your responsibility to acquire the appropriate course materials on your own to satisfy course requirements.
For Instant Access assistance, please contact instantaccess@cornell.edu.
The following (optional) books are also useful references.
- M. Mitzenmacher and E. Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis.
- R. Motwani and P. Raghavan. Randomized Algorithm.
- J. Kleinberg and É. Tardos. Algorithm Design.
Prerequisites
The prerequisites for the course are CS 2800 and 2110, and a linear algebra course at the level of MATH 2210/2940 or above. CS 4820 is highly recommended but not required. A prior course in probability (such as ENGRD 2700, MATH 4710, ECE 3100, BTRY 3080, ECON 3130) is also recommended but not required.
Grading
For CS 4850, your grade will be based on seven homework assignments, four in-class quizzes, one final exam, and completion of a course evaluation. For CS 5850, your grade will be based on all of the above, plus a final project.
Each of these components will be given a weight in the following ranges:Component | 4850 range | 5850 range |
---|---|---|
Homework | 40% to 60% | 35% to 50% |
Quizzes | 15% to 40% | 15% to 35% |
Final Exam | 0% to 35% | 0% to 30% |
Final Project | 0% | 15% to 30% |
Completion of course evaluation: | 1% | 1% |
Ed Discussions
We will be using Ed Discussions as an online discussion forum. Ed Discussions allows for open discussions of all course-related questions. You are encouraged to post any questions you might have about the course material. The course staff monitor Ed Discussions closely and you will usually get a quick response. If you know the answer to a question, you are encouraged to post it. Posting questions or answers that are endorsed by TAs or instructors can improve your participation grade.
By default, your posts are visible to the course staff and other students, and you should prefer this mode so that others can benefit from your question and the answer. However, you can post privately so that only the course staff can see your question, and you should do so if your post might reveal information about a solution to a homework problem. You can also post anonymously if you wish. If you post privately, we reserve the right to make your question public if we think the class will benefit.
Ed Discussions is the most effective way to communicate with course staff. Please avoid email if Ed Discussions will do. Broadcast messages from the course staff to students will be sent using Ed Discussions and all course announcements will be posted there and pinned, so check the pinned announcements often!
Homework
Homework is an important part of the course. We will have homework assignments every week or two. All homework assignments will be posted on CMS. Most homework assignments will be due on Wednesday at 11:59pm.
Typesetting
We will require problem sets to be typeset and submitted as a PDF. This requirement is for everyone's benefit. In general, we recommend that you first develop your solutions in draft form, and then write or type your solution in a concise way. Typesetting not only makes the last step essential (instead of handing in solution in draft form), it also makes it much easier for you to edit and improve your writeup, as well as easier for your TAs to read your proofs. It is up to you which tool you use; though we recommend LaTeX, tools like the Equation Editor in Microsoft Word can be surprisingly effective as an alternative. See typesetting resources for a list of typesetting software and references.
For some proofs or writeups, it may be helpful to use a figure to explain your thinking more concisely. This is encouraged! Again, it is up to you how you want to include that in your writeup, whether it is a picture of a drawing in your notebook that you took with your phone or something you made digitally, as long as the figure was produced by you personally and is clear enough to see, it's a great idea to include it.
Collaboration
In the real world of algorithms research, collaboration and conversation is an important part of how ideas get generated. So too in this course; we encourage you to work in groups of up to three peers on the homework. For homework questions that require you to write a piece of code, every member of the group can submit the same code. However, for homework questions that require a written solution to a problem, your solution must be written up completely on your own. Except for problems that involve writing code, you are not allowed to share digital or written notes or images of your work in any form with each other. Just like in research, your work must also include acknowledgements of all students with whom you collaborated. Both the physical or digital distribution of information about solutions and the failure to acknowledge collaborators are serious violations of academic integrity.
Admissible Resources
For the homework, it is not admissible to use resources beyond course material and student discussions. In particular, you may not use Wikipedia, or search the Web, or look at any textbook, other than the ones assigned/recommended in the course. Using such additional resources is a violation of academic integrity. If you feel the resources available to you are insufficient, talk to course staff or ask questions on Ed Discussions.
Advice for Success
Algorithms assignments can often require creative insights and complex proofs beyond what previous courses have required. Here are a few tips for succeeding in your writeups:- Start your assignments early. Even if you aren't writing anything down yet, looking over the problem set well in advance of the due date can ensure you have enough time to brainstorm possible solutions and to clear up confusion about how to interpret a problem. Creativity doesn't work well on a deadline.
- Work with your group-mates to come up with ideas. As previously stated, while you cannot share physical or digital solutions of any kind to these problems (except those that involve writing a piece of code), we actively encourage you to join a group and work through the problems with your homework partners. At the beginning of the semester, the TA's will facilitate matching students with homework partners, for those who wish to get help in finding partners.
- Visit office hours. One of the best ways to get help with the homework is to bring questions to office hours. Homework help in office hours is most effective when the student has already made a real effort to solve the problem being discussed and is prepared to describe what they tried and where they're getting stuck.
- Ask questions in class and on Ed Discussions. The material in this class moves quickly and is often cumulative. If you find yourself scratching your head after a lecture, even after consulting the textbook and course notes, you're certainly not alone, and it's better to seek help then than to wait until you are more confused.
Course Material Copyright
Course materials posted on this website, Ed Discussions, or CMS, are intellectual property belonging to the author. Students are not permitted to post course materials on the web, share them on any online platform, buy, sell, or distribute them outside of Cornell without the express permission of the instructor. Such unauthorized behavior constitutes academic misconduct.
Course Conduct
We understand that our members represent a rich variety of backgrounds and perspectives. Cornell University is committed to providing an atmosphere for learning that respects diversity. We expect students to communicate in a respectful manner with the instructors, course staff, and fellow students, in a way the honors the unique experiences, values, and beliefs represented by different members of our community.
Academic Integrity
Any violation of academic integrity will be severely penalized. You are allowed to collaborate on the homework to the extent of formulating ideas as a group. However, you are expected to write up (and understand) the homework on your own, and you should acknowledge the names of the students with whom you collaborated.
From Cornell's code of academic integrity:
Absolute integrity is expected of every Cornell student in all academic undertakings. Integrity entails a firm adherence to a set of values, and the values most essential to an academic community are grounded on the concept of honesty with respect to the intellectual efforts of oneself and others. Academic integrity is expected not only in formal coursework situations, but in all University relationships and interactions connected to the educational process, including the use of University resources. […]
A Cornell student's submission of work for academic credit indicates that the work is the student's own. All outside assistance should be acknowledged, and the student's academic position truthfully reported at all times. In addition, Cornell students have a right to expect academic integrity from each of their peers.