This is the archived Fall 2018 site. For 2019, visit http://www.cs.cornell.edu/Courses/cs6465/2019fa
CS6465: Emerging Cloud Technologies and Systems Challenges
Hollister Hall Room 320, Tuesday/Thursday 1:25-2:40
CS6465 is a PhD-level class in systems that tracks emerging cloud computing technology, opportunities and challenges. It is unusual among CS graduate classes: the course is aimed at a small group of students, uses a discussion-oriented style, and the main "topic" is actually an unsolved problem in computer systems. The intent is to think about how one might reduce that open problem to subproblems, learn about prior work on those, and extract exciting research questions. The PhD focus centers on that last agenda element.
In this second offering, we plan to focus on issues raised by moving machine learning to the edge of the cloud. In this phrasing, edge computing still occurs within the data center, but for reasons of rapid response, it involves smart functionality close to the client, under time pressure. So you would think of an AI or ML algorithm written in as standard a way as possible (perhaps TensorFlow, or Spark/Databricks using Hadoop, etc.). But whereas normally that sort of code runs deep in the cloud, many minutes or hours after the data is acquired, the goal now is to keep the code unchanged (or minimally changed) and be able to run it on the stream of data as it flows into the system, milliseconds after it was acquired. We might also push aspects of machine-learned behavior right out to the sensors.
This idea is a big new thing in cloud settings, where it is called "edge" computing or "intelligent" real-time behavior. But today edge computing often requires totally different programming styles than back-end computing. Our angle in CS6465 is really to try to understand why this is so: could we more or less "migrate" code from the back-end to the edge? What edge functionality would this require? Or is there some inherent reason that the techniques used in the back-end platforms simply can't be used in the edge, even with some sort of smart tool trying to help?
The goal of this focus on an intelligent edge is, of course, to motivate research on the topic. Ken is a systems person, and his group is thinking about how to build new infrastructure tools for the intelligent edge. Those tools could be the basis of great research papers and might have real impact. But others work in this area too, and we'll want to read papers they have written.
Gaps can arise at other layers too. For example, TensorFlow is hugely popular at Google in the AI/ML areas, and Spark/Databricks plus Hadoop (plus Kafka, Hive, HBase, ZooKeeper, not to mention MATLAB, SciPy, GraphLab, Pregel, and a gazillion other tools) are insanely widely used. So if we assume that someone is a wizard at solving AI/ML problems using this standard infrastructure, but now wants parts of their code to work on an intelligent edge, what exactly would be needed to make that possible? Perhaps we would need some new knowledge representation, or at least some new way of storing knowledge, indexing it, and searching for it. This would then point to opportunities for research at the AI/ML level as well as opportunities in databases or systems to support those new models of computing.
CS6465 runs as a mix of discussions and short mini-lectures (mostly by the professor), with some small take-home topics that might require a little bit of out-of-class research, thinking and writing. There won't be a required project or any exams, and the amount of written material required will be small, perhaps a few pages to hand in per week. Grading will mostly be based on in-class participation.
CS6465 can satisfy the same CS graduate requirements (in the systems area) as any other CS6xxx course we offer. Pick the course closest to your interests, no matter what you may have heard. CS6410 has no special status.
Schedule and Readings/Slides