Schedule

Below is the tentative course schedule; it is subject to change at any time.

Date Topic Presenter
  SE Basics  
26-Aug-24 Intro and Course Details Saikat
28-Aug-24 Program Analysis 1 Saikat
2-Sep-24 Labor Day: No class
4-Sep-24 Program Analysis 2 Saikat
9-Sep-24 Software Testing Saikat
11-Sep-24 Debugging Saikat
  LLM Basics  
16-Sep-24 ML Models: Intro Saikat
18-Sep-24 LLMs for Code (CodeBERT/T5/CodeLlama) Claas
  Primary: CodeBERT: A Pre-Trained Model for Programming and Natural Languages  
  Secondary: CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation  
  GraphCodeBERT: Pre-training Code Representations with Data Flow  
23-Sep-24 Fine-Tuning Yuhao
  LoRA: Low-Rank Adaptation of Large Language Models  
  QLoRA: Efficient Finetuning of Quantized LLMs  
25-Sep-24 Evaluating LLMs David
  Primary: Evaluating Large Language Models Trained on Code  
  Secondary: Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation  
  ClassEval: A Manually-Crafted Benchmark for Evaluating LLMs on Class-level Code Generation  
  CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution  
30-Sep-24 Proposal Presentations  
  ML4SE  
2-Oct-24 Fuzzing with LLMs Elaine
  Primary: CodaMosa: Escaping coverage plateaus in test generation with pre-trained large language models
  Secondary: Automated Unit Test Improvement using Large Language Models at Meta  
  FuzzGuard: Filtering out unreachable inputs in directed grey-box fuzzing through deep learning
  Can Large Language Models Write Good Property-Based Tests?  
7-Oct-24 Program Analysis with LLMs Kassandra
  Primary: Enhancing Static Analysis for Practical Bug Detection: An LLM-Integrated Approach  
  Secondary: Large Language Models for Code Analysis: Do LLMs Really Do Their Job?  
9-Oct-24 Program Repair with LLMs and Agents Arnav
  Primary: AutoCodeRover: Autonomous Program Improvement  
  Secondary: AGENTLESS: Demystifying LLM-based Software Engineering Agents  
  SWE-bench: Can language models resolve real-world GitHub issues?
14-Oct-24 Fall break: No class  
16-Oct-24 Verification Aditya
  Primary: Baldur: Whole-proof generation and repair with large language models  
  Secondary: Lemur: Integrating large language models in automated program verification  
21-Oct-24 Security Ethan
  Primary: Large language models for code: Security hardening and adversarial testing  
  Secondary: NoFunEval: Funny How Code LMs Falter on Requirements Beyond Functional Correctness  
23-Oct-24 Guest Lecture: Prof. Pengyu Nie (University of Waterloo)  
  Primary: Learning Deep Semantics for Test Completion  
  Secondary: Generating Exceptional Behavior Tests with Reasoning Augmented Large Language Models  
  SE4ML  
28-Oct-24 Test Oracle Generation Pengyue
  Primary: TOGA: A neural method for test oracle generation
  Secondary: On learning meaningful assert statements for unit test cases  
30-Oct-24 Code Generation Kevin Cui
  Primary: Monitor-guided decoding of code LMs with static analysis of repository context  
  SynCode: LLM Generation with Grammar Augmentation  
  Secondary: CodePlan: Repository-level coding using LLMs and planning
4-Nov-24 Detecting Numerical Errors Kevin Guan
  Primary: Reliability Assurance for Deep Neural Network Architectures Against Numerical Defects  
  Secondary: Discovering Discrepancies in Numerical Libraries  
  Exposing numerical bugs in deep learning via gradient back-propagation  
6-Nov-24 Testing DL Systems Ibrahim
  DeepXplore: Automated whitebox testing of deep learning systems
  DLFuzz: Differential fuzzing testing of deep learning systems
11-Nov-24 Guest Lecture: Chenyuan Yang (University of Illinois Urbana-Champaign)  
  Large Language Models are Zero-Shot Fuzzers: Fuzzing Deep-Learning Libraries via Large Language Models  
  WhiteFox: White-box Compiler Fuzzing Empowered by Large Language Models  
13-Nov-24 Testing DNNs Shinhae
  Primary: DeepHunter: A coverage-guided fuzz testing framework for deep neural networks
  Secondary: TensorFuzz: Debugging neural networks with coverage-guided fuzzing
  Concolic testing for deep neural networks  
  Symbolic execution for attribution and attack synthesis in neural networks  
18-Nov-24 DNN model versioning and management Ika
  Git-Theta: A Git extension for collaborative development of machine learning models
  MGit: A Model Versioning and Management System  
20-Nov-24 MLOps Mudit
  Primary: Hidden technical debt in machine learning systems  
  Secondary: “We Have No Idea How Models will Behave in Production until Production”: How Engineers Operationalize Machine Learning  
  An Analysis of MLOps Architectures: A Systematic Mapping Study  
25-Nov-24 Testing DL Libraries Charles
  Primary: Deep Learning Library Testing via Effective Model Generation  
  Secondary: NeuRI: Diversifying DNN Generation via Inductive Rule Inference  
  DocTer: Documentation-guided fuzzing for testing deep learning API functions
  DLLens: Testing Deep Learning Libraries via LLM-aided Synthesis  
27-Nov-24 Thanksgiving break: No class
2-Dec-24 Project Presentations  
4-Dec-24 Project Presentations  
9-Dec-24 Project Presentations