| SE Basics | |
26-Aug-24 | Intro and Course Details | Saikat |
28-Aug-24 | Program Analysis 1 | Saikat |
2-Sep-24 | Labor Day: No class | |
4-Sep-24 | Program Analysis 2 | Saikat |
9-Sep-24 | Software Testing | Saikat |
11-Sep-24 | Debugging | Saikat |
| LLM Basics | |
16-Sep-24 | ML Models: Intro | Saikat |
18-Sep-24 | LLMs for Code (CodeBERT/CodeT5/CodeLlama) | Claas |
| Primary: CodeBERT: A Pre-Trained Model for Programming and Natural Languages | |
| Secondary: CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation | |
| GraphCodeBERT: Pre-training Code Representations with Data Flow | |
23-Sep-24 | Fine-Tuning | Yuhao |
| LoRA: Low-Rank Adaptation of Large Language Models | |
| QLoRA: Efficient Finetuning of Quantized LLMs | |
25-Sep-24 | Evaluating LLMs | David |
| Primary: Evaluating Large Language Models Trained on Code | |
| Secondary: Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation | |
| ClassEval: A Manually-Crafted Benchmark for Evaluating LLMs on Class-level Code Generation | |
| CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution | |
30-Sep-24 | Proposal Presentations | |
| ML4SE | |
2-Oct-24 | Fuzzing with LLMs | Elaine |
| Primary: CodaMosa: Escaping coverage plateaus in test generation with pre-trained large language models | |
| Secondary: Automated Unit Test Improvement using Large Language Models at Meta | |
| FuzzGuard: Filtering out unreachable inputs in directed grey-box fuzzing through deep learning | |
| Can Large Language Models Write Good Property-Based Tests? | |
7-Oct-24 | Program Analysis with LLMs | Kassandra |
| Primary: Enhancing Static Analysis for Practical Bug Detection: An LLM-Integrated Approach | |
| Secondary: Large Language Models for Code Analysis: Do LLMs Really Do Their Job? | |
9-Oct-24 | Program Repair with LLMs and Agents | Arnav |
| Primary: AutoCodeRover: Autonomous Program Improvement | |
| Secondary: Agentless: Demystifying LLM-based Software Engineering Agents | |
| SWE-bench: Can language models resolve real-world GitHub issues? | |
14-Oct-24 | Fall break: No class | |
16-Oct-24 | Verification | Aditya |
| Primary: Baldur: Whole-proof generation and repair with large language models | |
| Secondary: Lemur: Integrating large language models in automated program verification | |
21-Oct-24 | Security | Ethan |
| Primary: Large language models for code: Security hardening and adversarial testing | |
| Secondary: NoFunEval: Funny How Code LMs Falter on Requirements Beyond Functional Correctness | |
23-Oct-24 | Guest Lecture: Prof. Pengyu Nie (University of Waterloo) | |
| Primary: Learning Deep Semantics for Test Completion | |
| Secondary: Generating Exceptional Behavior Tests with Reasoning Augmented Large Language Models | |
| SE4ML | |
28-Oct-24 | Test Oracle Generation | Pengyue |
| Primary: TOGA: A neural method for test oracle generation | |
| Secondary: On learning meaningful assert statements for unit test cases | |
30-Oct-24 | Code Generation | Kevin Cui |
| Primary: Monitor-guided decoding of code LMs with static analysis of repository context | |
| SynCode: LLM Generation with Grammar Augmentation | |
| Secondary: CodePlan: Repository-level coding using LLMs and planning | |
4-Nov-24 | Detecting Numerical Errors | Kevin Guan |
| Primary: Reliability Assurance for Deep Neural Network Architectures Against Numerical Defects | |
| Secondary: Discovering Discrepancies in Numerical Libraries | |
| Exposing numerical bugs in deep learning via gradient back-propagation | |
6-Nov-24 | Testing DL Systems | Ibrahim |
| DeepXplore: Automated whitebox testing of deep learning systems | |
| DLFuzz: Differential fuzzing testing of deep learning systems | |
11-Nov-24 | Guest Lecture: Chenyuan Yang (University of Illinois Urbana-Champaign) | |
| Large Language Models are Zero-Shot Fuzzers: Fuzzing Deep-Learning Libraries via Large Language Models | |
| WhiteFox: White-box Compiler Fuzzing Empowered by Large Language Models | |
13-Nov-24 | Testing DNNs | Shinhae |
| Primary: DeepHunter: a coverage-guided fuzz testing framework for deep neural networks | |
| Secondary: TensorFuzz: Debugging neural networks with coverage-guided fuzzing | |
| Concolic testing for deep neural networks | |
| Symbolic execution for attribution and attack synthesis in neural networks | |
18-Nov-24 | DNN model versioning and management | Ika |
| Git-theta: A git extension for collaborative development of machine learning models | |
| MGit: A Model Versioning and Management System | |
20-Nov-24 | MLOps | Mudit |
| Primary: Hidden technical debt in machine learning systems | |
| Secondary: “We Have No Idea How Models will Behave in Production until Production”: How Engineers Operationalize Machine Learning | |
| An Analysis of MLOps Architectures: A Systematic Mapping Study | |
25-Nov-24 | Testing DL Libraries | Charles |
| Primary: Deep Learning Library Testing via Effective Model Generation | |
| Secondary: NeuRI: Diversifying DNN Generation via Inductive Rule Inference | |
| DocTer: Documentation-guided fuzzing for testing deep learning API functions | |
| DLLens: Testing Deep Learning Libraries via LLM-aided Synthesis | |
27-Nov-24 | Thanksgiving break: No class | |
2-Dec-24 | Project Presentations | |
4-Dec-24 | Project Presentations | |
9-Dec-24 | Project Presentations | |