Saikat Dutta
438 Gates Hall
Department of Computer Science
Cornell University
Ithaca, NY 14850
USA
I am an Assistant Professor in the Department of Computer Science at Cornell University. My research interests are at the intersection of Software Engineering and Machine Learning. I am a member of the growing Software Engineering Group at Cornell.

I received my PhD in Computer Science from the University of Illinois Urbana-Champaign in Summer 2023, advised by Prof. Sasa Misailovic. Before joining Cornell, I spent a year as a Postdoctoral Researcher at the University of Pennsylvania, working with Prof. Mayur Naik. You can find my CV here.

Prospective students: I am looking for skilled and motivated undergraduates, PhD students, and postdocs to join my group. If you are interested in working with me, please drop me an email. If you are a prospective PhD student, apply to the Cornell CS PhD program.

Research Interests
My research interests are at the intersection of Software Engineering and Machine Learning. I am particularly interested in 1) developing novel techniques and tools to improve the reliability of Machine Learning-based systems, and 2) leveraging Machine Learning to solve challenging tasks in Software Engineering.

My research focuses on the following themes:
Apart from these topics, I have also developed novel inference algorithms and robustness analyses for probabilistic programming [UAI 2023] [ATVA 2021].
Students
Current PhD students:
Previous undergraduates:
  • Rohan Kalluraya (BS, Cornell; Summer 2024)
  • Nathan Chu (BS, Cornell; Spring 2024)
News
  • New:
    Dec 2024: Our paper Understanding the Effectiveness of Large Language Models in Detecting Security Vulnerabilities has been accepted to ICST 2025!
  • New:
    Dec 2024: Our paper CoCoNUT: Structural Code Understanding does not fall out of a tree has been accepted to LLM4Code 2025!
  • New:
    August 2024: Our paper GlueTest: Testing Code Translation via Language Interoperability has been accepted to ICSME NIER 2024!
  • New:
    August 2024: Started as an Assistant Professor in the CS Department at Cornell University!
  • Passed PhD Final Defense! Find my dissertation here.
  • Our paper ASTRA: Understanding the Practical Impact of Robustness for Probabilistic Programs has been accepted to UAI 2023!
  • Our paper Balancing Effectiveness and Flakiness of Non-Deterministic Machine Learning Tests has been accepted to ICSE 2023!
  • Our paper To Seed or Not to Seed? An Empirical Analysis of Usage of Seeds for Testing in Machine Learning Projects has been accepted to ICST 2022!
  • Our paper InspectJS: Leveraging Code Similarity and User-Feedback for Effective Taint Specification Inference for JavaScript has been accepted to ICSE-SEIP 2022!
  • Our paper SixthSense: Debugging Convergence Problems in Probabilistic Programs via Program Representation Learning has been accepted to FASE 2022!
  • Our paper AQUA: Automated Quantized Inference for Probabilistic Programs has been accepted to ATVA 2021!
  • Our paper TERA: Optimizing Stochastic Regression Tests in Machine Learning Projects has been accepted to ISSTA 2021!
  • Our paper FLEX: Fixing Flaky Tests in Machine-Learning Projects by Updating Assertion Bounds has been accepted to ESEC/FSE 2021!
  • I will be interning at Amazon Web Services (AWS) with the Automated Reasoning Group (ARG) for Summer 2021!
  • Our paper on Detecting Flaky Tests in Probabilistic and Machine Learning Applications was accepted to ISSTA 2020!
  • I will be interning at Microsoft Research, Redmond with the RISE group for Summer 2020!
    Looking forward to it!
  • Awarded Facebook PhD Fellowship 2020
    Thanks Facebook!
  • Our paper, Storm: Program Reduction for Testing and Debugging Probabilistic Programming Systems, has been accepted to FSE 2019
  • Selected for 3M Foundation Fellowship 2018-19
  • Our paper on ProbFuzz, "Testing Probabilistic Programming Systems", has been accepted to FSE 2018
  • Attended PLDI 2018 in Philadelphia, USA (20-22 June, 2018)
  • Our recent work on Automated Sensitivity Analysis was published in IEEE TSE Volume 43, Issue 12
  • Attended Midwest Programming Language Summit 2017 in Bloomington, Indiana
  • Attended Automated Software Engineering Conference (ASE 2017) at UIUC
Selected Publications
Balancing Effectiveness and Flakiness of Non-Deterministic Machine Learning Tests
45th International Conference on Software Engineering (ICSE 2023)
Steven Xia, Saikat Dutta, Sasa Misailovic, Darko Marinov, and Lingming Zhang
FLEX: Fixing Flaky Tests in Machine-Learning Projects by Updating Assertion Bounds
29th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2021)
Saikat Dutta, August Shi, and Sasa Misailovic
All Publications
Preprints:
Dolphin: A Programmable Framework for Scalable Neurosymbolic Learning
Aaditya Naik, Jason Liu, Claire Wang, Saikat Dutta, Mayur Naik, Eric Wong
October 2024
LLM-Assisted Static Analysis for Detecting Security Vulnerabilities
Ziyang Li, Saikat Dutta, Mayur Naik
May 2024
Peer-Reviewed:
2025
CoCoNUT: Structural Code Understanding does not fall out of a tree
2nd International Workshop on Large Language Models for Code (LLM4Code 2025)
Claas Beger, Saikat Dutta
Understanding the Effectiveness of Large Language Models in Detecting Security Vulnerabilities
18th IEEE International Conference on Software Testing, Verification and Validation (ICST 2025)
Avishree Khare, Saikat Dutta, Ziyang Li, Alaia Solko-Breslin, Rajeev Alur, Mayur Naik
2024
GlueTest: Testing Code Translation via Language Interoperability
40th International Conference on Software Maintenance and Evolution: New Ideas and Emerging Results (ICSME NIER 2024)
Flagstaff, AZ, USA. Acceptance Rate 29% (10/35 papers)
Muhammad Salman Abid, Mrigank Pawagi, Sugam Adhikari, Xuyan Cheng, Ryed Badr, Md Wahiduzzaman, Vedant Rathi, Ronghui Qi, Choiyin Li, Lu-Chi Liu, Rohit Sai Naidu, Licheng Lin, Que Liu, Asif Zubayer Palak, Mehzabin Haque, Xinyu Chen, Darko Marinov, and Saikat Dutta
Debugging Convergence Problems in Probabilistic Programs via Program Representation Learning with SixthSense
The International Journal on Software Tools for Technology Transfer (STTT 2024)
Zixin Huang, Saikat Dutta, and Sasa Misailovic
Extended version of the FASE 2022 paper
2023
Randomness-Aware Testing of Machine Learning-based Systems
Ph.D. Dissertation, University of Illinois Urbana-Champaign, July 2023
Saikat Dutta
ASTRA: Understanding the Practical Impact of Robustness for Probabilistic Programs
39th Conference on Uncertainty in Artificial Intelligence (UAI 2023)
Pittsburgh, PA, August 2023. Acceptance Rate 31% (243/778 papers)
Zixin Huang, Saikat Dutta, and Sasa Misailovic
Balancing Effectiveness and Flakiness of Non-Deterministic Machine Learning Tests
45th International Conference on Software Engineering (ICSE 2023)
Melbourne, Australia, May 2023. Acceptance Rate 26% (208/796 papers)
Steven Xia, Saikat Dutta, Sasa Misailovic, Darko Marinov, and Lingming Zhang
2022
To Seed or Not to Seed? An Empirical Analysis of Usage of Seeds for Testing in Machine Learning Projects
15th IEEE International Conference on Software Testing, Verification and Validation (ICST 2022)
Valencia, Spain, April 2022. Acceptance Rate 29% (25/85 papers)
Saikat Dutta, Anshul Arunachalam, and Sasa Misailovic
InspectJS: Leveraging Code Similarity and User-Feedback for Effective Taint Specification Inference for JavaScript
44th International Conference on Software Engineering - Software Engineering in Practice (ICSE-SEIP 2022)
Pittsburgh, USA, May 2022.
Saikat Dutta, Diego Garbervetsky, Shuvendu Lahiri, Max Schäfer
SixthSense: Debugging Convergence Problems in Probabilistic Programs via Program Representation Learning
25th International Conference on Fundamental Approaches to Software Engineering (FASE 2022)
Munich, Germany, April 2022. Acceptance Rate 27% (17/62 papers)
Saikat Dutta, Zixin Huang, and Sasa Misailovic
2021
Automated Quantized Inference for Probabilistic Programs with AQUA
Innovations in Systems and Software Engineering: A NASA Journal (ISSE NASA)
Zixin Huang, Saikat Dutta, and Sasa Misailovic
Extended version of our ATVA 2021 paper
AQUA: Automated Quantized Inference for Probabilistic Programs
19th International Symposium on Automated Technology for Verification and Analysis (ATVA 2021)
Gold Coast, Australia, October 2021. Acceptance Rate 27% (19/71 papers)
Zixin Huang, Saikat Dutta, and Sasa Misailovic
FLEX: Fixing Flaky Tests in Machine-Learning Projects by Updating Assertion Bounds
29th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2021)
Athens, Greece, August 2021. Acceptance Rate 24% (97/396 papers)
Saikat Dutta, August Shi, and Sasa Misailovic
TERA: Optimizing Stochastic Regression Tests in Machine Learning Projects
30th ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2021)
Aarhus, Denmark, July 2021. Acceptance Rate 22% (51/233 papers)
Saikat Dutta, Jeeva Selvam, Aryaman Jain, and Sasa Misailovic
2020
Detecting Flaky Tests in Probabilistic and Machine Learning Applications
29th ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2020)
Los Angeles, CA, USA, July 2020. Acceptance Rate 26% (43/162 papers)
Saikat Dutta, August Shi, Rutvik Choudhary, Zhekun Zhang, Aryaman Jain, and Sasa Misailovic
2019
Storm: Program Reduction for Testing and Debugging Probabilistic Programming Systems
27th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2019)
Tallinn, Estonia, August 2019. Acceptance Rate 24% (74/303 papers)
Saikat Dutta, Wenxian Zhang, Zixin Huang, Sasa Misailovic
2018
Testing Probabilistic Programming Systems
26th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2018)
Lake Buena Vista, FL, USA, November 2018. Acceptance Rate 21% (61/289 papers)
Saikat Dutta, Owolabi Legunsen, Zixin Huang, Sasa Misailovic
2013-17
AutoSense: A Framework for Automated Sensitivity Analysis of Program Data
IEEE Transactions on Software Engineering (TSE 2017)
Bernard Nongpoh, Rajarshi Ray, Saikat Dutta, Ansuman Banerjee
Enhancing branch prediction using software evolution
10th IEEE International Conference on Networking, Architecture, and Storage (NAS 2015)
Saikat Dutta, Moumita Das, Ansuman Banerjee
A New Approach for Minimal Environment Construction for Modular Property Verification
24th Asian Test Symposium (ATS 2015)
Saikat Dutta, Soumi Chattopadhyay, Ansuman Banerjee, Pallab Dasgupta
A Framework for Fast Service Verification and Query Execution for Boolean Service Rules
9th Asia-Pacific Services Computing Conference (APSCC 2015)
Soumi Chattopadhyay, Saikat Dutta, Ansuman Banerjee
Daikon to Prioritize and Group Unit Bugs
Formal Aspects of Component Software - 10th International Symposium (FACS 2013)
Nehul Jain, Saikat Dutta, Ansuman Banerjee, Anil K. Ghosh, Lihua Xu, Huibiao Zhu
Service
ISSTA 2025, LLM4Code 2025 Program Committee
ASE 2024, MLSys 2024 Program Committee
TSE 2022 Reviewer
PLDI 2021 Artifact Evaluation Committee
OOPSLA 2020 Artifact Evaluation Committee