Talk title: Randomness-Aware Testing of Machine Learning-based Systems
Abstract: Machine Learning is rapidly revolutionizing the way modern-day systems are developed. However, testing Machine Learning-based systems is challenging due to 1) the presence of non-determinism, both internal (e.g., stochastic algorithms) and external (e.g., execution environment), and 2) the absence of well-defined accuracy specifications. Most traditional software testing techniques widely used today cannot tackle these challenges because they often assume determinism and require a precise test oracle.
In this talk, I will present my work on automated testing of Machine Learning-based systems and on improving developer-written tests in such systems. To achieve these goals, I develop principled techniques that build on solid mathematical foundations from probability theory and statistics to reason about the underlying non-determinism and accuracy. To date, my research has exposed more than 50 bugs and improved the quality of more than 200 tests in over 60 popular open-source ML libraries, many of which are widely used at companies like Google, Meta, Microsoft, and Uber as well as in many academic and scientific communities.
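To make the contrast concrete, here is a minimal, hypothetical sketch (not taken from the talk; the train_and_evaluate stub, seeds, number of runs, and thresholds are illustrative assumptions) of the difference between a conventional single-run assertion and a randomness-aware test that checks a statistical property over repeated runs:

```python
import random
import statistics

def train_and_evaluate(seed=None):
    """Stand-in for a stochastic training-and-evaluation routine (hypothetical):
    accuracy varies from run to run because of random initialization,
    data shuffling, and similar sources of non-determinism."""
    rng = random.Random(seed)
    return 0.9 + rng.uniform(-0.05, 0.05)

def test_accuracy_deterministic():
    # Traditional, single-run assertion: brittle, because a run that lands
    # below the hard-coded value fails even when the model behaves as
    # expected on average.
    assert train_and_evaluate() >= 0.92  # flaky

def test_accuracy_statistical(num_runs=30, threshold=0.88):
    # Randomness-aware alternative: execute the stochastic component several
    # times and assert a statistical property of the results (here, the
    # sample mean) rather than an exact, single-run value.
    accuracies = [train_and_evaluate(seed=i) for i in range(num_runs)]
    assert statistics.mean(accuracies) >= threshold

if __name__ == "__main__":
    test_accuracy_statistical()
    print("statistical test passed")
```

This toy example only hints at the idea; the techniques described in the talk go further, using probability theory and statistics to choose sample sizes and bounds in a principled way.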
Finally, I will briefly discuss my recent research on leveraging Large Language Models to solve software engineering tasks, such as detecting security bugs via static analysis.
Bio:
Saikat Dutta is a tenure-track Assistant Professor at Cornell University. His research lies at the intersection of software engineering and machine learning. In particular, he focuses on developing novel testing techniques and tools to improve the reliability of machine learning-based systems, and on leveraging machine learning to solve challenging software engineering tasks. His research has been recognized by several awards, including the Facebook PhD Fellowship, the 3M Foundation Fellowship, and the Mavis Future Faculty Fellowship. Saikat received his PhD in Computer Science from UIUC in 2023 and spent a year as a postdoc at the University of Pennsylvania before joining Cornell.