“These program [ed. distributed programs] form the backbone of many modern internet services. For example, they power large databases that store data in the cloud. In my thesis, I introduce new techniques to find correctness bugs in distributed programs.” Srinidhi Nagendra, former PhD student at IRIF.
Link to his PhD thesis: Automated testing of distributed protocol implementations
I'm Srinidhi Nagendra, and recently I completed a Joint PhD in Computer Science from Université Paris Cité (IRIF) and Chennai Mathematical Institute. During my PhD, I was supervised by Constantin Enea and Mandayam Srivas. Prior to my PhD, I completed a Master of Science in Computer Science from Chennai Mathematical Institute, India.
I was initially trained as an engineer and worked as a Software Developer for a couple years. In trying to better understand the tools I was using and the algorithms I was implementing, I developed an interest in further studies which naturally led me to pursue research. In particular, I was interested in algorithms and programs that allowed multiple computers to talk and co-ordinate.
My thesis, titled “Automated testing of distributed protocol implementations”, focuses on ensuring the correctness of programs that allow different computers (processes) to coordinate and collectively solve a program (distributed programs). These programs form the backbone of many modern internet services. For example, they power large databases that store data in the cloud. Bugs in distributed programs can be costly since they cause downtimes in the internet services. It is also hard to implement the distributed programs correctly due to the complex nature of the underlying protocols. In my thesis, I introduce new techniques to find correctness bugs in distributed programs, thereby improving the reliability of the services built on top of them.
Distributed programs (implementations) follow rules defined by the distributed algorithm (model). There exists a large body of research that ensures the correctness of the model independent of the implementations. In my thesis, I leverage existing work done on the model, combine it with existing techniques to develop new techniques. I also show in the thesis that the new techniques are more effective in finding bugs and helping improve the reliability of distributed programs.
These are new techniques aimed directly at checking the correctness of the implementations. In each new technique, I borrow from the literature an existing approach that has not yet been applied to distributed programs and demonstrate their effectiveness. Netrix - introduces unit testing for distributed programs, WaypointRL - leverages Reinforcement Learning to automatically explore all possible states, and ModelFuzz - introduces a novel guidance mechanism to fuzz test distributed programs.
Demonstrating the new techniques resulted in novel tools that one can use to test distributed programs. The tools can be used to effectively test new or existing implementations of distributed programs - programs that power databases.
I have started a new journey as a Post doctoral researcher at Max Planck Institute for Software Systems working with Rupak Majumdar. Together, we aim to scale verification techniques such as Partial Order Reduction to verify Distributed programs.