I'm a postdoctoral researcher with Michael I. Jordan in the Statistics and EECS departments at UC Berkeley. I work on developing methods to analyze modern scientific data sets, leveraging sophisticated black-box models while providing rigorous statistical guarantees. Specifically, I work on problems in high-dimensional statistics (especially false discovery rate control), statistical machine learning, conformal prediction, and causal inference.

Previously, I completed my Ph.D. in the Stanford Department of Statistics advised by Emmanuel Candès. My thesis introduced methods for conditional independence testing and false discovery rate control in genomics, and I was honored to receive the Ric Weiland Graduate Fellowship and the Theodore W. Anderson Theory of Statistics Dissertation Award for this work. Before my Ph.D., I studied statistics and mathematics at Harvard University, and spent a year teaching mathematics at NYU Shanghai. Outside research, I enjoy triathlons, sailing, hiking, and reading speculative fiction novels.


I'm co-organizing the 2021 ICML Workshop on Distribution-Free Uncertainty Quantification, which will take place on Saturday, July 24, 2021.

Select recent papers

“Cross-validation: what does it estimate and how well does it do it?”

S. Bates, T. Hastie, and R. Tibshirani. arXiv preprint, 2021.
[arXiv] [code] [bibtex]

“Distribution-Free, Risk-Controlling Prediction Sets”

S. Bates, A. Angelopoulos, L. Lei, J. Malik, and M. I. Jordan. arXiv preprint, 2021.
[arXiv] [video] [blog] [code] [bibtex]

“Causal Inference in Genetic Trio Studies”

S. Bates, M. Sesia, C. Sabatti, and E. Candès. PNAS, 2020.
[arXiv] [journal] [video] [tutorials+code] [bibtex]
*Selected as a cover article and for invited commentary.