Statistics seminar: Weizhi Li, Stat Sci Group, LANL
Event Description:
Label-Efficient Two-Sample Tests
Hypothesis testing is a statistical inference approach used to determine whether data
supports a specific hypothesis. An important type is the two-sample test, which evaluates
whether two sets of data points are drawn from the same distribution. This test is widely used,
for example by clinical researchers comparing the effectiveness of treatments. This talk explores
two-sample testing in a setting where an analyst has many features from two samples, but
determining the sample membership (or labels) of these features is costly. In machine
learning, a similar scenario is studied in active learning. In this talk, I will present our work
on incorporating active learning into two-sample testing within this label-costly setting
while maintaining statistical validity and high testing power. I will also highlight the
practical value of the proposed two-sample tests.
Biography
Dr. Weizhi Li currently serves as a Postdoctoral Research Associate within the Statistical
Science group at Los Alamos National Laboratory (LANL). He earned his Ph.D. in Computer
Engineering from Arizona State University (ASU) in 2022. Prior to joining LANL, he
completed a postdoctoral fellowship at ASU and worked as a research scientist at Meta. Dr.
Li specializes in developing data-efficient machine learning algorithms through statistical
approaches. He has published in top-tier machine learning venues, including
AISTATS, UAI, NeurIPS, and the Transactions on Machine Learning Research.