Academic Report Notice (Reference Number: 2025-15) - School of Computer and Information (School of Artificial Intelligence)

Report Title: Multi-Objective Reinforcement Learning for Optimizing Tolerant Dynamic Decision Rules

Speaker: Professor Lu Wang

Affiliation: University of Michigan, USA

Time: 15:00-17:00, Tuesday, July 22, 2025

Venue: Third Meeting Room, 1st Floor, Block A, Feicui Science and Education Building

Report Abstract:

Many real-world problems involve multiple competing priorities, and decision rules differ when trade-offs are present. Correspondingly, there may be more than one feasible decision that leads to empirically sufficient optimization. In this talk, we present the concept of a "tolerant regime", which provides a set of individualized feasible decision rules under a prespecified tolerance rate. A multi-objective tree-based reinforcement learning (MOT-RL) method is developed to directly estimate the tolerant dynamic treatment regime (DTR), denoted tDTR, that optimizes multiple objectives in a multistage, multi-treatment setting. At each stage, MOT-RL constructs an unsupervised decision tree by modeling the counterfactual mean outcome of each objective via semiparametric regression and maximizing a purity measure constructed from the scalarized augmented inverse probability weighted estimators (SAIPWE). The algorithm is implemented by backward induction through the decision stages, and it estimates the optimal DTR and the tDTR according to the decision-maker's preferences. MOT-RL is robust, efficient, easy to interpret, and flexible across different settings.
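For readers unfamiliar with the estimator named in the abstract, the following is a minimal single-stage sketch of a scalarized AIPW estimate of the counterfactual mean outcome under one treatment arm. It is an illustration only, not the speaker's implementation: the function names, the weighted-sum scalarization, and the assumption that propensities and outcome-model predictions are already available are all hypothetical choices made for this example.

```python
from typing import Sequence


def aipw_mean(treatments: Sequence[int],
              outcomes: Sequence[float],
              propensities: Sequence[float],
              predictions: Sequence[float],
              arm: int) -> float:
    """AIPW estimate of the mean outcome if everyone received `arm`.

    propensities[i] is an assumed-known P(A = arm | X_i);
    predictions[i] is an assumed fitted outcome model m_arm(X_i).
    """
    total = 0.0
    for a, y, p, m in zip(treatments, outcomes, propensities, predictions):
        ratio = (1.0 if a == arm else 0.0) / p
        # Inverse-probability-weighted outcome plus the augmentation term.
        total += ratio * y + (1.0 - ratio) * m
    return total / len(treatments)


def scalarized_aipw(treatments: Sequence[int],
                    outcomes_by_objective: Sequence[Sequence[float]],
                    propensities: Sequence[float],
                    predictions_by_objective: Sequence[Sequence[float]],
                    arm: int,
                    weights: Sequence[float]) -> float:
    """Weighted sum of per-objective AIPW estimates (assumed scalarization)."""
    return sum(
        w * aipw_mean(treatments, y, propensities, m, arm)
        for w, y, m in zip(weights, outcomes_by_objective,
                           predictions_by_objective)
    )


# Toy data: four subjects, two objectives, randomized treatment (p = 0.5).
treatments = [1, 0, 1, 0]
propensities = [0.5, 0.5, 0.5, 0.5]
outcomes = [[2.0, 0.0, 4.0, 0.0], [1.0, 0.0, 1.0, 0.0]]
predictions = [[3.0, 3.0, 3.0, 3.0], [1.0, 1.0, 1.0, 1.0]]

value = scalarized_aipw(treatments, outcomes, propensities,
                        predictions, arm=1, weights=[0.5, 0.5])
print(value)
```

In a tree-based method of the kind the abstract describes, a quantity like this would be compared across candidate splits and treatment arms at each stage, working backward from the last stage. The weights here stand in for the decision-maker's preferences over objectives.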

Speaker Profile:

Lu Wang is a Professor in the Department of Biostatistics at the University of Michigan, USA, a Fellow of the American Statistical Association (ASA), and an Elected Member of the International Statistical Institute (ISI). She received her Bachelor's degree from Peking University in 2002 and her Ph.D. from Harvard University in 2008. Her research interests include statistical methods for evaluating dynamic treatment regimes, personalized medicine, causal inference, nonparametric and semiparametric regression, missing data analysis, and longitudinal (correlated/clustered) data analysis. She has published more than 180 papers in academic journals such as JASA, Biometrika, Biometrics, and AoAS, and has co-authored a book chapter. She currently serves as an Associate Editor of JASA and Biometrics.
