Academic Report of School of Mathematical Sciences [2026] No. 004
(Series Report for High-Level University Construction No. 1263)
Title: Risk-sensitive Reinforcement Learning Based on Convex Scoring Functions
Speaker: Assistant Professor Yang Liu (The Chinese University of Hong Kong, Shenzhen)
Time: 14:15-15:15, Jan. 13, 2026
Location: Huiwen Building 2331
Abstract: We propose a reinforcement learning (RL) framework under a broad class of risk objectives, characterized by convex scoring functions. This class covers many common risk measures, such as variance, Expected Shortfall, entropic Value-at-Risk, and mean-risk utility. To resolve the time-inconsistency issue, we consider an augmented state space and an auxiliary variable and recast the problem as a two-state optimization problem. We propose a customized Actor-Critic algorithm and establish theoretical approximation guarantees. A key theoretical contribution is that our results do not require the Markov decision process to be continuous. Additionally, we propose an auxiliary variable sampling method inspired by the alternating minimization algorithm, which converges under certain conditions. We validate our approach in simulation experiments with a financial application in statistical arbitrage trading, demonstrating the effectiveness of the algorithm. This is joint work with Shanyu Han and Xiang Yu.
Speaker Profile: Dr. Yang Liu currently serves as Assistant Professor at the School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen. He obtained his Bachelor's degree in Mathematics from Tsinghua University in 2016 and his Ph.D. in Mathematics from the same institution in 2021. Subsequently, he held postdoctoral research positions at the University of Waterloo and Stanford University. His research spans financial mathematics, applied probability, operations research, actuarial science, and reinforcement learning. Dr. Liu studies decision-making under complexity and uncertainty in financial, insurance, and reinforcement learning systems, specifically addressing non-concave/non-convex utility portfolio optimization, robust risk aggregation under dependent uncertainty, and risk measurement. His research has been published in leading academic journals in the field, including Operations Research, Mathematical Finance, Finance and Stochastics, SIAM Journal on Control and Optimization, and Insurance: Mathematics and Economics. In 2024, he received the First Prize for Best Paper by a Young Scholar at the 13th Annual Conference of the Financial Engineering and Financial Risk Management Branch of the Chinese Operations Research Society.
Faculty and students are welcome to attend!
Invited by: Hailing Dong
School of Mathematical Sciences
January 8, 2026