Jiaxin Wen

I'm a CS PhD student at UC Berkeley. I'm also a part-time research scientist at Anthropic. I finished my undergrad and Master's at Tsinghua University.

I define the right objectives for eliciting and aligning superhuman AI.

Email / Google Scholar / GitHub / Twitter


Selected Papers

Language Models Learn to Mislead Humans via RLHF
Jiaxin Wen, Ruiqi Zhong, Akbir Khan, Ethan Perez, Jacob Steinhardt, Minlie Huang, Samuel R. Bowman, He He, Shi Feng
ICLR 2025

Predicting Empirical AI Research Outcomes with Language Models
Jiaxin Wen, Chenglei Si, Chen Yueh-han, He He, Shi Feng
NeurIPS 2025

Unsupervised Elicitation of Language Models
Jiaxin Wen, Zachary Ankner, Arushi Somani, Peter Hase, Samuel Marks, Jacob Goldman-Wetzler, Linda Petrini, Henry Sleight, Collin Burns, He He, Shi Feng, Ethan Perez, Jan Leike
preprint 2025

Learning Task Decomposition to Assist Humans in Competitive Programming
Jiaxin Wen, Ruiqi Zhong, Pei Ke, Zhihong Shao, Hongning Wang, Minlie Huang
ACL 2024

Selected Honors

  • Outstanding Graduate Thesis (Top-1), Tsinghua University, 2025
  • Beijing Outstanding Graduate (Top-1), Tsinghua University, 2025
  • Outstanding Undergraduate Thesis (Top-5), Tsinghua University, 2022
  • Outstanding Graduate, Department of Computer Science and Technology, Tsinghua University, 2022

  • Website design from Jon Barron.