Jiaxin Wen

I'm a final-year Master's student at Tsinghua University, where I work with Minlie Huang and Hongning Wang.

I also work with researchers from UCB (Ruiqi Zhong), NYU (He He, Shi Feng), and Anthropic (Ethan Perez, Akbir Khan).

I will graduate in 2025 and am seeking PhD or research scientist positions in superalignment. Please reach out if you think I would be a good fit!

Email  /  CV  /  Google Scholar  /  Github  /  Twitter

Research Overview

I want to supervise AI systems on challenging tasks beyond human reach.

I'm particularly interested in improving human supervision.

Recent research:

Early in my career, I worked on improving robustness, long-context modeling, and planning. I also (co-)led the development of multiple pre-trained LMs (EVA, OPD) and demos (Empathetic chatbot, Role-play chatbot, and ChatGLM3 Code Interpreter), which have received millions of online queries.

Selected Papers

Language Models Learn to Mislead Humans via RLHF
Jiaxin Wen, Ruiqi Zhong, Akbir Khan, Ethan Perez, Jacob Steinhardt, Minlie Huang, Samuel R. Bowman, He He, Shi Feng
preprint
[paper]
Learning Task Decomposition to Assist Humans in Competitive Programming
Jiaxin Wen, Ruiqi Zhong, Pei Ke, Zhihong Shao, Hongning Wang, Minlie Huang
ACL 2024
[paper] [poster]
Unveiling the Implicit Toxicity in Large Language Models
Jiaxin Wen, Pei Ke, Hao Sun, Zhexin Zhang, Changfei Li, Jinfeng Bai, Minlie Huang
EMNLP 2023
[paper] [code]

Others

CodePlan: Unlocking Reasoning Potential in Large Language Models by Scaling Code-form Planning
Jiaxin Wen*, Jian Guan*, Hongning Wang, Wei Wu, Minlie Huang
preprint
[paper]
AdaptiveBackdoor: Backdoored Language Model Agents that Detect Human Overseers
Heng Wang, Ruiqi Zhong, Jiaxin Wen, Jacob Steinhardt
ICML 2024 Next Generation of AI Safety Workshop
ETHICIST: Targeted Training Data Extraction Through Loss Smoothed Soft Prompting and Calibrated Confidence Estimation
Zhexin Zhang, Jiaxin Wen, Minlie Huang
ACL 2023
[paper] [code]
Re3Dial: Retrieve, Reorganize and Rescale Dialogue Corpus for Long-Turn Open-Domain Dialogue Pre-training
Jiaxin Wen, Hao Zhou, Jian Guan, Minlie Huang
EMNLP 2023
[paper] [code]
AUGESC: Dialogue Augmentation with Large Language Models for Emotional Support Conversation
Chujie Zheng, Sahand Sabour, Jiaxin Wen, Zheng Zhang, Minlie Huang
ACL 2023 Findings
[paper] [code] [dataset]
EVA2.0: Investigating Open-Domain Chinese Dialogue Systems with Large-Scale Pre-Training
Yuxian Gu*, Jiaxin Wen*, Hao Sun*, Yi Song, Pei Ke, Chujie Zheng, Zheng Zhang, Jianzhu Yao, Xiaoyan Zhu, Jie Tang, Minlie Huang
Machine Intelligence Research
[paper] [code] [poster]
AutoCAD: Automatically Generate Counterfactuals for Mitigating Shortcut Learning
Jiaxin Wen, Yeshuang Zhu, Jinchao Zhang, Jie Zhou, Minlie Huang
EMNLP 2022 Findings
[paper] [code]
Persona-Guided Planning for Controlling the Protagonist’s Persona in Story Generation
Jiaxin Wen*, Zhexin Zhang*, Jian Guan, Minlie Huang
NAACL 2022
[paper] [code]
Robustness Testing of Language Understanding in Task-Oriented Dialog
Jiexi Liu*, Ryuichi Takanobu*, Jiaxin Wen, Dazhen Wan, Hongguang Li, Weiran Nie, Cheng Li, Wei Peng, Minlie Huang
ACL 2021
[paper] [code]

Service

  • 2024: ACL (Safety, LLM for Programming, Dialogue), COLM
  • 2023: EMNLP (Dialogue, Safety)
  • 2023: ACL (Large-scale Pre-training)
  • 2022: EMNLP (Dialogue and Interactive Systems)

Experiences

  • Mar. 2024 - Jun. 2024. Research Intern, LM Reasoning Team, Ant Research Group
  • Jun. 2023 - Nov. 2023. Research Intern, Foundation Model Team, Zhipu AI.
  • Jun. 2021 - Dec. 2021. Research Intern, WeChat AI Team, Tencent.
  • Jun. 2020 - Oct. 2020. Algorithm Intern, WeChat AI Team, Tencent.

Awards

  • Global AI Innovation Contest (6th out of 5000)   2023
  • Outstanding Undergraduate Thesis, Tsinghua University (Top-5 score in the DCST)   2022
  • Outstanding Graduate, Department of Computer Science and Technology, Tsinghua University   2022
  • Global AI Innovation Contest Runner-up (2nd out of 5000)   2021
  • Tsinghua Academic Excellence Award   2020
  • Tsinghua Volunteer Excellence Award   2019-2020
  • Tsinghua Philobiblion Award   2019

Miscellaneous

    I am passionate about a wide variety of fields, and I am constantly exploring new areas and possibilities. Some of my major interests include:
    • Sports: I have recently been enjoying bodybuilding and CrossFit. I'm aiming to run my first (half-)marathon this year, although it's been a month since my last 30 km practice run because it took me a week to recover :(. I'm also a member of the Tsinghua hiking club and rugby team.
    • Literature: I have always been an avid reader of literature. I served as a teaching assistant for the course "Writing and Communication" in 2021. My favorite authors are Hermann Karl Hesse, Albert Camus, and Jerome David Salinger.

    Website design from Jon Barron.