Language Models Learn to Mislead Humans via RLHF
Jiaxin Wen,
Ruiqi Zhong,
Akbir Khan,
Ethan Perez,
Jacob Steinhardt,
Minlie Huang,
Samuel R. Bowman,
He He,
Shi Feng
preprint
[paper]
|
Learning Task Decomposition to Assist Humans in Competitive Programming
Jiaxin Wen,
Ruiqi Zhong,
Pei Ke,
Zhihong Shao,
Hongning Wang,
Minlie Huang
ACL 2024
[paper]
[poster]
|
Unveiling the Implicit Toxicity in Large Language Models
Jiaxin Wen,
Pei Ke,
Hao Sun,
Zhexin Zhang,
Changfei Li,
Jinfeng Bai,
Minlie Huang
EMNLP 2023
[paper]
[code]
|
CodePlan: Unlocking Reasoning Potential in Large Language Models by Scaling Code-form Planning
Jiaxin Wen*,
Jian Guan*,
Hongning Wang,
Wei Wu,
Minlie Huang
preprint
[paper]
|
AdaptiveBackdoor: Backdoored Language Model Agents that Detect Human Overseers
Heng Wang,
Ruiqi Zhong,
Jiaxin Wen,
Jacob Steinhardt
ICML 2024 Next Generation of AI Safety Workshop
|
ETHICIST: Targeted Training Data Extraction Through Loss Smoothed Soft Prompting and Calibrated Confidence Estimation
Zhexin Zhang,
Jiaxin Wen,
Minlie Huang
ACL 2023
[paper]
[code]
|
Re3Dial: Retrieve, Reorganize and Rescale Dialogue Corpus for Long-Turn Open-Domain Dialogue Pre-training
Jiaxin Wen,
Hao Zhou,
Jian Guan,
Minlie Huang
EMNLP 2023
[paper]
[code]
|
AUGESC: Dialogue Augmentation with Large Language Models for Emotional Support Conversation
Chujie Zheng,
Sahand Sabour,
Jiaxin Wen,
Zheng Zhang,
Minlie Huang
ACL 2023 Findings
[paper]
[code]
[dataset]
|
EVA2.0: Investigating Open-Domain Chinese Dialogue Systems with Large-Scale Pre-Training
Yuxian Gu*,
Jiaxin Wen*,
Hao Sun*,
Yi Song, Pei Ke, Chujie Zheng, Zheng Zhang, Jianzhu Yao, Xiaoyan Zhu, Jie Tang, Minlie Huang
Machine Intelligence Research
[paper]
[code]
[poster]
|
AutoCAD: Automatically Generate Counterfactuals for Mitigating Shortcut Learning
Jiaxin Wen,
Yeshuang Zhu,
Jinchao Zhang,
Jie Zhou,
Minlie Huang
EMNLP 2022 Findings
[paper]
[code]
|
Persona-Guided Planning for Controlling the Protagonist’s Persona in Story Generation
Jiaxin Wen*,
Zhexin Zhang*,
Jian Guan,
Minlie Huang
NAACL 2022
[paper]
[code]
|
Robustness Testing of Language Understanding in Task-Oriented Dialog
Jiexi Liu*,
Ryuichi Takanobu*,
Jiaxin Wen,
Dazhen Wan, Hongguang Li, Weiran Nie, Cheng Li, Wei Peng, Minlie Huang
ACL 2021
[paper]
[code]
|
2024: ACL (Safety, LLM for Programming, Dialogue), COLM
2023: EMNLP (Dialogue, Safety)
2023: ACL (Large-scale Pre-training)
2022: EMNLP (Dialogue and Interactive Systems)
|
Mar. 2024 - Jun. 2024. Research Intern, LM Reasoning Team, Ant Research Group.
Jun. 2023 - Nov. 2023. Research Intern, Foundation Model Team, Zhipu AI.
Jun. 2021 - Dec. 2021. Research Intern, WeChat AI Team, Tencent.
Jun. 2020 - Oct. 2020. Algorithm Intern, WeChat AI Team, Tencent.
|
I am passionate about a wide variety of fields and am constantly exploring new areas and possibilities. Some of my major interests include:
- Sports: Recently I have been enjoying bodybuilding and CrossFit. I'm aiming to run my first (half-)marathon this year, although it's been a month since my last 30 km practice run because it took me a week to recover :(. I'm also a member of the Tsinghua hiking club and rugby team.
- Literature: I have always been a reader of literature. I served as the teaching assistant for the course "Writing and Communication" in 2021. My favorite authors are Hermann Karl Hesse, Albert Camus, and Jerome David Salinger.
|