My name is

Jiwoo Hong

I am a graduate student at KAIST AI, specializing in AI and NLP under the supervision of Professor James Thorne. I graduated summa cum laude (highest honors) from SungKyunKwan University with a Bachelor’s degree in Statistics and Industrial Engineering. My research focuses on generalizability in post-training, including RLHF, RLVR, and reward modeling. Please see ‘Publications’ for my recent work!

Experience

Applied Scientist Intern - Amazon Rufus
Summer 2025 (Incoming)
I am an incoming applied scientist intern on the Amazon Rufus team in Palo Alto, CA. My research at Amazon Rufus will focus on the intersection of multi-objective optimization and reinforcement learning for language models.
AI Research Intern - Naver Cloud
Feb 2025 - Present
I am currently working as an NLP research intern in the post-training team at Naver Cloud, focusing on RLHF and RLVR for language models.

Publications

ORPO: Monolithic Preference Optimization without Reference Model
Alignment RLHF
Jiwoo Hong, Noah Lee, James Thorne
Linguistic Generalizability of Test-Time Scaling in Mathematical Reasoning
LLM Reasoning Generalizability
Guijin Son, Jiwoo Hong, Hyunwoo Ko, James Thorne
AlphaPO - Reward shape matters for LLM alignment
Alignment RLHF
Aman Gupta, Shao Tang, Qingquan Song, Sirou Zhu, Jiwoo Hong, and 8 more authors
Cross-lingual Transfer of Reward Models in Multilingual Alignment
RLHF Reward Models Generalizability
Jiwoo Hong*, Noah Lee*, Rodrigo Martínez-Castaño, César Rodríguez, James Thorne
Stable Language Model Pre-training by Reducing Embedding Variability
Pre-training Interpretability
Woojin Chung, Jiwoo Hong, Na Min An, James Thorne, Se Young Yun
Disentangling Structure and Style: Political Bias Detection in News by Inducing Document Hierarchy
NLP Application Interpretability
Jiwoo Hong, Yejin Cho, Jiyoung Han, Jaemin Jung, James Thorne
Evaluating the Consistency of LLM Evaluators
Evaluation RLHF
Noah Lee*, Jiwoo Hong*, James Thorne
Margin-aware Preference Optimization for Aligning Diffusion Models without Reference
Alignment Diffusion
Jiwoo Hong*, Sayak Paul*, Noah Lee, Kashif Rasul, James Thorne, Jongheon Jeong

Contact

Inquiries about my work or potential research collaborations are always welcome!