About
My name is Jiwoo Hong. I am an AI engineer at LinkedIn CoreAI. My research interests center on generalizability in post-training, including RLHF, RLVR,
reward modeling, and preference alignment. Please see the "Publications" section for my recent work.
Experience
LinkedIn, Core AI
Generative AI solutions for LinkedIn through language model post-training and agentic AI.
Amazon, Rufus
Multi-objective reinforcement learning algorithms for language model post-training.
Naver Cloud, HyperClova
Large reasoning model research and core contributor to HyperCLOVA X THINK.
Publications
Bayesian Preference Learning for Test-Time Steerable Reward Models
Preprint
Online Difficulty Filtering for Reasoning Oriented Reinforcement Learning
EACL'26
Margin-aware Preference Optimization for Aligning Diffusion Models without Reference
AAAI'26
HyperCLOVA X THINK Technical Report
Technical Report
On the Robustness of Reward Models for Language Model Alignment
ICML'25
AlphaPO: Reward Shape Matters for LLM Alignment
ICML'25
Linguistic Generalizability of Test-Time Scaling in Mathematical Reasoning
ACL'25
When AI Co-Scientists Fail: SPOT, a Benchmark for Automated Verification of Scientific Research
Preprint
Cross-lingual Transfer of Reward Models in Multilingual Alignment
NAACL'25
Evaluating the Consistency of LLM Evaluators
COLING'25
ORPO: Monolithic Preference Optimization without Reference Model
EMNLP'24
Stable Language Model Pre-training by Reducing Embedding Variability
EMNLP'24
Disentangling Structure and Style: Political Bias Detection in News by Inducing Document Hierarchy
Findings of EMNLP'23
MARL-Based Dual Reward Model on Segmented Actions for Multiple Mobile Robots in Automated Warehouse Environment
Applied Sciences
Patent
Method and apparatus for voice profiling
KR Patent (ID: 1028832310000)
Academic Services
Conference Reviewer: TMLR, AAAI, ICLR, ICML, ACL Rolling Review (ARR), and NeurIPS
Invited Talks
ORPO: Monolithic Preference Optimization without Reference Model
Kakao Brain
ORPO: Monolithic Preference Optimization without Reference Model
KISTI
Resource-friendly single-step language model alignment with ORPO
Twelve Labs