Our paper with Noah Lee, ORPO: Monolithic Preference Optimization without Reference Model, is now available on arXiv! Our best models, 🤗 Mistral-ORPO-$\alpha$ (7B) and 🤗 Mistral-ORPO-$\beta$ (7B), surpa...