Zhongzhu / Charlie
Tag: #RLHF
10 posts tagged with this label.
2026
04-14
EN
Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts — Deep Technical Review
04-14
ZH
ArmoRM: Interpretable Preference Learning via Multi-Objective Reward Modeling and Mixture-of-Experts Gating (In-Depth Reading Notes)
04-07
EN
ORPO: Monolithic Preference Optimization without Reference Model — In-Depth Technical Review
04-07
ZH
ORPO: Monolithic Preference Optimization without Reference Model (In-Depth Reading Notes)
03-31
EN
Constitutional AI: Harmlessness from AI Feedback — In-Depth Technical Review
03-24
EN
Proximal Policy Optimization Algorithms — In-Depth Technical Review
03-24
ZH
Proximal Policy Optimization (PPO) Algorithms (In-Depth Reading Notes)
03-12
EN
PaRO: Smarter Partitioning for Distributed Training — Beyond ZeRO's One-Size-Fits-All
03-10
EN
InstructGPT: The RLHF Recipe That Turned GPT-3 Into a Helpful Assistant
02-17
EN
Direct Preference Optimization: Your Language Model Is Secretly a Reward Model — Technical Review