Commit d039ad2

fix dpo pp criterion (#9786)
1 parent 13053a7 commit d039ad2

1 file changed, 2 additions and 0 deletions

llm/alignment/dpo/run_dpo.py

Lines changed: 2 additions & 0 deletions
@@ -127,6 +127,8 @@ def main():
 
     if training_args.pipeline_parallel_degree > 1:
         model_class = AutoModelForCausalLMPipe
+        if not dpo_config.reference_free and not dpo_config.lora:
+            ref_model_config.dpo_config = dpo_config
         model_config.dpo_config = dpo_config
     else:
         model_class = AutoModelForCausalLM
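
The branch in this hunk can be exercised in isolation with stand-in objects. The sketch below is illustrative only: `attach_dpo_config` is a hypothetical helper that is not part of `run_dpo.py`, the configs are plain `SimpleNamespace` objects, and the model classes are represented by their names as strings. It also assumes, based only on the condition itself, that a separate reference model is used when training is neither reference-free nor LoRA-based.

```python
from types import SimpleNamespace


def attach_dpo_config(pipeline_parallel_degree, dpo_config, model_config, ref_model_config):
    """Hypothetical helper mirroring the patched branch with stand-in objects."""
    if pipeline_parallel_degree > 1:
        # Added by this commit: the reference model config also receives
        # dpo_config, but only when a reference model is in play (assumed to
        # mean: not reference-free and not LoRA).
        if not dpo_config.reference_free and not dpo_config.lora:
            ref_model_config.dpo_config = dpo_config
        model_config.dpo_config = dpo_config
        return "AutoModelForCausalLMPipe"
    # Without pipeline parallelism neither config is touched.
    return "AutoModelForCausalLM"


# Pipeline parallel with a reference model: both configs get dpo_config.
dpo = SimpleNamespace(reference_free=False, lora=False)
policy_cfg, ref_cfg = SimpleNamespace(), SimpleNamespace()
chosen = attach_dpo_config(2, dpo, policy_cfg, ref_cfg)
print(chosen, hasattr(policy_cfg, "dpo_config"), hasattr(ref_cfg, "dpo_config"))
```

With `lora=True` or `reference_free=True`, the same call leaves `ref_cfg` untouched while still setting `policy_cfg.dpo_config`.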
