unslothai/unsloth

[Docs] Update DPO example to use DPOConfig instead of TrainingArguments

Open

#4155 opened on Mar 4, 2026

Labels: good first issue, help wanted

Description

The current DPO example code in the documentation raises an AttributeError because it passes TrainingArguments from transformers to DPOTrainer instead of DPOConfig from trl. Recent versions of trl require DPOConfig so that DPOTrainer can read DPO-specific arguments such as padding_value.

URL: https://unsloth.ai/docs/get-started/reinforcement-learning-rl-guide/preference-dpo-orpo-and-kto

Error Log

Traceback (most recent call last):
  File "/workspace/work/main.py", line 74, in <module>
    dpo_trainer = DPOTrainer(
                  ^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/unsloth/trainer.py", line 314, in new_init
    original_init(self, *args, **kwargs)
  ...
  File "/workspace/work/unsloth_compiled_cache/UnslothDPOTrainer.py", line 903, in __init__
    if args.padding_value is not None:
       ^^^^^^^^^^^^^^^^^^
AttributeError: 'TrainingArguments' object has no attribute 'padding_value'
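
The failure mode in the last frame can be illustrated without installing either library. The sketch below is only an analogy: GenericArgs, PreferenceArgs, and init_trainer are hypothetical stand-ins, not the real TrainingArguments, DPOConfig, or DPOTrainer. It assumes nothing beyond what the traceback shows, namely that the trainer reads args.padding_value directly.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GenericArgs:
    """Hypothetical stand-in for a general-purpose arguments class."""
    per_device_train_batch_size: int = 4


@dataclass
class PreferenceArgs(GenericArgs):
    """Hypothetical stand-in for a DPO-specific config that adds padding_value."""
    padding_value: Optional[int] = None


def init_trainer(args) -> str:
    # Same shape as the failing check in the traceback: plain attribute
    # access, which raises AttributeError if the attribute does not exist.
    if args.padding_value is not None:
        return f"padding with {args.padding_value}"
    return "padding value left to the trainer default"


if __name__ == "__main__":
    # Works: the DPO-style class defines padding_value (default None).
    print(init_trainer(PreferenceArgs()))
    # Fails: the generic class has no such attribute at all.
    try:
        init_trainer(GenericArgs())
    except AttributeError as exc:
        print(f"AttributeError: {exc}")
```

The point of the sketch is that a default of None is not the same as a missing attribute: the check only succeeds when the arguments class declares the field.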

Environment

  • unsloth: 2026.3.3
  • trl: 0.23.1
  • transformers: 4.57.1
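
For anyone trying to reproduce this, the versions above can be compared against a local environment with the standard library alone. This is a small sketch; installed_versions is a hypothetical helper name, and it assumes the packages are registered under the distribution names listed above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    report = installed_versions(["unsloth", "trl", "transformers"])
    for name, ver in report.items():
        print(f"{name}: {ver or 'not installed'}")
```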

Suggested Fix

Replacing TrainingArguments with DPOConfig resolves the issue.

Current:

from transformers import TrainingArguments
...
dpo_trainer = DPOTrainer(
    model = model,
    args = TrainingArguments(
        ...
    ),
)

Proposed:

from trl import DPOConfig # Changed from TrainingArguments
...
dpo_trainer = DPOTrainer(
    model = model,
    args = DPOConfig( # Use DPOConfig
        per_device_train_batch_size = 4,
        # ... (other arguments elided)
    ),
)

How can I contribute?

I would like to submit a Pull Request to update the documentation if this is acceptable. However, I couldn't find the source files for the documentation (GitBook) in this repository.

Could you please guide me on where the documentation source is located or how I should proceed with a PR?
