Files in this directory:

README.md
train-dpo.sh
train-sft-stage1.sh
train-sft-stage2.sh
verl-grpo.sh

Update: GRPO scripts with verl

See verl-grpo.sh and our tech report.

Training scripts with 360-LLaMA-Factory

Usage:

1. Follow the installation instructions for 360-LLaMA-Factory.
2. Place the script you need (e.g. train-dpo.sh) in the root directory of your git-cloned 360-LLaMA-Factory, at the same level as 360-example.sh.
3. Register your dataset (e.g. Light-R1-DPO) in dataset_info.json:

  "light-r1-dpo": {
    "file_name": "/path/to/dpo-pairs.json",
    "ranking": true,
    "formatting": "sharegpt",
    "columns": {
      "messages": "conversations",
      "chosen": "chosen",
      "rejected": "rejected"
    }
  },

4. Fill in the missing arguments in train-dpo.sh, then run it with sh train-dpo.sh.
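
The dataset registration step above can also be scripted rather than edited by hand. The following is a minimal sketch, not part of these training scripts: `register_dataset` and `missing_columns` are hypothetical helper names, and the location of dataset_info.json inside your checkout is an assumption you should verify (it is passed in as a parameter here). The entry written is exactly the one shown above.

```python
import json
from pathlib import Path


def register_dataset(info_path, name, data_file):
    """Add (or overwrite) a sharegpt-style preference entry in dataset_info.json.

    Hypothetical helper: it only reproduces the entry shown in this README.
    """
    info_path = Path(info_path)
    # Load the existing registry if there is one; otherwise start empty.
    if info_path.exists():
        info = json.loads(info_path.read_text(encoding="utf-8"))
    else:
        info = {}
    info[name] = {
        "file_name": str(data_file),
        "ranking": True,
        "formatting": "sharegpt",
        "columns": {
            "messages": "conversations",
            "chosen": "chosen",
            "rejected": "rejected",
        },
    }
    info_path.write_text(
        json.dumps(info, indent=2, ensure_ascii=False) + "\n", encoding="utf-8"
    )
    return info[name]


def missing_columns(record, columns):
    """Return the mapped keys (e.g. "chosen") that a data record lacks."""
    return [key for key in columns.values() if key not in record]
```

Calling `register_dataset("dataset_info.json", "light-r1-dpo", "/path/to/dpo-pairs.json")` writes the entry, and `missing_columns(record, entry["columns"])` offers a quick check that each record in your pairs file carries the `conversations`, `chosen`, and `rejected` keys before you launch training.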
