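Basic Concepts

Before using LLaMA Factory, it helps to understand the concepts the product relies on. This page introduces the basic concepts of LLaMA Factory so that you can use it more effectively.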

WebUI

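LLaMA-Factory supports zero-code fine-tuning of large language models through a WebUI. Run the llamafactory-cli webui command to open the WebUI, which has four main tabs: Train, Evaluate & Predict, Chat, and Export.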

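Train

Before you start training, specify the training parameters: the model name and path, training stage, fine-tuning method, training dataset, learning rate, number of training epochs, output directory and configuration path, and other settings such as method-specific fine-tuning parameters. Then click the Start button to begin training.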

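Evaluate & Predict

After training finishes, you can evaluate the model on the Evaluate & Predict tab by specifying parameters such as the dataset path, cutoff length, top-p sampling value, temperature, and output directory.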

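Chat

On the Chat tab, specify the inference engine and the inference data type, then enter a message to chat with the model and observe its behavior.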

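Export

If you are satisfied with the model and want to export it, go to the Export tab, specify parameters such as the maximum shard size, the export quantization level, the quantization dataset, the export device, and the export directory, and then click the Export button.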

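Data Processing

dataset_info.json lists all available datasets, both preprocessed local datasets and online datasets. LLaMA Factory supports datasets in the Alpaca format and the ShareGPT format.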

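Instruction Supervised Fine-Tuning Dataset (Alpaca Format)

Instruction supervised fine-tuning (instruction tuning) improves a model's behavior on specific instructions by having it learn detailed instructions together with their responses. In the Alpaca format, the instruction column holds the human instruction, the input column holds the human input, and the output column holds the model response.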
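As a minimal sketch of what such a record looks like in practice, the following Python snippet builds one Alpaca-style example and a matching dataset_info.json entry. The dataset name my_sft_demo, the file name, and the sample text are placeholders chosen for illustration; the column mapping mirrors the one used for the other Alpaca-style datasets on this page.

```python
import json

# One Alpaca-style record: instruction, optional input, and the expected output.
record = {
    "instruction": "Translate the sentence into French.",
    "input": "Good morning.",
    "output": "Bonjour.",
}

# Hypothetical dataset_info.json entry that maps the three columns above.
dataset_info = {
    "my_sft_demo": {
        "file_name": "my_sft_demo.json",
        "columns": {
            "prompt": "instruction",
            "query": "input",
            "response": "output",
        },
    }
}

# The data file is a JSON list of records; both texts can be written to disk as-is.
data_file_text = json.dumps([record], ensure_ascii=False, indent=2)
info_file_text = json.dumps(dataset_info, ensure_ascii=False, indent=2)
print(data_file_text)
print(info_file_text)
```

Writing data_file_text to my_sft_demo.json and merging the entry into dataset_info.json would then make the dataset selectable by its name.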

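Pre-training Dataset

A pre-training dataset uses the following text format:

[
  {"text": "document"},
  {"text": "document"}
]

For data in the format above, the dataset description in dataset_info.json should be: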

dataset_info.json
{
  "ConversationsDataset": {
    "file_name": "data.json",
    "columns": {
      "prompt": "text"
    },
    "description": "A dataset of conversations between humans and AI.",
    "format": "json",
    "size": "1GB",
    "last_updated": "2025-05-08",
    "version": "1.0"
  }
}

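Preference Dataset

A preference dataset uses the following format:

{
    "instruction": "human instruction (required)",
    "input": "human input (optional)",
    "chosen": "preferred response (required)",
    "rejected": "rejected response (required)"
}

For data in the format above, the dataset description in dataset_info.json should be: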

dataset_info.json
"dataset_name": {
  "file_name": "data.json",
  "ranking": true,
  "columns": {
    "prompt": "instruction",
    "query": "input",
    "chosen": "chosen",
    "rejected": "rejected"
  }
}

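KTO Dataset

For a single question-answer turn, the format is as follows:

{
    "instruction": "human instruction (required)",
    "input": "human input (optional)",
    "output": "model response (required)",
    "kto_tag": "human feedback [true/false] (required)"
}

For data in the format above, the dataset description in dataset_info.json should be: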

{
 "dataset_name": {
   "file_name": "data.json",
   "columns": {
     "prompt": "instruction",
     "query": "input",
     "response": "output",
     "kto_tag": "kto_tag"
   }
 }
}
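
One possible way to obtain KTO-style records is to split an existing preference pair into two examples, tagging the preferred response true and the rejected response false. The sketch below assumes that convention; the helper name preference_to_kto is hypothetical and is not part of LLaMA Factory.

```python
def preference_to_kto(pair):
    """Split one preference record into two KTO records (hypothetical helper)."""
    base = {"instruction": pair["instruction"], "input": pair.get("input", "")}
    return [
        {**base, "output": pair["chosen"], "kto_tag": True},     # desirable response
        {**base, "output": pair["rejected"], "kto_tag": False},  # undesirable response
    ]


pair = {
    "instruction": "Name the capital of France.",
    "input": "",
    "chosen": "Paris.",
    "rejected": "Lyon.",
}
kto_records = preference_to_kto(pair)
for item in kto_records:
    print(item)
```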

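Multimodal Dataset

Multimodal image, video, and audio datasets are currently supported as input. The image dataset format is shown here as an example; video and audio datasets follow a similar format. A multimodal image dataset needs an additional images column that contains the paths of the input images. Note that the number of images must exactly match the number of <image> tokens in the text.

{
    "instruction": "human instruction (required)",
    "input": "human input (optional)",
    "output": "model response (required)",
    "images": [
        "image path (required)"
    ]
}

For data in the format above, the dataset description in dataset_info.json should be: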

{
  "dataset_name": {
    "file_name": "data.json",
    "columns": {
      "prompt": "instruction",
      "query": "input",
      "response": "output",
      "images": "images"
    }
  }
}
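
Because the image count has to match the number of image tokens, it can be worth checking a data file before training. The following is a rough sketch, assuming the placeholder is written as `<image>` and that tokens may appear in any of the instruction, input, or output fields; the function name is hypothetical.

```python
def image_count_matches(example, token="<image>"):
    """Return True when the number of image paths equals the number of image tokens."""
    # Assumption: tokens are counted across all three text fields.
    text = "".join(example.get(key, "") for key in ("instruction", "input", "output"))
    return text.count(token) == len(example.get("images", []))


good = {
    "instruction": "<image>Describe the picture.",
    "input": "",
    "output": "A cat on a sofa.",
    "images": ["images/cat.jpg"],
}
# Two tokens but only one path: this record would be rejected by the check.
bad = dict(good, instruction="<image><image>Compare the pictures.")

print(image_count_matches(good), image_count_matches(bad))
```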

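Instruction Supervised Fine-Tuning Dataset (ShareGPT Format)

In the ShareGPT format, an instruction supervised fine-tuning dataset looks like this:

{
  "conversations": [
    {
      "from": "human",
      "value": "human instruction"
    },
    {
      "from": "function_call",
      "value": "tool arguments"
    },
    {
      "from": "observation",
      "value": "tool result"
    },
    {
      "from": "gpt",
      "value": "model response"
    }
  ],
  "system": "system prompt (optional)",
  "tools": "tool description (optional)"
}

For data in the format above, the dataset description in dataset_info.json should be: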

{
  "dataset_name": {
    "file_name": "data.json",
    "formatting": "sharegpt",
    "columns": {
      "messages": "conversations",
      "system": "system",
      "tools": "tools"
    }
  }
}
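
To make the relationship between the two formats concrete, here is a small sketch that converts a single-turn Alpaca-style record into the conversations layout shown above. It assumes the instruction and input are joined with a newline to form the human turn; the function name is hypothetical.

```python
def alpaca_to_sharegpt(record):
    """Convert one Alpaca-style record into a ShareGPT-style record (illustrative)."""
    # Assumption: instruction and input together form the human message.
    parts = [record["instruction"]]
    if record.get("input"):
        parts.append(record["input"])
    return {
        "conversations": [
            {"from": "human", "value": "\n".join(parts)},
            {"from": "gpt", "value": record["output"]},
        ]
    }


converted = alpaca_to_sharegpt(
    {"instruction": "Summarize the text.", "input": "LLaMA Factory is a toolkit.", "output": "A toolkit."}
)
print(converted)
```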

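Preference Training Dataset (ShareGPT Format)

In the ShareGPT format, a preference dataset looks like this:

{
  "conversations": [
    {
      "from": "human",
      "value": "human instruction"
    },
    {
      "from": "gpt",
      "value": "model response"
    },
    {
      "from": "human",
      "value": "human instruction"
    }
  ],
  "chosen": {
    "from": "gpt",
    "value": "preferred response"
  },
  "rejected": {
    "from": "gpt",
    "value": "rejected response"
  }
}

For data in the format above, the dataset description in dataset_info.json should be:

"dataset_name": {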
  "file_name": "data.json",
  "formatting": "sharegpt",
  "ranking": true,
  "columns": {
    "messages": "conversations",
    "chosen": "chosen",
    "rejected": "rejected"
  }
}

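OpenAI Format

The OpenAI format is simply a special case of the sharegpt format in which the first message may be a system prompt.

{
  "messages": [
    {
      "role": "system",
      "content": "system prompt (optional)"
    },
    {
      "role": "user",
      "content": "human instruction"
    },
    {
      "role": "assistant",
      "content": "model response"
    }
  ]
}

For data in the format above, the dataset description in dataset_info.json should be:

"dataset_name": {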
  "file_name": "data.json",
  "formatting": "sharegpt",
  "columns": {
    "messages": "messages"
  },
  "tags": {
    "role_tag": "role",
    "content_tag": "content",
    "user_tag": "user",
    "assistant_tag": "assistant",
    "system_tag": "system"
  }
}
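
The tags block tells the loader which keys and role names to look for in each message. The sketch below illustrates that idea only; it is not LLaMA Factory's actual loader, and the function name split_system is hypothetical.

```python
TAGS = {
    "role_tag": "role",
    "content_tag": "content",
    "user_tag": "user",
    "assistant_tag": "assistant",
    "system_tag": "system",
}


def split_system(messages, tags=TAGS):
    """Separate an optional leading system message from the dialogue turns."""
    system = ""
    turns = list(messages)
    # Only the first message is allowed to carry the system prompt.
    if turns and turns[0][tags["role_tag"]] == tags["system_tag"]:
        system = turns[0][tags["content_tag"]]
        turns = turns[1:]
    return system, [(m[tags["role_tag"]], m[tags["content_tag"]]) for m in turns]


system, turns = split_system(
    [
        {"role": "system", "content": "You are concise."},
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "Hello."},
    ]
)
print(system, turns)
```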

Training Methods

LLaMA Factory supports both pre-training and post-training. The supported post-training techniques include Supervised Fine-Tuning, RLHF (Reinforcement Learning from Human Feedback), DPO, and KTO (Kahneman-Tversky Optimization), among others.

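Training Parameters

The table below describes some of the training parameters:

Name	Description
model_name_or_path	Model name or path
stage	Training stage. Options: rm (reward modeling), pt (pre-training), sft (supervised fine-tuning), ppo, dpo, kto, orpo
do_train	true for training, false for evaluation
finetuning_type	Fine-tuning method. Options: freeze, lora, full
lora_target	Target modules for LoRA; the default is all
dataset	Datasets to use; separate multiple datasets with ","
template	Dataset template; make sure it matches the model
output_dir	Output path
logging_steps	Number of steps between log outputs
save_steps	Number of steps between model checkpoints
overwrite_output_dir	Whether the output directory may be overwritten
per_device_train_batch_size	Training batch size on each device
gradient_accumulation_steps	Number of gradient accumulation steps
max_grad_norm	Gradient clipping threshold
learning_rate	Learning rate
lr_scheduler_type	Learning rate schedule. Options include linear, cosine, polynomial, and constant
num_train_epochs	Number of training epochs
bf16	Whether to use the bf16 format
warmup_ratio	Learning rate warmup ratio
warmup_steps	Number of learning rate warmup steps
push_to_hub	Whether to push the model to Hugging Face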
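
Several of these parameters interact. As a worked sketch: the number of samples contributing to one optimizer step is the per-device batch size times the gradient accumulation steps times the number of devices. The warmup helper below assumes that an explicit warmup_steps greater than zero takes precedence over warmup_ratio; both function names are illustrative rather than taken from LLaMA Factory.

```python
import math


def effective_batch_size(per_device_train_batch_size, gradient_accumulation_steps, num_devices):
    """Samples that contribute to a single optimizer step."""
    return per_device_train_batch_size * gradient_accumulation_steps * num_devices


def warmup_step_count(total_steps, warmup_ratio=0.0, warmup_steps=0):
    """Warmup length in optimizer steps."""
    # Assumption: an explicit warmup_steps > 0 overrides warmup_ratio.
    if warmup_steps > 0:
        return warmup_steps
    return math.ceil(total_steps * warmup_ratio)


batch = effective_batch_size(per_device_train_batch_size=1, gradient_accumulation_steps=8, num_devices=4)
warmup = warmup_step_count(total_steps=200, warmup_ratio=0.25)
print(batch, warmup)
```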

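Training Acceleration

LLaMA-Factory supports several acceleration techniques, including FlashAttention, Unsloth, and Liger Kernel. To try one of them, add the corresponding parameter to the training configuration file when you launch training: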

# FlashAttention
flash_attn: fa2
# Unsloth
use_unsloth: True
# Liger Kernel
enable_liger_kernel: True

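Tuning Algorithms

LLaMA-Factory supports several tuning algorithms, including Full Parameter Fine-tuning, Freeze, LoRA, GaLore, and BAdam.

Distributed Training

LLaMA-Factory supports single-node multi-GPU and multi-node multi-GPU distributed training, with three distributed engines: DDP, DeepSpeed, and FSDP.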

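Merging

After training a LoRA adapter on top of a pre-trained model, you may not want to load the pre-trained model and the LoRA adapter separately every time you run inference. In that case, merge the pre-trained model and the LoRA adapter and export them as a single model, optionally quantizing the result.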
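
Conceptually, merging folds the low-rank update into the base weight: W' = W + (alpha / r) * B * A. The toy example below uses plain Python lists to show that the merged weight gives the same output as applying the base weight and the adapter separately. It is a conceptual sketch with made-up numbers, not the code LLaMA Factory runs when exporting.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_b, lora_a, alpha, rank):
    """Return W + (alpha / rank) * B @ A."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(weight, delta)]


weight = [[1.0, 0.0], [0.0, 1.0]]  # base weight W (2x2)
lora_b = [[1.0], [2.0]]            # B (2x1), rank 1
lora_a = [[0.5, -0.5]]             # A (1x2)
x = [[3.0], [1.0]]                 # input column vector

merged = merge_lora(weight, lora_b, lora_a, alpha=2, rank=1)
merged_out = matmul(merged, x)

# Same computation with the adapter kept separate: W x + (alpha / r) * B (A x).
base_out = matmul(weight, x)
adapter_out = matmul(lora_b, matmul(lora_a, x))
separate_out = [[b[0] + 2.0 * a[0]] for b, a in zip(base_out, adapter_out)]
print(merged, merged_out, separate_out)
```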

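Quantization

Quantization reduces GPU memory usage and speeds up inference by lowering the numeric precision of the model's data. LLaMA-Factory supports several quantization methods, including AQLM, AWQ, GPTQ, and QLoRA.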
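
To illustrate the basic idea of precision compression, the sketch below applies simple absmax rounding to 8-bit integers. This is deliberately the most basic scheme and is not how AQLM, AWQ, GPTQ, or QLoRA work internally; the function names and values are illustrative.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using a single absmax scale."""
    scale = max(abs(v) for v in values) / 127 or 1.0  # fall back to 1.0 for all-zero input
    return [round(v / scale) for v in values], scale


def dequantize(quantized, scale):
    """Recover approximate floats from the stored integers."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.03, 1.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
# Each restored value differs from the original by at most half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(quantized, scale, max_error)
```

Each weight is stored as one small integer plus a shared scale, which is where the memory saving comes from; the price is the rounding error shown by max_error.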

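Inference

LLaMA-Factory supports several inference modes: original model inference, fine-tuned model inference, multimodal model inference, and batch inference.

Original Model Inference

For original model inference, inference_config.yaml only needs to specify the original model's model_name_or_path and template.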

### examples/inference/llama3.yaml
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
template: llama3
infer_backend: huggingface # choices: [huggingface, vllm]

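Fine-Tuned Model Inference Configuration

For fine-tuned model inference, in addition to the original model and the template, you also need to specify the adapter path adapter_name_or_path and the fine-tuning type finetuning_type.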

### examples/inference/llama3_lora_sft.yaml
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
adapter_name_or_path: saves/llama3-8b/lora/sft
template: llama3
finetuning_type: lora
infer_backend: huggingface # choices: [huggingface, vllm]

Multimodal Models

For multimodal models, you can run inference with a configuration such as the following.

model_name_or_path: llava-hf/llava-1.5-7b-hf
template: vicuna
infer_backend: huggingface # choices: [huggingface, vllm]

General Capability Evaluation

After training a model, you can evaluate it by running llamafactory-cli eval examples/train_lora/llama3_lora_eval.yaml.

The example configuration file examples/train_lora/llama3_lora_eval.yaml is shown below:

### examples/train_lora/llama3_lora_eval.yaml
### model
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
adapter_name_or_path: saves/llama3-8b/lora/sft # optional

### method
finetuning_type: lora

### dataset
task: mmlu_test # mmlu_test, ceval_validation, cmmlu_test
template: fewshot
lang: en
n_shot: 5

### output
save_dir: saves/llama3-8b/lora/eval

### eval
batch_size: 4

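NLG Evaluation

You can also run llamafactory-cli train examples/extras/nlg_eval/llama3_lora_predict.yaml to obtain the model's BLEU and ROUGE scores and assess its generation quality.

The example configuration file examples/extras/nlg_eval/llama3_lora_predict.yaml is shown below: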

### examples/extras/nlg_eval/llama3_lora_predict.yaml
### model
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
adapter_name_or_path: saves/llama3-8b/lora/sft

### method
stage: sft
do_predict: true
finetuning_type: lora

### dataset
eval_dataset: identity,alpaca_en_demo
template: llama3
cutoff_len: 2048
max_samples: 50
overwrite_cache: true
preprocessing_num_workers: 16

### output
output_dir: saves/llama3-8b/lora/predict
overwrite_output_dir: true

### eval
per_device_eval_batch_size: 1
predict_with_generate: true
ddp_timeout: 180000000
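
For intuition about what a ROUGE score measures, here is a simplified ROUGE-L sketch based on the longest common subsequence between a prediction and a reference. It assumes whitespace tokenization and a plain F1 combination, so its numbers will not necessarily match those reported by LLaMA Factory.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(prediction, reference):
    """Simplified ROUGE-L: F1 of LCS-based precision and recall."""
    pred, ref = prediction.split(), reference.split()
    lcs = lcs_length(pred, ref)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(pred), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


score = rouge_l_f1("the cat sat on the mat", "the cat is on the mat")
print(round(score, 4))
```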

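Experiment Monitoring

LLaMA-Factory supports several training visualization tools, including TensorBoard, Wandb, and MLflow.

To use Wandb, TensorBoard, MLflow, or a similar tool, turn on the Enable External Logging Panel option in the Other Parameter Settings section of the WebUI.

To use SwanLab, turn on SwanLab logging in the SwanLab section.

Alternatively, you can enable a monitoring tool by adding the following parameters to the training configuration file when you launch training: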

# Enable SwanLab
use_swanlab: true
swanlab_project: llamafactory
swanlab_run_name: test_run

# Enable TensorBoard
report_to: tensorboard

# Enable Wandb
report_to: wandb
run_name: test_run

# Enable MLflow
report_to: mlflow

Information

For licensing, use LLaMA Factory in accordance with its license terms. See the linked page for details.