In [ ]:
# Install whichever backend you plan to use, e.g. one of:
pip install optuna  # or: sigopt, wandb, ray[tune]
SigOpt¶
Refer to SigOpt's object_parameter; the search space looks like the following:
In [ ]:
def sigopt_hp_space(trial):
    return [
        {"bounds": {"min": 1e-6, "max": 1e-4}, "name": "learning_rate", "type": "double"},
        {
            "categorical_values": ["16", "32", "64", "128"],
            "name": "per_device_train_batch_size",
            "type": "categorical",
        },
    ]
Optuna¶
Refer to Optuna's object_parameter; the search space looks like the following:
In [ ]:
def optuna_hp_space(trial):
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-6, 1e-4, log=True),
        "per_device_train_batch_size": trial.suggest_categorical("per_device_train_batch_size", [16, 32, 64, 128]),
    }
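As a hedged aside, the same space function can mix other Optuna suggestion types; the extra parameters below (num_train_epochs, weight_decay) are purely illustrative assumptions and not part of the original example.
In [ ]:
def optuna_hp_space_extended(trial):
    # Illustrative sketch only: combines float, int, and categorical suggestions.
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-6, 1e-4, log=True),
        "weight_decay": trial.suggest_float("weight_decay", 0.0, 0.3),
        "num_train_epochs": trial.suggest_int("num_train_epochs", 1, 5),
        "per_device_train_batch_size": trial.suggest_categorical("per_device_train_batch_size", [16, 32, 64]),
    }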
Optuna supports multi-objective hyperparameter optimization (HPO). Pass a list of direction values to hyperparameter_search and define your own compute_objective function that returns multiple objective values. hyperparameter_search will then return the Pareto front (List[BestRun]); see the test case TrainerHyperParameterMultiObjectOptunaIntegrationTest in test_trainer. The call looks like the following (a sketch of a matching compute_objective is given after it):
In [ ]:
best_trials = trainer.hyperparameter_search(
    direction=["minimize", "maximize"],
    backend="optuna",
    hp_space=optuna_hp_space,
    n_trials=20,
    compute_objective=compute_objective,
)
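A minimal sketch of a two-objective compute_objective matching the directions above; the metric keys eval_loss and eval_accuracy are assumptions about what your compute_metrics reports.
In [ ]:
def compute_objective(metrics):
    # Assumed metric keys: minimize eval_loss, maximize eval_accuracy.
    return [metrics["eval_loss"], metrics["eval_accuracy"]]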
Ray Tune¶
Refer to Ray Tune's object_parameter; the search space looks like the following:
In [ ]:
from ray import tune


def ray_hp_space(trial):
    return {
        "learning_rate": tune.loguniform(1e-6, 1e-4),
        "per_device_train_batch_size": tune.choice([16, 32, 64, 128]),
    }
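A hedged sketch of the corresponding search call for this backend; resources_per_trial is forwarded to Ray Tune, and the values below are assumptions about the available hardware.
In [ ]:
best_trial = trainer.hyperparameter_search(
    direction="maximize",
    backend="ray",
    hp_space=ray_hp_space,
    n_trials=20,
    resources_per_trial={"cpu": 2, "gpu": 1},  # assumption: one GPU per trial
)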
Weights & Biases (W&B)¶
Refer to W&B's object_parameter; the sweep configuration looks like the following:
In [ ]:
def wandb_hp_space(trial):
    return {
        "method": "random",
        "metric": {"name": "objective", "goal": "minimize"},
        "parameters": {
            "learning_rate": {"distribution": "uniform", "min": 1e-6, "max": 1e-4},
            "per_device_train_batch_size": {"values": [16, 32, 64, 128]},
        },
    }
Define a model_init function and pass it to the Trainer, for example:
In [ ]:
def model_init(trial):
    return AutoModelForSequenceClassification.from_pretrained(
        model_args.model_name_or_path,
        from_tf=bool(".ckpt" in model_args.model_name_or_path),
        config=config,
        cache_dir=model_args.cache_dir,
        revision=model_args.model_revision,
        token=True if model_args.use_auth_token else None,
    )
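The model_args and config objects above come from the example training scripts; outside of them, a minimal model_init can be as simple as the sketch below, where the checkpoint name and label count are assumptions.
In [ ]:
from transformers import AutoModelForSequenceClassification


def model_init(trial):
    # Minimal sketch: rebuild the model from a fixed checkpoint for every trial.
    return AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)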
Create a Trainer with your model_init function, training arguments, training and evaluation datasets, and evaluation function:
In [ ]:
trainer = Trainer(
    model=None,  # the model is created per trial via model_init
    args=training_args,
    train_dataset=small_train_dataset,
    eval_dataset=small_eval_dataset,
    compute_metrics=compute_metrics,
    processing_class=tokenizer,
    model_init=model_init,
    data_collator=data_collator,
)
Invoking the hyperparameter search¶
Call hyperparameter_search to obtain the best trial parameters. The backend can be "optuna", "sigopt", "wandb", or "ray". direction can be "minimize" or "maximize", indicating whether a smaller or a larger objective should be optimized.
You can define your own compute_objective function; if you do not, the default compute_objective is called, which returns the sum of the evaluation metrics (such as F1) as the objective value.
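For instance, a minimal custom compute_objective that tracks a single metric might look like the sketch below; the key eval_f1 is an assumption about your compute_metrics output.
In [ ]:
def compute_objective(metrics):
    # Assumed metric key; return the value to maximize (direction="maximize").
    return metrics["eval_f1"]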
In [ ]:
best_trial = trainer.hyperparameter_search(
    direction="maximize",
    backend="optuna",
    hp_space=optuna_hp_space,
    n_trials=20,
    compute_objective=compute_objective,
)
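As a hedged follow-up, the returned BestRun exposes the winning values through its hyperparameters dict, so one common pattern is to write them back into the training arguments and retrain; this is a sketch, not part of the original example.
In [ ]:
# best_trial.hyperparameters maps parameter names to the best values found.
for name, value in best_trial.hyperparameters.items():
    setattr(trainer.args, name, value)
trainer.train()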
Hyperparameter search for DDP fine-tuning¶
Currently, hyperparameter search for DDP fine-tuning is supported only by Optuna and SigOpt. Only the rank-zero process generates the search trials and passes the arguments to the other ranks.