
YOLOv11 Improvements, Loss Functions: SlideLoss, FocalLoss, and VFLoss Classification Losses for Better Accuracy

1. Introduction

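This article covers classification losses: SlideLoss, VFLoss (Varifocal Loss), and FocalLoss. The IoU variants discussed earlier are all bounding-box regression losses, so they do not conflict with the changes made here. In other words, a detector's loss has two parts: a classification loss and a bounding-box regression loss. The previous article summarized how to use roughly ninety percent of the existing bounding-box regression losses; this one introduces several popular and recent classification losses.

Before we start, a quick note on my column: it covers classification, detection, segmentation, tracking, and keypoint detection, adds 3-5 new mechanisms every week, and comes with all of my modified files plus a discussion group. The code in this article supports four classification losses: Focal Loss, Varifocal Loss (VFLoss), Slide Loss, and Quality Focal Loss.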




2. How It Works

Most of these losses were covered in earlier articles, so this section focuses on Slide Loss, which was first proposed in the YOLO-FaceV2 paper (the authors also provide an official code release).


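According to the paper's abstract, Slide Loss uses a weighting function to address the imbalance between easy and hard samples. So what are easy and hard samples?

Sample imbalance is a common problem, especially in classification and object detection. The term usually refers to large differences in the number of training samples per class, but here it concerns sample difficulty. For a task such as face detection, the imbalance between easy and hard samples can be described as follows: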

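Easy samples:

  • Samples that the model recognizes correctly with little effort.
  • Usually make up the bulk of the dataset.
  • Have distinctive features and clear classification or detection boundaries.
  • Produce low loss values during training, because the model predicts them correctly with ease.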

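Hard samples:

  • Samples that the model struggles to recognize correctly.
  • Relatively rare in the dataset, but critical for improving model performance.
  • Hard to recognize for many possible reasons, such as occlusion, deformation, blur, lighting changes, small size, or low contrast with the background.
  • Produce high loss values during training, because the model has trouble predicting them accurately.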

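Addressing this imbalance is key to improving a model's generalization. A model that has seen mostly easy samples may perform worse when it meets hard samples in real use. Common strategies include resampling (oversampling hard samples or undersampling easy ones), modifying the loss function (giving hard samples a higher weight), and designing new model structures that focus on hard samples. In YOLO-FaceV2, the authors use a weighting function, Slide Loss, to make the model pay more attention to hard samples during training (this is also the modification made in this article).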
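The weighting idea can be sketched for a single sample in a few lines. This is a minimal scalar illustration that mirrors the piecewise rule used by the SlideLoss class in this article's core code; the function name slide_weight and the sample values are made up for illustration, and the 0.2 floor on the threshold is copied from that class.

```python
import math


def slide_weight(iou: float, mu: float) -> float:
    """Slide-style weight for one sample with IoU target `iou` and threshold `mu`."""
    mu = max(mu, 0.2)  # same floor as the SlideLoss class in this article
    if iou <= mu - 0.1:
        return 1.0  # clearly below the threshold: weight left unchanged
    if iou < mu:
        return math.exp(1.0 - mu)  # just below the threshold: constant boost
    return math.exp(1.0 - iou)  # at or above the threshold: decays to 1 as iou -> 1


if __name__ == "__main__":
    for iou in (0.1, 0.45, 0.5, 0.9, 1.0):
        print(f"iou={iou:.2f}  weight={slide_weight(iou, mu=0.5):.3f}")
```

With mu = 0.5, samples whose IoU sits near the threshold receive the largest weight (about 1.65), while samples far below it or close to a perfect match stay near 1. That is how the loss shifts attention towards samples at the boundary between easy and hard.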


3. Core Code

See the usage section below for how to apply this code.

import math

# torch / nn / F are normally already imported at the top of ultralytics/utils/loss.py;
# they are repeated here so the snippet does not depend on that.
import torch
import torch.nn as nn
import torch.nn.functional as F

from .tal import bbox2dist


class QualityfocalLoss(nn.Module):
    """Quality Focal Loss: BCE scaled by |target - sigmoid(pred)| ** beta."""

    def __init__(self, beta=2.0):
        super().__init__()
        self.beta = beta

    def forward(self, pred_score, gt_score, gt_target_pos_mask):
        # negatives are supervised by 0 quality score
        pred_sigmoid = pred_score.sigmoid()
        scale_factor = pred_sigmoid
        zerolabel = scale_factor.new_zeros(pred_score.shape)
        with torch.cuda.amp.autocast(enabled=False):
            loss = F.binary_cross_entropy_with_logits(pred_score, zerolabel, reduction='none') * scale_factor.pow(
                self.beta)

        # positives are supervised by their quality (IoU) score
        scale_factor = gt_score[gt_target_pos_mask] - pred_sigmoid[gt_target_pos_mask]
        with torch.cuda.amp.autocast(enabled=False):
            loss[gt_target_pos_mask] = F.binary_cross_entropy_with_logits(pred_score[gt_target_pos_mask],
                                                                          gt_score[gt_target_pos_mask],
                                                                          reduction='none') * scale_factor.abs().pow(
                self.beta)
        return loss


class SlideLoss(nn.Module):
    """Wraps an element-wise loss and re-weights it around the IoU threshold auto_iou."""

    def __init__(self, loss_fcn):
        super(SlideLoss, self).__init__()
        self.loss_fcn = loss_fcn
        self.reduction = loss_fcn.reduction
        self.loss_fcn.reduction = 'none'  # required to apply SL to each element

    def forward(self, pred, true, auto_iou=0.5):
        loss = self.loss_fcn(pred, true)
        if auto_iou < 0.2:
            auto_iou = 0.2
        # region 1: targets well below the threshold keep weight 1
        b1 = true <= auto_iou - 0.1
        a1 = 1.0
        # region 2: targets just below the threshold get a constant boost
        b2 = (true > (auto_iou - 0.1)) & (true < auto_iou)
        a2 = math.exp(1.0 - auto_iou)
        # region 3: targets at or above the threshold get exp(1 - target)
        b3 = true >= auto_iou
        a3 = torch.exp(-(true - 1.0))
        modulating_weight = a1 * b1 + a2 * b2 + a3 * b3
        loss *= modulating_weight
        if self.reduction == 'mean':
            return loss.mean()
        elif self.reduction == 'sum':
            return loss.sum()
        else:  # 'none'
            return loss


class Focal_Loss(nn.Module):
    # Wraps focal loss around existing loss_fcn(), i.e. criteria = Focal_Loss(nn.BCEWithLogitsLoss(), gamma=1.5)
    def __init__(self, loss_fcn, gamma=1.5, alpha=0.25):
        super().__init__()
        self.loss_fcn = loss_fcn  # must be nn.BCEWithLogitsLoss()
        self.gamma = gamma
        self.alpha = alpha
        self.reduction = loss_fcn.reduction
        self.loss_fcn.reduction = 'none'  # required to apply FL to each element

    def forward(self, pred, true):
        loss = self.loss_fcn(pred, true)
        # p_t = torch.exp(-loss)
        # loss *= self.alpha * (1.000001 - p_t) ** self.gamma  # non-zero power for gradient stability

        # TF implementation https://github.com/tensorflow/addons/blob/v0.7.1/tensorflow_addons/losses/focal_loss.py
        pred_prob = torch.sigmoid(pred)  # prob from logits
        p_t = true * pred_prob + (1 - true) * (1 - pred_prob)
        alpha_factor = true * self.alpha + (1 - true) * (1 - self.alpha)
        modulating_factor = (1.0 - p_t) ** self.gamma
        loss *= alpha_factor * modulating_factor

        if self.reduction == 'mean':
            return loss.mean()
        elif self.reduction == 'sum':
            return loss.sum()
        else:  # 'none'
            return loss


def reduce_loss(loss, reduction):
    """Reduce loss as specified.

    Args:
        loss (Tensor): Elementwise loss tensor.
        reduction (str): Options are "none", "mean" and "sum".

    Return:
        Tensor: Reduced loss tensor.
    """
    reduction_enum = F._Reduction.get_enum(reduction)
    # none: 0, elementwise_mean:1, sum: 2
    if reduction_enum == 0:
        return loss
    elif reduction_enum == 1:
        return loss.mean()
    elif reduction_enum == 2:
        return loss.sum()


def weight_reduce_loss(loss, weight=None, reduction='mean', avg_factor=None):
    """Apply element-wise weight and reduce loss.

    Args:
        loss (Tensor): Element-wise loss.
        weight (Tensor): Element-wise weights.
        reduction (str): Same as built-in losses of PyTorch.
        avg_factor (float): Average factor when computing the mean of losses.

    Returns:
        Tensor: Processed loss values.
    """
    # if weight is specified, apply element-wise weight
    if weight is not None:
        loss = loss * weight

    # if avg_factor is not specified, just reduce the loss
    if avg_factor is None:
        loss = reduce_loss(loss, reduction)
    else:
        # if reduction is mean, then average the loss by avg_factor
        if reduction == 'mean':
            loss = loss.sum() / avg_factor
        # if reduction is 'none', then do nothing, otherwise raise an error
        elif reduction != 'none':
            raise ValueError('avg_factor can not be used with reduction="sum"')
    return loss


def varifocal_loss(pred,
                   target,
                   weight=None,
                   alpha=0.75,
                   gamma=2.0,
                   iou_weighted=True,
                   reduction='mean',
                   avg_factor=None):
    """`Varifocal Loss <https://arxiv.org/abs/2008.13367>`_

    Args:
        pred (torch.Tensor): The prediction with shape (N, C), C is the
            number of classes
        target (torch.Tensor): The learning target of the iou-aware
            classification score with shape (N, C), C is the number of classes.
        weight (torch.Tensor, optional): The weight of loss for each
            prediction. Defaults to None.
        alpha (float, optional): A balance factor for the negative part of
            Varifocal Loss, which is different from the alpha of Focal Loss.
            Defaults to 0.75.
        gamma (float, optional): The gamma for calculating the modulating
            factor. Defaults to 2.0.
        iou_weighted (bool, optional): Whether to weight the loss of the
            positive example with the iou target. Defaults to True.
        reduction (str, optional): The method used to reduce the loss into
            a scalar. Defaults to 'mean'. Options are "none", "mean" and
            "sum".
        avg_factor (int, optional): Average factor that is used to average
            the loss. Defaults to None.
    """
    # pred and target should be of the same size
    assert pred.size() == target.size()
    pred_sigmoid = pred.sigmoid()
    target = target.type_as(pred)
    if iou_weighted:
        focal_weight = target * (target > 0.0).float() + \
            alpha * (pred_sigmoid - target).abs().pow(gamma) * \
            (target <= 0.0).float()
    else:
        focal_weight = (target > 0.0).float() + \
            alpha * (pred_sigmoid - target).abs().pow(gamma) * \
            (target <= 0.0).float()
    loss = F.binary_cross_entropy_with_logits(
        pred, target, reduction='none') * focal_weight
    loss = weight_reduce_loss(loss, weight, reduction, avg_factor)
    return loss


class Vari_focalLoss(nn.Module):

    def __init__(self,
                 use_sigmoid=True,
                 alpha=0.75,
                 gamma=2.0,
                 iou_weighted=True,
                 reduction='sum',
                 loss_weight=1.0):
        """`Varifocal Loss <https://arxiv.org/abs/2008.13367>`_

        Args:
            use_sigmoid (bool, optional): Whether the prediction is
                used for sigmoid or softmax. Defaults to True.
            alpha (float, optional): A balance factor for the negative part of
                Varifocal Loss, which is different from the alpha of Focal
                Loss. Defaults to 0.75.
            gamma (float, optional): The gamma for calculating the modulating
                factor. Defaults to 2.0.
            iou_weighted (bool, optional): Whether to weight the loss of the
                positive examples with the iou target. Defaults to True.
            reduction (str, optional): The method used to reduce the loss into
                a scalar. Defaults to 'sum'. Options are "none", "mean" and
                "sum".
            loss_weight (float, optional): Weight of loss. Defaults to 1.0.
        """
        super(Vari_focalLoss, self).__init__()
        assert use_sigmoid is True, \
            'Only sigmoid varifocal loss supported now.'
        assert alpha >= 0.0
        self.use_sigmoid = use_sigmoid
        self.alpha = alpha
        self.gamma = gamma
        self.iou_weighted = iou_weighted
        self.reduction = reduction
        self.loss_weight = loss_weight

    def forward(self,
                pred,
                target,
                weight=None,
                avg_factor=None,
                reduction_override=None):
        """Forward function.

        Args:
            pred (torch.Tensor): The prediction.
            target (torch.Tensor): The learning target of the prediction.
            weight (torch.Tensor, optional): The weight of loss for each
                prediction. Defaults to None.
            avg_factor (int, optional): Average factor that is used to average
                the loss. Defaults to None.
            reduction_override (str, optional): The reduction method used to
                override the original reduction method of the loss.
                Options are "none", "mean" and "sum".

        Returns:
            torch.Tensor: The calculated loss
        """
        assert reduction_override in (None, 'none', 'mean', 'sum')
        reduction = (
            reduction_override if reduction_override else self.reduction)
        if self.use_sigmoid:
            loss_cls = self.loss_weight * varifocal_loss(
                pred,
                target,
                weight,
                alpha=self.alpha,
                gamma=self.gamma,
                iou_weighted=self.iou_weighted,
                reduction=reduction,
                avg_factor=avg_factor)
        else:
            raise NotImplementedError
        return loss_cls
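To make the weighting in Focal_Loss and varifocal_loss easier to follow, here is a minimal scalar sketch of the same per-element arithmetic in plain Python. It is an illustration only, not a replacement for the tensor code above; the helper names (sigmoid, bce_with_logits, focal_term, varifocal_term) and the sample logits are made up for this example.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def bce_with_logits(logit: float, target: float) -> float:
    """Binary cross-entropy on a raw logit, for a target in [0, 1]."""
    p = sigmoid(logit)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def focal_term(logit: float, target: float, gamma: float = 1.5, alpha: float = 0.25) -> float:
    """Per-element focal loss, following the same steps as Focal_Loss.forward."""
    p = sigmoid(logit)
    p_t = target * p + (1.0 - target) * (1.0 - p)
    alpha_factor = target * alpha + (1.0 - target) * (1.0 - alpha)
    return bce_with_logits(logit, target) * alpha_factor * (1.0 - p_t) ** gamma


def varifocal_term(logit: float, target: float, alpha: float = 0.75, gamma: float = 2.0) -> float:
    """Per-element varifocal loss with iou_weighted=True, as in varifocal_loss."""
    p = sigmoid(logit)
    if target > 0.0:
        weight = target  # positives are weighted by their IoU-aware target
    else:
        weight = alpha * abs(p - target) ** gamma  # negatives are down-weighted
    return bce_with_logits(logit, target) * weight


if __name__ == "__main__":
    # A confident correct positive (easy) versus a misclassified positive (hard).
    print("easy positive, focal:", focal_term(4.0, 1.0))
    print("hard positive, focal:", focal_term(-1.0, 1.0))
    print("positive with IoU target 0.8, varifocal:", varifocal_term(0.0, 0.8))
    print("negative, varifocal:", varifocal_term(0.0, 0.0))
```

The focal term shrinks the loss of the easy positive far more than that of the hard one, while the varifocal term treats the two sides asymmetrically: positives scale with their IoU target and only negatives are down-weighted by the prediction error.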


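4. Usage

4.1 Modification 1

Open the file 'ultralytics/utils/loss.py' and paste the core code above near the top of the file (make sure it goes after the existing imports of other modules!).


4.2 Modification 2

Next, find class v8DetectionLoss: (yes, the v8 class: YOLOv11 reuses the v8 detection loss, so modifying v8 modifies the loss YOLOv11 trains with). Replace the entire contents of class v8DetectionLoss: with the code below.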

class v8DetectionLoss:
    """Criterion class for computing training losses."""

    def __init__(self, model):  # model must be de-paralleled
        """Initializes v8DetectionLoss with the model, defining model-related properties and BCE loss function."""
        device = next(model.parameters()).device  # get model device
        h = model.args  # hyperparameters

        m = model.model[-1]  # Detect() module
        self.bce = nn.BCEWithLogitsLoss(reduction="none")
        # With the lines below commented out, the default BCE loss is used.
        # Uncomment exactly one line to switch to the corresponding loss.
        # self.bce = Focal_Loss(nn.BCEWithLogitsLoss(reduction='none'))  # Focal
        # self.bce = Vari_focalLoss()  # VFLoss
        # self.bce = SlideLoss(nn.BCEWithLogitsLoss(reduction='none'))  # SlideLoss
        # self.bce = QualityfocalLoss()  # detection only; does not work for segmentation, pose, etc.
        self.hyp = h
        self.stride = m.stride  # model strides
        self.nc = m.nc  # number of classes
        self.no = m.nc + m.reg_max * 4
        self.reg_max = m.reg_max
        self.device = device

        self.use_dfl = m.reg_max > 1

        self.assigner = TaskAlignedAssigner(topk=10, num_classes=self.nc, alpha=0.5, beta=6.0)
        self.bbox_loss = BboxLoss(m.reg_max).to(device)
        self.proj = torch.arange(m.reg_max, dtype=torch.float, device=device)

    def preprocess(self, targets, batch_size, scale_tensor):
        """Preprocesses the target counts and matches with the input batch size to output a tensor."""
        if targets.shape[0] == 0:
            out = torch.zeros(batch_size, 0, 5, device=self.device)
        else:
            i = targets[:, 0]  # image index
            _, counts = i.unique(return_counts=True)
            counts = counts.to(dtype=torch.int32)
            out = torch.zeros(batch_size, counts.max(), 5, device=self.device)
            for j in range(batch_size):
                matches = i == j
                n = matches.sum()
                if n:
                    out[j, :n] = targets[matches, 1:]
            out[..., 1:5] = xywh2xyxy(out[..., 1:5].mul_(scale_tensor))
        return out

    def bbox_decode(self, anchor_points, pred_dist):
        """Decode predicted object bounding box coordinates from anchor points and distribution."""
        if self.use_dfl:
            b, a, c = pred_dist.shape  # batch, anchors, channels
            pred_dist = pred_dist.view(b, a, 4, c // 4).softmax(3).matmul(self.proj.type(pred_dist.dtype))
            # pred_dist = pred_dist.view(b, a, c // 4, 4).transpose(2,3).softmax(3).matmul(self.proj.type(pred_dist.dtype))
            # pred_dist = (pred_dist.view(b, a, c // 4, 4).softmax(2) * self.proj.type(pred_dist.dtype).view(1, 1, -1, 1)).sum(2)
        return dist2bbox(pred_dist, anchor_points, xywh=False)

    def __call__(self, preds, batch):
        """Calculate the sum of the loss for box, cls and dfl multiplied by batch size."""
        loss = torch.zeros(3, device=self.device)  # box, cls, dfl
        feats = preds[1] if isinstance(preds, tuple) else preds
        pred_distri, pred_scores = torch.cat([xi.view(feats[0].shape[0], self.no, -1) for xi in feats], 2).split(
            (self.reg_max * 4, self.nc), 1
        )

        pred_scores = pred_scores.permute(0, 2, 1).contiguous()
        pred_distri = pred_distri.permute(0, 2, 1).contiguous()

        dtype = pred_scores.dtype
        batch_size = pred_scores.shape[0]
        imgsz = torch.tensor(feats[0].shape[2:], device=self.device, dtype=dtype) * self.stride[0]  # image size (h,w)
        anchor_points, stride_tensor = make_anchors(feats, self.stride, 0.5)

        # Targets
        targets = torch.cat((batch["batch_idx"].view(-1, 1), batch["cls"].view(-1, 1), batch["bboxes"]), 1)
        targets = self.preprocess(targets.to(self.device), batch_size, scale_tensor=imgsz[[1, 0, 1, 0]])
        gt_labels, gt_bboxes = targets.split((1, 4), 2)  # cls, xyxy
        mask_gt = gt_bboxes.sum(2, keepdim=True).gt_(0)

        # pboxes
        pred_bboxes = self.bbox_decode(anchor_points, pred_distri)  # xyxy, (b, h*w, 4)

        target_labels, target_bboxes, target_scores, fg_mask, _ = self.assigner(
            pred_scores.detach().sigmoid(), (pred_bboxes.detach() * stride_tensor).type(gt_bboxes.dtype),
            anchor_points * stride_tensor, gt_labels, gt_bboxes, mask_gt)

        target_scores_sum = max(target_scores.sum(), 1)

        # Cls loss
        # loss[1] = self.varifocal_loss(pred_scores, target_scores, target_labels) / target_scores_sum  # VFL way
        if isinstance(self.bce, (nn.BCEWithLogitsLoss, Vari_focalLoss, Focal_Loss)):
            loss[1] = self.bce(pred_scores, target_scores.to(dtype)).sum() / target_scores_sum  # BCE VFLoss Focal
        elif isinstance(self.bce, SlideLoss):
            # the slide threshold is the mean CIoU of the matched (foreground) predictions
            if fg_mask.sum():
                auto_iou = bbox_iou(pred_bboxes[fg_mask], target_bboxes[fg_mask], xywh=False, CIoU=True).mean()
            else:
                auto_iou = 0.1
            loss[1] = self.bce(pred_scores, target_scores.to(dtype), auto_iou).sum() / target_scores_sum  # SlideLoss
        elif isinstance(self.bce, QualityfocalLoss):
            if fg_mask.sum():
                pos_ious = bbox_iou(pred_bboxes, target_bboxes / stride_tensor, xywh=False).clamp(min=1e-6).detach()
                # 10.0x Faster than torch.one_hot
                targets_onehot = torch.zeros((target_labels.shape[0], target_labels.shape[1], self.nc),
                                             dtype=torch.int64,
                                             device=target_labels.device)  # (b, h*w, nc)
                targets_onehot.scatter_(2, target_labels.unsqueeze(-1), 1)
                cls_iou_targets = pos_ious * targets_onehot
                fg_scores_mask = fg_mask[:, :, None].repeat(1, 1, self.nc)  # (b, h*w, nc)
                targets_onehot_pos = torch.where(fg_scores_mask > 0, targets_onehot, 0)
                cls_iou_targets = torch.where(fg_scores_mask > 0, cls_iou_targets, 0)
            else:
                cls_iou_targets = torch.zeros((target_labels.shape[0], target_labels.shape[1], self.nc),
                                              dtype=torch.int64,
                                              device=target_labels.device)  # (b, h*w, nc)
                targets_onehot_pos = torch.zeros((target_labels.shape[0], target_labels.shape[1], self.nc),
                                                 dtype=torch.int64,
                                                 device=target_labels.device)  # (b, h*w, nc)
            loss[1] = self.bce(pred_scores, cls_iou_targets.to(dtype), targets_onehot_pos.to(torch.bool)).sum() / max(
                fg_mask.sum(), 1)
        else:
            loss[1] = self.bce(pred_scores, target_scores.to(dtype)).sum() / target_scores_sum  # fallback so a loss is always computed

        # Bbox loss
        if fg_mask.sum():
            target_bboxes /= stride_tensor
            # NOTE: the last argument is not accepted by the stock BboxLoss.forward; it presumably
            # targets the modified BboxLoss from the previous article on bounding-box regression losses.
            loss[0], loss[2] = self.bbox_loss(pred_distri, pred_bboxes, anchor_points, target_bboxes, target_scores,
                                              target_scores_sum, fg_mask,
                                              ((imgsz[0] ** 2 + imgsz[1] ** 2) / torch.square(stride_tensor)).repeat(
                                                  1, batch_size).transpose(1, 0))

        loss[0] *= self.hyp.box  # box gain
        loss[1] *= self.hyp.cls  # cls gain
        loss[2] *= self.hyp.dfl  # dfl gain

        return loss.sum() * batch_size, loss.detach()  # loss(box, cls, dfl)
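One detail of the SlideLoss branch above is easy to miss: the threshold passed to the loss is the mean IoU of the matched predictions, it falls back to 0.1 when a batch has no positives, and SlideLoss.forward then raises anything below 0.2 to 0.2. The sketch below reproduces only that threshold logic with plain floats; the function name resolve_slide_threshold is hypothetical, and the real code computes CIoU on tensors rather than taking a list.

```python
from typing import Sequence


def resolve_slide_threshold(positive_ious: Sequence[float]) -> float:
    """Return the threshold SlideLoss effectively uses for one batch."""
    if positive_ious:
        mu = sum(positive_ious) / len(positive_ious)  # mean IoU over matched anchors
    else:
        mu = 0.1  # fallback used by the loss class when there are no positives
    return max(mu, 0.2)  # floor applied inside SlideLoss.forward


if __name__ == "__main__":
    print(resolve_slide_threshold([]))          # no positives: the 0.1 fallback is raised to 0.2
    print(resolve_slide_threshold([0.5, 0.7]))  # mean of the matched IoUs
    print(resolve_slide_threshold([0.1, 0.1]))  # a low mean is also raised to 0.2
```

In practice this means the threshold never drops below 0.2, including early in training when matched boxes overlap their targets poorly.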

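4.3 How to Use It

After pasting the code above, go to the commented-out self.bce = ... lines in __init__ of v8DetectionLoss. Uncomment the line for the loss you want, and that loss is the one used for training.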


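5. Summary

That concludes the main content of this article. If you found it useful, take a look at my column on effective YOLOv11 improvements. The column is newly launched and currently has an average quality score of 98. Going forward I will reproduce papers from the latest top conferences and expand on some of the older improvement mechanisms, so subscribe to follow future updates.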