YOLOv5 Improvement Series (27): Adding the SCConv Attention Convolution (CVPR 2023 | A Plug-and-Play Efficient Convolution Module)

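🚀 1. Introduction to SCConv

1.1 Overview of SCConv

Traditional approaches to compressing a network model:

  • network pruning
  • weight quantization
  • low-rank factorization
  • knowledge distillation

Drawback: these methods do reduce the parameter count, but they usually degrade model performance as well.

SCConv (Spatial and Channel reconstruction Convolution) is a plug-and-play module that reduces parameters while improving performance. The authors propose two units, one for the spatial dimension and one for the channel dimension: the spatial reconstruction unit (SRU) and the channel reconstruction unit (CRU). The core idea is to reduce feature redundancy and thereby make the model more efficient.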


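1.2 SCConv network structure

(1) The overall SCConv module

As the figure above shows, the input feature map is first reduced in dimension by a 1x1 convolution and then enters the core of SCConv. The input feature X first passes through the SRU to give the spatially refined feature X^{w}, which then passes through the CRU to give the channel-refined feature Y. Finally, a 1x1 convolution restores the channel count and a residual connection is applied.


(2) SRU (Spatial Reconstruction Unit)

The SRU structure is shown in the figure above. It uses a separate-and-reconstruct approach.

Separate: the goal is to separate the feature maps that carry a lot of information from those that carry little, in correspondence with the spatial content. The authors use the scaling factors of a group normalization (GN) layer to assess how informative each feature map is.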

[Formula images: 0602b42a75d124eb2957739dfb1cd773.png, 06d15d26a9ef46eca876b71e184cdef8.png, f4f4b9655bcfb3d57c5db937ff4043bc.png]

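Here \mu and \sigma are the mean and standard deviation of X, \varepsilon is a small positive constant added to keep the division numerically stable, and \gamma and \beta are the trainable parameters of the affine transformation.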
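For reference, the standard group-normalization form that matches these symbols can be written as below. This is a reconstruction from the symbol definitions, not a copy of the formula images. The second expression, which normalizes the scaling factors into per-channel weights w_{i}, is an assumption based on the statement that the GN scaling factors are used to measure information content. Note that the code listing in this post differs in two details: it adds \varepsilon to the standard deviation rather than to the variance, and it normalizes \gamma with a softmax.

```latex
X_{out} = \mathrm{GN}(X) = \gamma \, \frac{X - \mu}{\sqrt{\sigma^{2} + \varepsilon}} + \beta
\qquad
w_{i} = \frac{\gamma_{i}}{\sum_{j=1}^{C} \gamma_{j}}, \quad i = 1, \dots, C
```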

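Reconstruct: the goal is to add the more informative features to the less informative ones, producing richer features while saving space. A cross-reconstruction operation fully combines the two differently weighted features and strengthens the information flow between them. The cross-reconstructed features X_{1}^{w} and X_{2}^{w} are then concatenated to give the spatially refined feature map X^{w}.

The formulas are shown in the image below:

[Formula image: 3a8c2c2bb4656f0142d02ab2092e8326.png]

Here \bigotimes is element-wise multiplication, \bigoplus is element-wise addition, and \cup is concatenation.

Effect: after the SRU, the informative features have been separated from the less informative ones, which reduces redundant features in the spatial dimension.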

Code implementation

    import torch
    import torch.nn as nn
    import torch.nn.functional as F


    class SRU(nn.Module):
        # Spatial Reconstruction Unit.
        # Depends on GroupBatchnorm2d, defined in the full listing in section 2.2.
        def __init__(self,
                     oup_channels: int,
                     group_num: int = 16,
                     gate_treshold: float = 0.5
                     ):
            super().__init__()
            self.gn = GroupBatchnorm2d(oup_channels, group_num=group_num)
            self.gate_treshold = gate_treshold
            self.sigomid = nn.Sigmoid()

        def forward(self, x):
            gn_x = self.gn(x)
            # Normalize the GN scaling factors across channels
            w_gamma = F.softmax(self.gn.gamma, dim=0)
            reweigts = self.sigomid(gn_x * w_gamma)
            # Gate: split channels into informative / less informative
            info_mask = w_gamma > self.gate_treshold
            noninfo_mask = w_gamma <= self.gate_treshold
            x_1 = info_mask * reweigts * x
            x_2 = noninfo_mask * reweigts * x
            x = self.reconstruct(x_1, x_2)
            return x

        def reconstruct(self, x_1, x_2):
            # Cross-add the channel halves, then concatenate
            x_11, x_12 = torch.split(x_1, x_1.size(1) // 2, dim=1)
            x_21, x_22 = torch.split(x_2, x_2.size(1) // 2, dim=1)
            return torch.cat([x_11 + x_22, x_12 + x_21], dim=1)
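To make the cross-reconstruction step easier to follow, here is a minimal pure-Python sketch of what reconstruct does. It assumes each channel has been reduced to a single number so plain lists can stand in for tensors; the function name cross_reconstruct is only illustrative and is not part of the original code.

```python
def cross_reconstruct(x1, x2):
    """Cross-add the channel halves of x1 and x2, then concatenate.

    x1 and x2 are equal-length lists with an even number of entries,
    one number per channel (a stand-in for the real feature maps).
    """
    half = len(x1) // 2
    x11, x12 = x1[:half], x1[half:]
    x21, x22 = x2[:half], x2[half:]
    first = [a + b for a, b in zip(x11, x22)]   # x_11 + x_22
    second = [a + b for a, b in zip(x12, x21)]  # x_12 + x_21
    return first + second                       # concatenate along channels


x1 = [1.0, 2.0, 3.0, 4.0]      # weighted "informative" part
x2 = [10.0, 20.0, 30.0, 40.0]  # weighted "less informative" part
print(cross_reconstruct(x1, x2))  # [31.0, 42.0, 13.0, 24.0]
```

Every output channel mixes one channel from each part, which is how information flows between the two groups without adding any parameters.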

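(3) CRU (Channel Reconstruction Unit)

The CRU structure is shown in the figure above. It uses a split-transform-fuse approach.

Split: the spatially refined feature X^{w} is first split into two parts, one with \alpha C channels and the other with (1-\alpha)C channels. The channels of both parts are then compressed with 1x1 convolutions, giving X_{up} and X_{low}.

Transform: X_{up} serves as the input of the "rich feature extractor". It goes through GWC (group-wise convolution) and PWC (point-wise convolution) separately, and the two results are added to give the output Y_{1}. X_{low} serves as a supplement to the rich feature extractor: it goes through PWC, and the result is concatenated with the original input to give Y_{2}.

Fuse: a simplified SKNet method adaptively merges Y_{1} and Y_{2}. Specifically, global average pooling first combines the global spatial information with the channel statistics, giving the pooled S_{1} and S_{2}. Softmax is then applied to S_{1} and S_{2} to obtain the feature weight vectors \beta_{1} and \beta_{2}, which are finally used to compute the output:

Y = \beta_{1}Y_{1} + \beta_{2}Y_{2}

Y is the channel-refined feature.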
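The fuse step can be sketched in plain Python as follows. This is a simplified illustration under one assumption: the softmax is taken between the two branches for each channel, so that the two weights of a channel sum to 1. The code listing in this post instead applies the softmax over the concatenated channel dimension. The function name fuse and the list-of-lists layout are hypothetical.

```python
import math


def fuse(y1, y2):
    """Fuse two branches channel by channel.

    y1, y2: lists of channels; each channel is a flat list of pixel values.
    Returns beta1 * y1 + beta2 * y2 with per-channel softmax weights.
    """
    fused = []
    for c1, c2 in zip(y1, y2):
        # Global average pooling: one statistic per channel and branch
        s1 = sum(c1) / len(c1)
        s2 = sum(c2) / len(c2)
        # Numerically stable two-way softmax over (s1, s2)
        m = max(s1, s2)
        e1, e2 = math.exp(s1 - m), math.exp(s2 - m)
        b1, b2 = e1 / (e1 + e2), e2 / (e1 + e2)
        fused.append([b1 * a + b2 * b for a, b in zip(c1, c2)])
    return fused


# Identical branches get weight 0.5 each, so the output equals the input.
print(fuse([[1.0, 3.0]], [[1.0, 3.0]]))  # [[1.0, 3.0]]
```

The branch with the larger pooled response gets the larger weight, which is the adaptive part of the merge.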

Code implementation

    import torch
    import torch.nn as nn
    import torch.nn.functional as F


    class CRU(nn.Module):
        '''
        alpha: 0<alpha<1
        '''

        def __init__(self,
                     op_channel: int,
                     alpha: float = 1 / 2,
                     squeeze_radio: int = 2,
                     group_size: int = 2,
                     group_kernel_size: int = 3,
                     ):
            super().__init__()
            self.up_channel = up_channel = int(alpha * op_channel)
            self.low_channel = low_channel = op_channel - up_channel
            # 1x1 convolutions that compress the two channel groups
            self.squeeze1 = nn.Conv2d(up_channel, up_channel // squeeze_radio, kernel_size=1, bias=False)
            self.squeeze2 = nn.Conv2d(low_channel, low_channel // squeeze_radio, kernel_size=1, bias=False)
            # up
            self.GWC = nn.Conv2d(up_channel // squeeze_radio, op_channel, kernel_size=group_kernel_size, stride=1,
                                 padding=group_kernel_size // 2, groups=group_size)
            self.PWC1 = nn.Conv2d(up_channel // squeeze_radio, op_channel, kernel_size=1, bias=False)
            # low
            self.PWC2 = nn.Conv2d(low_channel // squeeze_radio, op_channel - low_channel // squeeze_radio, kernel_size=1,
                                  bias=False)
            self.advavg = nn.AdaptiveAvgPool2d(1)

        def forward(self, x):
            # Split
            up, low = torch.split(x, [self.up_channel, self.low_channel], dim=1)
            up, low = self.squeeze1(up), self.squeeze2(low)
            # Transform
            Y1 = self.GWC(up) + self.PWC1(up)
            Y2 = torch.cat([self.PWC2(low), low], dim=1)
            # Fuse
            out = torch.cat([Y1, Y2], dim=1)
            out = F.softmax(self.advavg(out), dim=1) * out
            out1, out2 = torch.split(out, out.size(1) // 2, dim=1)
            return out1 + out2
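It can help to trace the channel counts through the CRU by hand. The sketch below only does the arithmetic implied by the constructor arguments above (no tensors involved); the helper name cru_channels and the returned dictionary keys are made up for illustration.

```python
def cru_channels(op_channel, alpha=0.5, squeeze_radio=2):
    """Trace channel counts through the CRU code above (arithmetic only)."""
    up = int(alpha * op_channel)         # channels sent to the upper branch
    low = op_channel - up                # channels sent to the lower branch
    up_sq = up // squeeze_radio          # after squeeze1
    low_sq = low // squeeze_radio        # after squeeze2
    y1 = op_channel                      # GWC and PWC1 both output op_channel
    y2 = (op_channel - low_sq) + low_sq  # PWC2 output concatenated with squeezed low
    fused = y1 + y2                      # concatenation of Y1 and Y2
    out = fused // 2                     # split in half and add the halves
    return {"up": up, "low": low, "up_sq": up_sq, "low_sq": low_sq,
            "y1": y1, "y2": y2, "fused": fused, "out": out}


print(cru_channels(32))
# {'up': 16, 'low': 16, 'up_sq': 8, 'low_sq': 8, 'y1': 32, 'y2': 32, 'fused': 64, 'out': 32}
```

The output width always equals op_channel, which is why the CRU can be dropped in without changing the surrounding layers. One practical constraint from nn.Conv2d: both the squeezed upper width and op_channel must be divisible by group_size for GWC to be constructed.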

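🚀 2. How to Add It

2.1 Order of changes

(1) models/common.py  -->  add the new network structure.
(2) models/yolo.py  -->  define how arguments are passed to the new structure by adding the ScConv class name there. (When the new custom module takes input and output dimensions, use gw to scale the output dimension.)
(3) models/yolov5*.yaml  -->  create a new file, e.g. yolov5s_ScConv.yaml, by modifying an existing model configuration file. (When a new layer is introduced, the from parameters of the layers after it must be updated.)
(4) train.py  -->  change the default of '--cfg' so that training uses the new model configuration file.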
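The width scaling mentioned in step (2) can be illustrated with a few lines of Python. This is a simplified stand-in, under the assumption that the output width from the yaml file is multiplied by the width multiplier (gw in yolo.py, taken from width_multiple) and rounded up to a multiple of 8; the function names here are written for this sketch and are not imported from YOLOv5.

```python
import math


def make_divisible(x, divisor=8):
    """Round x up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor


def scaled_out_channels(c2, gw, divisor=8):
    """Output channels after applying the width multiplier gw."""
    return make_divisible(c2 * gw, divisor)


print(scaled_out_channels(1024, 0.5))  # 512
print(scaled_out_channels(256, 0.25))  # 64
print(scaled_out_channels(1024, 1))    # 1024
```

With width_multiple set to 1, as in the yaml files in this post, the channel numbers written in the file are used unchanged.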


2.2 Detailed steps

Step ①: add the ScConv module to common.py

Copy the ScConv code below and paste it at the end of common.py.

    # ScConv
    import torch
    import torch.nn as nn
    import torch.nn.functional as F  # F.softmax is used below; add this if common.py does not import it


    # NOTE: recent YOLOv5 versions already define autopad and Conv in common.py;
    # if yours does, these two copies can be left out.
    def autopad(k, p=None, d=1):  # kernel, padding, dilation
        # Pad to 'same' shape outputs
        if d > 1:
            k = d * (k - 1) + 1 if isinstance(k, int) else [d * (x - 1) + 1 for x in k]  # actual kernel-size
        if p is None:
            p = k // 2 if isinstance(k, int) else [x // 2 for x in k]  # auto-pad
        return p


    class Conv(nn.Module):
        # Standard convolution with args(ch_in, ch_out, kernel, stride, padding, groups, dilation, activation)
        default_act = nn.SiLU()  # default activation

        def __init__(self, c1, c2, k=1, s=1, p=None, g=1, d=1, act=True):
            super().__init__()
            self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p, d), groups=g, dilation=d, bias=False)
            self.bn = nn.BatchNorm2d(c2)
            self.act = self.default_act if act is True else act if isinstance(act, nn.Module) else nn.Identity()

        def forward(self, x):
            return self.act(self.bn(self.conv(x)))

        def forward_fuse(self, x):
            return self.act(self.conv(x))


    class GroupBatchnorm2d(nn.Module):
        def __init__(self, c_num: int,
                     group_num: int = 16,
                     eps: float = 1e-10
                     ):
            super(GroupBatchnorm2d, self).__init__()
            assert c_num >= group_num
            self.group_num = group_num
            self.gamma = nn.Parameter(torch.randn(c_num, 1, 1))
            self.beta = nn.Parameter(torch.zeros(c_num, 1, 1))
            self.eps = eps

        def forward(self, x):
            N, C, H, W = x.size()
            # Normalize within each group, then restore the original shape
            x = x.view(N, self.group_num, -1)
            mean = x.mean(dim=2, keepdim=True)
            std = x.std(dim=2, keepdim=True)
            x = (x - mean) / (std + self.eps)
            x = x.view(N, C, H, W)
            return x * self.gamma + self.beta


    class SRU(nn.Module):
        def __init__(self,
                     oup_channels: int,
                     group_num: int = 16,
                     gate_treshold: float = 0.5
                     ):
            super().__init__()
            self.gn = GroupBatchnorm2d(oup_channels, group_num=group_num)
            self.gate_treshold = gate_treshold
            self.sigomid = nn.Sigmoid()

        def forward(self, x):
            gn_x = self.gn(x)
            w_gamma = F.softmax(self.gn.gamma, dim=0)
            reweigts = self.sigomid(gn_x * w_gamma)
            # Gate
            info_mask = w_gamma > self.gate_treshold
            noninfo_mask = w_gamma <= self.gate_treshold
            x_1 = info_mask * reweigts * x
            x_2 = noninfo_mask * reweigts * x
            x = self.reconstruct(x_1, x_2)
            return x

        def reconstruct(self, x_1, x_2):
            x_11, x_12 = torch.split(x_1, x_1.size(1) // 2, dim=1)
            x_21, x_22 = torch.split(x_2, x_2.size(1) // 2, dim=1)
            return torch.cat([x_11 + x_22, x_12 + x_21], dim=1)


    class CRU(nn.Module):
        '''
        alpha: 0<alpha<1
        '''

        def __init__(self,
                     op_channel: int,
                     alpha: float = 1 / 2,
                     squeeze_radio: int = 2,
                     group_size: int = 2,
                     group_kernel_size: int = 3,
                     ):
            super().__init__()
            self.up_channel = up_channel = int(alpha * op_channel)
            self.low_channel = low_channel = op_channel - up_channel
            self.squeeze1 = nn.Conv2d(up_channel, up_channel // squeeze_radio, kernel_size=1, bias=False)
            self.squeeze2 = nn.Conv2d(low_channel, low_channel // squeeze_radio, kernel_size=1, bias=False)
            # up
            self.GWC = nn.Conv2d(up_channel // squeeze_radio, op_channel, kernel_size=group_kernel_size, stride=1,
                                 padding=group_kernel_size // 2, groups=group_size)
            self.PWC1 = nn.Conv2d(up_channel // squeeze_radio, op_channel, kernel_size=1, bias=False)
            # low
            self.PWC2 = nn.Conv2d(low_channel // squeeze_radio, op_channel - low_channel // squeeze_radio,
                                  kernel_size=1, bias=False)
            self.advavg = nn.AdaptiveAvgPool2d(1)

        def forward(self, x):
            # Split
            up, low = torch.split(x, [self.up_channel, self.low_channel], dim=1)
            up, low = self.squeeze1(up), self.squeeze2(low)
            # Transform
            Y1 = self.GWC(up) + self.PWC1(up)
            Y2 = torch.cat([self.PWC2(low), low], dim=1)
            # Fuse
            out = torch.cat([Y1, Y2], dim=1)
            out = F.softmax(self.advavg(out), dim=1) * out
            out1, out2 = torch.split(out, out.size(1) // 2, dim=1)
            return out1 + out2


    class ScConv(nn.Module):
        def __init__(self,
                     op_channel: int,
                     group_num: int = 16,
                     gate_treshold: float = 0.5,
                     alpha: float = 1 / 2,
                     squeeze_radio: int = 2,
                     group_size: int = 2,
                     group_kernel_size: int = 3,
                     ):
            super().__init__()
            self.SRU = SRU(op_channel,
                           group_num=group_num,
                           gate_treshold=gate_treshold)
            self.CRU = CRU(op_channel,
                           alpha=alpha,
                           squeeze_radio=squeeze_radio,
                           group_size=group_size,
                           group_kernel_size=group_kernel_size)

        def forward(self, x):
            x = self.SRU(x)
            x = self.CRU(x)
            return x


    class C3_ScConv(nn.Module):
        # CSP Bottleneck with 3 convolutions
        def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):  # ch_in, ch_out, number, shortcut, groups, expansion
            super().__init__()
            c_ = int(c2 * e)  # hidden channels
            self.cv1 = Conv(c1, c_, 1, 1)
            self.cv2 = Conv(c1, c_, 1, 1)
            self.cv3 = Conv(2 * c_, c2, 1)  # optional act=FReLU(c2)
            self.m = nn.Sequential(*(ScConv(c_) for _ in range(n)))

        def forward(self, x):
            return self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), 1))


    if __name__ == '__main__':
        # Quick shape check: the output should have the same shape as the input
        x = torch.randn(1, 32, 16, 16)
        model = ScConv(32)
        print(model(x).shape)

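Step ②: modify yolo.py

First, find this line in the parse_model function in yolo.py.

Add the two modules ScConv and C3_ScConv to it.


Step ③: create a custom yaml file

Variant 1: add a separate layer before SPPF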

    # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
    # Parameters
    nc: 80  # number of classes
    depth_multiple: 0.33  # model depth multiple
    width_multiple: 1  # layer channel multiple
    anchors:
      - [10,13, 16,30, 33,23]  # P3/8
      - [30,61, 62,45, 59,119]  # P4/16
      - [116,90, 156,198, 373,326]  # P5/32
    # YOLOv5 v6.0 backbone
    backbone:
      # [from, number, module, args]
      [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
       [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
       [-1, 3, C3, [128]],
       [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
       [-1, 6, C3, [256]],
       [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
       [-1, 9, C3, [512]],
       [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
       [-1, 3, C3, [1024]],
       [-1, 3, ScConv, [1024]],  # 9 (new layer)
       [-1, 1, SPPF, [1024, 5]],  # 10
      ]
    # YOLOv5 v6.0 head
    head:
      [[-1, 1, Conv, [512, 1, 1]],  # 11
       [-1, 1, nn.Upsample, [None, 2, 'nearest']],
       [[-1, 6], 1, Concat, [1]],  # cat backbone P4
       [-1, 3, C3, [512]],  # 14
       [-1, 1, Conv, [256, 1, 1]],  # 15
       [-1, 1, nn.Upsample, [None, 2, 'nearest']],
       [[-1, 4], 1, Concat, [1]],  # cat backbone P3
       [-1, 3, C3, [256, False]],  # 18 (P3/8-small)
       [-1, 1, Conv, [256, 3, 2]],
       [[-1, 15], 1, Concat, [1]],  # cat head P4 (was 14 before the new layer)
       [-1, 3, C3, [512, False]],  # 21 (P4/16-medium)
       [-1, 1, Conv, [512, 3, 2]],
       [[-1, 11], 1, Concat, [1]],  # cat head P5 (was 10 before the new layer)
       [-1, 3, C3, [1024, False]],  # 24 (P5/32-large)
       [[18, 21, 24], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
      ]
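When a layer is inserted into the backbone, every absolute from index that points at or beyond the insertion point moves up by one, while relative indices such as -1 stay as they are. The small sketch below shows that bookkeeping for a layer inserted at index 9 of the stock YOLOv5 v6.0 layout; shift_from is a hypothetical helper written for this post, not a YOLOv5 function.

```python
def shift_from(frm, insert_at):
    """Shift absolute layer references after inserting one layer at insert_at.

    frm is an int or a list of ints, as in the `from` field of a yaml row.
    Negative (relative) references are returned unchanged.
    """
    if isinstance(frm, list):
        return [shift_from(f, insert_at) for f in frm]
    return frm + 1 if frm >= insert_at else frm


# Rows of the original head that use absolute references
print(shift_from([-1, 6], 9))        # [-1, 6]   backbone P4, before the new layer
print(shift_from([-1, 14], 9))       # [-1, 15]  cat head P4
print(shift_from([-1, 10], 9))       # [-1, 11]  cat head P5
print(shift_from([17, 20, 23], 9))   # [18, 21, 24]  Detect inputs
```

A wrong index does not necessarily raise an error, because the Concat can still succeed when the spatial sizes happen to match, so it is worth checking these rows by hand.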

Variant 2: replace a convolution structure (here the last C3 of the backbone is replaced by ScConv)

    # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
    # Parameters
    nc: 80  # number of classes
    depth_multiple: 0.33  # model depth multiple
    width_multiple: 1  # layer channel multiple
    anchors:
      - [10,13, 16,30, 33,23]  # P3/8
      - [30,61, 62,45, 59,119]  # P4/16
      - [116,90, 156,198, 373,326]  # P5/32
    # YOLOv5 v6.0 backbone
    backbone:
      # [from, number, module, args]
      [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
       [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
       [-1, 3, C3, [128]],
       [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
       [-1, 6, C3, [256]],
       [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
       [-1, 9, C3, [512]],
       [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
       [-1, 3, ScConv, [1024]],
       [-1, 1, SPPF, [1024, 5]],  # 9
      ]
    # YOLOv5 v6.0 head
    head:
      [[-1, 1, Conv, [512, 1, 1]],
       [-1, 1, nn.Upsample, [None, 2, 'nearest']],
       [[-1, 6], 1, Concat, [1]],  # cat backbone P4
       [-1, 3, C3, [512]],  # 13
       [-1, 1, Conv, [256, 1, 1]],
       [-1, 1, nn.Upsample, [None, 2, 'nearest']],
       [[-1, 4], 1, Concat, [1]],  # cat backbone P3
       [-1, 3, C3, [256, False]],  # 17 (P3/8-small)
       [-1, 1, Conv, [256, 3, 2]],
       [[-1, 14], 1, Concat, [1]],  # cat head P4
       [-1, 3, C3, [512, False]],  # 20 (P4/16-medium)
       [-1, 1, Conv, [512, 3, 2]],
       [[-1, 10], 1, Concat, [1]],  # cat head P5
       [-1, 3, C3, [1024, False]],  # 23 (P5/32-large)
       [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
      ]

Variant 3: replace the C3 modules

    # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
    # Parameters
    nc: 80  # number of classes
    depth_multiple: 0.33  # model depth multiple
    width_multiple: 1  # layer channel multiple
    anchors:
      - [10,13, 16,30, 33,23]  # P3/8
      - [30,61, 62,45, 59,119]  # P4/16
      - [116,90, 156,198, 373,326]  # P5/32
    # YOLOv5 v6.0 backbone
    backbone:
      # [from, number, module, args]
      [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
       [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
       [-1, 3, C3_ScConv, [128]],
       [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
       [-1, 6, C3_ScConv, [256]],
       [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
       [-1, 9, C3_ScConv, [512]],
       [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
       [-1, 3, Conv, [1024]],
       [-1, 1, SPPF, [1024, 5]],  # 9
      ]
    # YOLOv5 v6.0 head
    head:
      [[-1, 1, Conv, [512, 1, 1]],
       [-1, 1, nn.Upsample, [None, 2, 'nearest']],
       [[-1, 6], 1, Concat, [1]],  # cat backbone P4
       [-1, 3, C3_ScConv, [512]],  # 13
       [-1, 1, Conv, [256, 1, 1]],
       [-1, 1, nn.Upsample, [None, 2, 'nearest']],
       [[-1, 4], 1, Concat, [1]],  # cat backbone P3
       [-1, 3, C3_ScConv, [256, False]],  # 17 (P3/8-small)
       [-1, 1, Conv, [256, 3, 2]],
       [[-1, 14], 1, Concat, [1]],  # cat head P4
       [-1, 3, C3_ScConv, [512, False]],  # 20 (P4/16-medium)
       [-1, 1, Conv, [512, 3, 2]],
       [[-1, 10], 1, Concat, [1]],  # cat head P5
       [-1, 3, C3_ScConv, [1024, False]],  # 23 (P5/32-large)
       [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
      ]

Step ④: verify that the module was added successfully

Run yolo.py.

Variant 1

Variant 2

Variant 3

And that's it!