YOLOv5 Improvement Series (10): Replacing the Backbone with GhostNet

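🚀 1. Introduction to GhostNet

1.1 Overview

Compared with an ordinary convolutional neural network, GhostNet uses fewer parameters to generate feature maps. Its motivation is the redundancy found among the feature maps of a neural network. Feature maps are redundant to some degree, yet this redundancy also strengthens the network's understanding of the features, so for a successful model the redundant feature maps are indispensable. Whereas some lightweight networks remove the redundant feature maps, GhostNet keeps them and generates them in a low-cost way.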

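1.2 Basic unit

Although ShuffleNet and MobileNet use 1x1 pointwise convolutions to reduce the parameter count, the GhostNet authors argue that 1x1 convolutions still incur a noticeable amount of computation. They also observe that many convolutional networks ignore the feature redundancy that appears after several rounds of convolution. To address both issues, the authors propose the Ghost basic unit.

Instead of producing every feature map with an ordinary convolution, the Ghost basic unit generates part of the feature maps through a series of cheap linear transformations, which reduces the network's computation.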

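Figure a shows how a traditional convolution generates feature maps, and figure b shows how the Ghost module does.

As figure b shows, suppose the input feature map has shape [5,5,6]. A 1x1 convolution first reduces the channel count, giving shape [5,5,3]. A 3x3 depthwise convolution then extracts features from each channel separately, again with shape [5,5,3]; this output can be viewed as a series of linear transformations of the previous layer. Finally, the outputs of the two convolutions are stacked along the channel dimension, giving shape [5,5,6].

With low computational complexity and few parameters, the Ghost module produces a feature map of the same size as a standard convolution would.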
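
To put rough numbers on the [5,5,6] example above, here is a minimal sketch that counts weights and multiply-adds. It assumes the baseline "standard" convolution is a bias-free 3x3 convolution from 6 to 6 channels with stride 1 and same padding; the source does not state the baseline kernel size, so that choice is an assumption. The helper names conv_weights and ghost_weights are made up for this illustration.

```python
def conv_weights(c_in, c_out, k, groups=1):
    """Number of weights in a bias-free 2D convolution."""
    return (c_in // groups) * c_out * k * k


def ghost_weights(c_in, c_out, primary_k=1, cheap_k=3):
    """Weights of a Ghost module that produces half of the output channels
    with a primary convolution and the other half with a depthwise one."""
    c_half = c_out // 2
    primary = conv_weights(c_in, c_half, primary_k)                # 1x1 conv: 6 -> 3
    cheap = conv_weights(c_half, c_half, cheap_k, groups=c_half)   # 3x3 depthwise on 3 channels
    return primary + cheap


h, w = 5, 5                        # spatial size from the example above
standard = conv_weights(6, 6, 3)   # assumed 3x3 baseline: 6 * 6 * 3 * 3
ghost = ghost_weights(6, 6)        # 6 * 3 * 1 * 1 + 3 * 3 * 3

print("weights:", standard, "vs", ghost)
# With stride 1 and same padding, every weight is applied once per output position.
print("multiply-adds:", standard * h * w, "vs", ghost * h * w)
```

Under these assumptions the Ghost module needs 45 weights where the 3x3 convolution needs 324. The gap shrinks if the baseline is itself a 1x1 convolution, so the saving depends on which layer the module replaces.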

1.3 Network structure

The whole of GhostNet is built from Ghost Bottlenecks.

When an image is fed into GhostNet:

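(1) It first passes through an ordinary 16-channel 3x3 convolution block with stride 2 (convolution + normalization + activation function).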

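(2) Ghost Bottlenecks are then stacked, finally producing a 7x7x160 feature layer (for a 224x224x3 input).

(3) A 1x1 convolution block then adjusts the channel count, yielding a 7x7x960 feature layer.

(4) Global average pooling follows, and another 1x1 convolution block adjusts the channel count to give a 1x1x1280 feature layer.

(5) Finally, the result is flattened and passed through a fully connected layer for classification.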
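
The spatial sizes in the steps above can be checked with the usual convolution output formula. The sketch below assumes a stride-2 stem followed by four stride-2 bottleneck stages, each using a 3x3 kernel with padding 1. That is the layout used by the YAML configuration later in this post, not something spelled out in the steps themselves.

```python
def conv_out(size, kernel=3, stride=1, padding=1):
    """Output side length of a convolution: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


size = conv_out(224, stride=2)        # stem convolution halves the input
sizes = [size]
for _ in range(4):                    # assumed: four stride-2 Ghost Bottleneck stages
    size = conv_out(size, stride=2)
    sizes.append(size)
print(sizes)

# Shapes after the bottleneck stack, following steps (2) to (4).
shapes = [
    (size, size, 160),   # output of the Ghost Bottleneck stack
    (size, size, 960),   # after the 1x1 convolution block
    (1, 1, 960),         # after global average pooling
    (1, 1, 1280),        # after the second 1x1 convolution block
]
for shape in shapes:
    print(shape)
```

The total stride is 2 to the power 5, which is 32, so a 224x224 input ends at 7x7 as stated in step (2).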

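🚀 2. Combining YOLOv5 with GhostNet

2.1 Order of modifications

When we covered adding attention mechanisms earlier, we introduced the order in which to modify the network. Replacing the backbone follows much the same steps.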

(1) models/common.py --> add the new network structure

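(2) models/yolo.py --> set up how arguments are passed to the new structure by adding the GhostNet class name to parse_model. (When the new custom module takes input and output channel dimensions, use gw, the width multiple, to scale the output dimension.)

(3) models/yolov5*.yaml --> modify the existing model configuration file

When new layers are introduced, update the from parameter of the layers that follow.
When only the backbone is replaced, pay attention to how the feature map sizes change: /8, /16, /32.

(4) train.py --> change the default of the '--cfg' argument so that training uses the new model configuration file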

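2.2 Detailed steps

Step ①: Add the GhostNet module to common.py

This time is a bit special, because in the latest YOLOv5 6.1 source the author has already added Ghost modules to models/common.py. The SeBlock and G_bneck classes below build on the GhostConv defined there.

(They go right below the Focus class.)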

# Ghost
# Place in models/common.py, which already imports torch.nn as nn and defines Conv and GhostConv.
class SeBlock(nn.Module):
    # Squeeze-and-Excitation channel attention
    def __init__(self, in_channel, reduction=4):
        super().__init__()
        self.Squeeze = nn.AdaptiveAvgPool2d(1)  # global average pool to 1x1 per channel
        self.Excitation = nn.Sequential()
        self.Excitation.add_module('FC1', nn.Conv2d(in_channel, in_channel // reduction, kernel_size=1))  # on a 1x1 map, a 1x1 conv acts as a fully connected layer
        self.Excitation.add_module('ReLU', nn.ReLU())
        self.Excitation.add_module('FC2', nn.Conv2d(in_channel // reduction, in_channel, kernel_size=1))
        self.Excitation.add_module('Sigmoid', nn.Sigmoid())

    def forward(self, x):
        y = self.Squeeze(x)
        output = self.Excitation(y)
        return x * (output.expand_as(x))  # rescale each channel by its attention weight


class G_bneck(nn.Module):
    # Ghost Bottleneck https://github.com/huawei-noah/ghostnet
    def __init__(self, c1, c2, midc, k=5, s=1, use_se=False):  # ch_in, ch_out, ch_mid, kernel, stride, use_se
        super().__init__()
        assert s in [1, 2]
        c_ = midc
        self.conv = nn.Sequential(
            GhostConv(c1, c_, 1, 1),  # expansion
            Conv(c_, c_, k, s=s, p=k // 2, g=c_, act=False) if s == 2 else nn.Identity(),  # depthwise conv, only when downsampling
            SeBlock(c_) if use_se else nn.Identity(),  # optional Squeeze-and-Excite
            GhostConv(c_, c2, 1, 1, act=False))  # linear pointwise projection
        # Identity when the shapes already match; otherwise a depthwise + pointwise conv
        # brings the shortcut to the same stride and channel count as the main branch.
        self.shortcut = nn.Identity() if (c1 == c2 and s == 1) else nn.Sequential(
            Conv(c1, c1, k, s=s, p=k // 2, g=c1, act=False),
            Conv(c1, c2, 1, 1, act=False))

    def forward(self, x):
        return self.conv(x) + self.shortcut(x)

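As shown in the figure below:

Step ②: Add the class name to the parse_model function in yolo.py

First locate the line in the parse_model function of yolo.py that lists the module classes taking channel arguments (the check of the form "if m in [Conv, GhostConv, ...]").

Add the G_bneck module to that list.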
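
To see why this registration matters, here is a simplified, self-contained imitation of what that branch of parse_model does with a YAML row. It is a sketch, not the actual yolo.py code: resolve_args and CHANNEL_AWARE are hypothetical names, and the real function also special-cases the detection output width and the C3-style repeat count.

```python
import math


def make_divisible(x, divisor=8):
    """Round x up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor


# Hypothetical stand-in for the module list checked inside parse_model.
CHANNEL_AWARE = {"Conv", "GhostConv", "C3", "G_bneck"}


def resolve_args(module, c_in, args, gw=1.0):
    """Return the constructor arguments a module would receive for one YAML row."""
    if module not in CHANNEL_AWARE:
        return list(args)                     # passed through as written in the YAML
    c_out = make_divisible(args[0] * gw, 8)   # only the first argument is width-scaled
    return [c_in, c_out, *args[1:]]           # input channels are prepended


# Row "[-1, 1, G_bneck, [24, 48, 3, 2]]" fed by a 16-channel layer.
print(resolve_args("G_bneck", 16, [24, 48, 3, 2]))           # c1, c2, midc, k, s
print(resolve_args("G_bneck", 16, [24, 48, 3, 2], gw=0.5))   # narrower model
```

If G_bneck were missing from the list, its arguments would be passed through unchanged and the constructor would read 24 as c1 and 48 as c2. Note also that in this sketch the middle channel count (48) is not scaled by gw, so a width_multiple other than 1.0 would narrow the outputs but not the expansion width.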

Step ③: Create a custom yaml file

Likewise, the models/hub/ folder already provides a yolov5s-ghost.yaml file, so we can use it directly.

(Did you think this post was going to coast to the finish that easily? ✧ (≖ ‿ ≖)✧ ...

Of course not! :.ﾟヽ(｡◕‿◕｡)ﾉﾟ.:｡+ﾟ)

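The following draws on code from another author.

Next, let's go through how to write yolo5l_GhostNet.yaml.

First, copy the yolov5l.yaml file in the models folder, paste it, and rename the copy to yolo5l_GhostNet.yaml.

Then modify the configuration file according to the GhostNet network structure.

The complete code is as follows: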

# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# Parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32
# Ghostnet backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [16, 3, 2, 1]],            # 0-P1/2  ch_out, kernel, stride, padding, groups 224*224*3
   [-1, 1, G_bneck, [16, 16, 3, 1]],        # 1  ch_out, ch_mid, dw-kernel, stride 112*112*16
   [-1, 1, G_bneck, [24, 48, 3, 2]],        # 2-P2/4   112*112*16
   [-1, 1, G_bneck, [24, 72, 3, 1]],        # 3         56*56*24
   [-1, 1, G_bneck, [40, 72, 3, 2, True]],  # 4-P3/8    56*56*24
   [-1, 1, G_bneck, [40, 120, 3, 1, True]], # 5         28*28*40
   [-1, 1, G_bneck, [80, 240, 3, 2]],        # 6-P4/16  28*28*40
   [-1, 3, G_bneck, [80, 184, 3, 1]],        # 7        14*14*80
   [-1, 1, G_bneck, [112, 480, 3, 1, True]], # 8        14*14*80
   [-1, 1, G_bneck, [112, 480, 3, 1, True]], # 9        14*14*80
   [-1, 1, G_bneck, [160, 672, 3, 2, True]], # 10-P5/32 14*14*112
   [-1, 1, G_bneck, [160, 960, 3, 1]],       # 11        7*7*160
   [-1, 1, G_bneck, [160, 960, 3, 1, True]], # 12        7*7*160
   [-1, 1, G_bneck, [160, 960, 3, 1]],       # 13        7*7*160
   [-1, 1, G_bneck, [160, 960, 3, 1, True]], # 14        7*7*160
   [-1, 1, Conv, [960]],                     # 15        7*7*160
  ]
# YOLOv5 v6.0 head
head:
  [[-1, 1, Conv, [512, 1, 1]], # 16
   [-1, 1, nn.Upsample, [None, 2, 'nearest']], # 17
   [[-1, 9], 1, Concat, [1]],  # 18 cat backbone P4
   [-1, 3, C3, [512, False]],  # 19
   [-1, 1, Conv, [256, 1, 1]], # 20
   [-1, 1, nn.Upsample, [None, 2, 'nearest']], # 21
   [[-1, 5], 1, Concat, [1]],  # 22 cat backbone P3
   [-1, 3, C3, [256, False]],  # 23 (P3/8-small)
   [-1, 1, Conv, [256, 3, 2]], # 24
   [[-1, 20], 1, Concat, [1]], # 25 cat head P4
   [-1, 3, C3, [512, False]],  # 26 (P4/16-medium)
   [-1, 1, Conv, [512, 3, 2]],  # 27
   [[-1, 15], 1, Concat, [1]],  # 28 cat P5 (layer 15 is the backbone output Conv)
   [-1, 3, C3, [1024, False]],  # 29 (P5/32-large)
   [[23, 26, 29], 1, Detect, [nc, anchors]],  # 30 Detect(P3, P4, P5)
  ]
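
As a quick sanity check on the from indices, the sketch below multiplies up the per-layer strides of the backbone and reports the last layer at each of the /8, /16 and /32 scales. The stride list is read off the backbone rows above by hand (layer 7 counts once even though it repeats three times), and last_layer_at is a helper name invented for this illustration.

```python
# Stride of backbone layers 0 to 15, read off the YAML rows above.
backbone_strides = [2, 1, 2, 1, 2, 1, 2, 1, 1, 1, 2, 1, 1, 1, 1, 1]


def last_layer_at(strides, targets=(8, 16, 32)):
    """Map each target downsampling factor to the last layer index that has it."""
    total = 1
    last = {}
    for idx, stride in enumerate(strides):
        total *= stride            # cumulative downsampling after this layer
        if total in targets:
            last[total] = idx      # later layers at the same scale overwrite earlier ones
    return last


scales = last_layer_at(backbone_strides)
print(scales)
```

The result points at layers 5, 9 and 15, which are the backbone layers the head concatenates with in rows 22, 18 and 28. The same check is worth rerunning whenever a backbone row is added or removed, since every later index shifts.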