学习资源站

RT-DETR改进策略【注意力机制篇】2024SCITOPFCAttention即插即用注意力模块,增强局部和全局特征信息交互-

RT-DETR改进策略【注意力机制篇】| 2024 SCI TOP FCAttention 即插即用注意力模块,增强局部和全局特征信息交互

一、本文介绍

本文记录的是 基于FCAttention模块的RT-DETR目标检测改进方法研究 FCAttention 是图像去雾领域新提出的模块能够 有效整合全局和局部信息、合理分配权重的通道注意力机制,使得网络能够更准确地强调有用特征,抑制不太有用的特征 ,在目标检测领域中同样有效。



二、FCA原理

用于图像去雾的无监督双向对比重建和自适应细粒度通道注意网络

FCA(Adaptive Fine - Grained Channel Attention) 模块设计的原理及优势如下:

2.1 原理

  • 特征图处理 :首先,对包含全局空间信息的特征图F进行全局平均池化,将其转换为通道描述符U,用于获取通道信息。具体公式为: U n = G A P ( F n ) = 1 H × W ∑ i = 1 H ∑ j = 1 W F n ( i , j ) U_{n}=GAP(F_{n})=\frac{1}{H×W}\sum_{i=1}^{H}\sum_{j=1}^{W}F_{n}(i, j) U n = G A P ( F n ) = H × W 1 i = 1 H j = 1 W F n ( i , j ) ,其中 F ∈ R C × H × W F \in \mathbb{R}^{C×H×W} F R C × H × W C C C H H H W W W 分别代表通道数、长度和宽度, U ∈ R C U \in \mathbb{R}^{C} U R C G A P ( x ) GAP(x) G A P ( x ) 为全局平均 pooling 函数。
  • 局部信息获取 :为了在获取少量模型参数的同时获得局部通道信息,使用带矩阵B进行局部通道交互,设置 B = [ b 1 , b 2 , b 3 , . . . , b k ] B=[b_{1}, b_{2}, b_{3},..., b_{k}] B = [ b 1 , b 2 , b 3 , ... , b k ] ,通过 U l c = ∑ i = 1 k U ⋅ b i U_{lc}=\sum_{i=1}^{k}U\cdot b_{i} U l c = i = 1 k U b i 计算局部信息 U l c U_{lc} U l c ,其中 U U U 为通道描述符, k k k 为相邻通道数。
  • 全局信息获取 :利用对角矩阵D捕获所有通道之间的依赖关系作为全局信息,设置 D = [ d 1 , d 2 , d 3 , . . . , d c ] D=[d_{1}, d_{2}, d_{3},..., d_{c}] D = [ d 1 , d 2 , d 3 , ... , d c ] ,通过 U g c = ∑ i = 1 c U ⋅ d i U_{gc}=\sum_{i=1}^{c}U\cdot d_{i} U g c = i = 1 c U d i 计算全局信息 U g c U_{gc} U g c ,其中 c c c 为通道数。
  • 相关性捕获 :通过交叉相关操作将全局信息 U g c U_{gc} U g c 与局部信息 U l c U_{lc} U l c 相结合,得到相关性矩阵 M = U g c ⋅ U l c T M = U_{gc}\cdot U_{lc}^{T} M = U g c U l c T ,以捕获两者在不同粒度上的相关性。
  • 自适应融合 :从相关性矩阵及其转置中提取行和列信息作为全局和局部信息的权重向量,通过可学习因子实现动态融合。具体公式为: U g c w = ∑ j c M i , j , i ∈ 1 , 2 , 3... c U_{gc}^{w}=\sum_{j}^{c}M_{i, j}, i \in 1,2,3...c U g c w = j c M i , j , i 1 , 2 , 3... c U l c w = ∑ j c ( U l c ⋅ U g c T ) i , j = ∑ j c M i , j T , i ∈ 1 , 2 , 3... c U_{lc}^{w}=\sum_{j}^{c}(U_{lc}\cdot U_{gc}^{T})_{i, j}=\sum_{j}^{c}M^{T}_{i, j}, i \in 1,2,3...c U l c w = j c ( U l c U g c T ) i , j = j c M i , j T , i 1 , 2 , 3... c W = σ ( σ ( θ ) × σ ( U g c w ) + ( 1 − σ ( θ ) ) × σ ( U l c w ) ) W=\sigma(\sigma(\theta)×\sigma(U_{gc}^{w})+(1 - \sigma(\theta))×\sigma(U_{lc}^{w})) W = σ ( σ ( θ ) × σ ( U g c w ) + ( 1 σ ( θ )) × σ ( U l c w )) ,其中 KaTeX parse error: Can't use function '\)' in math mode at position 11: U_{gc}^{w}\̲)̲和\(U_{lc}^{w} 为融合后的全局和局部通道权重, θ \theta θ 表示sigmoid激活函数。
  • 权重应用 :将得到的权重与输入特征图相乘,得到最终输出特征图,即 τ ∗ = W ⊗ F \tau^{*}=W \otimes F τ = W F ,其中 F F F 为输入特征图, F ∗ F^{*} F 为最终输出特征图。

在这里插入图片描述

2.2 优势

  • 有效整合信息 :能够有效整合全局和局部信息,通过相关性矩阵捕获两者在不同粒度上的相关性,促进了全局和局部信息的有效交互。
  • 合理分配权重 :采用自适应融合策略,避免了局部和全局信息之间冗余的交叉相关操作,进一步促进了它们的交互,能够更精确地为去雾相关特征分配权重。
  • 提升去雾性能 :在网络去雾过程中,充分利用全局和局部通道信息,提高了网络去雾性能,使得网络能够更准确地强调有用特征,抑制不太有用的特征。

论文: https://doi.org/10.1016/j.neunet.2024.106314
源码: https://github.com/Lose-Code/UBRFC-Net

三、FCAttention的实现代码

FCAttention模块 的实现代码如下:

import math
import torch
import torch.nn as nn

from ultralytics.nn.modules.conv import LightConv

class Mix(nn.Module):
    def __init__(self, m=-0.80):
        super(Mix, self).__init__()
        w = torch.nn.Parameter(torch.FloatTensor([m]), requires_grad=True)
        w = torch.nn.Parameter(w, requires_grad=True)
        self.w = w
        self.mix_block = nn.Sigmoid()

    def forward(self, fea1, fea2):
        mix_factor = self.mix_block(self.w)
        out = fea1 * mix_factor.expand_as(fea1) + fea2 * (1 - mix_factor.expand_as(fea2))
        return out

class FCAttention(nn.Module):
    def __init__(self,channel,b=1, gamma=2):
        super(FCAttention, self).__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)#全局平均池化
        #一维卷积
        t = int(abs((math.log(channel, 2) + b) / gamma))
        k = t if t % 2 else t + 1
        self.conv1 = nn.Conv1d(1, 1, kernel_size=k, padding=int(k / 2), bias=False)
        self.fc = nn.Conv2d(channel, channel, 1, padding=0, bias=True)
        self.sigmoid = nn.Sigmoid()
        self.mix = Mix()

    def forward(self, input):
        x = self.avg_pool(input)
        x1 = self.conv1(x.squeeze(-1).transpose(-1, -2)).transpose(-1, -2)#(1,64,1)
        x2 = self.fc(x).squeeze(-1).transpose(-1, -2)#(1,1,64)
        out1 = torch.sum(torch.matmul(x1,x2),dim=1).unsqueeze(-1).unsqueeze(-1)#(1,64,1,1)
        #x1 = x1.transpose(-1, -2).unsqueeze(-1)
        out1 = self.sigmoid(out1)
        out2 = torch.sum(torch.matmul(x2.transpose(-1, -2),x1.transpose(-1, -2)),dim=1).unsqueeze(-1).unsqueeze(-1)

        #out2 = self.fc(x)
        out2 = self.sigmoid(out2)
        out = self.mix(out1,out2)
        out = self.conv1(out.squeeze(-1).transpose(-1, -2)).transpose(-1, -2).unsqueeze(-1)
        out = self.sigmoid(out)

        return input*out
def autopad(k, p=None, d=1):  # kernel, padding, dilation
    """Pad to 'same' shape outputs."""
    if d > 1:
        k = d * (k - 1) + 1 if isinstance(k, int) else [d * (x - 1) + 1 for x in k]  # actual kernel-size
    if p is None:
        p = k // 2 if isinstance(k, int) else [x // 2 for x in k]  # auto-pad
    return p

class Conv(nn.Module):
    """Standard convolution with args(ch_in, ch_out, kernel, stride, padding, groups, dilation, activation)."""
 
    default_act = nn.SiLU()  # default activation
 
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, d=1, act=True):
        """Initialize Conv layer with given arguments including activation."""
        super().__init__()
        self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p, d), groups=g, dilation=d, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        self.act = self.default_act if act is True else act if isinstance(act, nn.Module) else nn.Identity()
 
    def forward(self, x):
        """Apply convolution, batch normalization and activation to input tensor."""
        return self.act(self.bn(self.conv(x)))
 
    def forward_fuse(self, x):
        """Perform transposed convolution of 2D data."""
        return self.act(self.conv(x))

class HGBlock_FCAttention(nn.Module):
    """
    HG_Block of PPHGNetV2 with 2 convolutions and LightConv.

    https://github.com/PaddlePaddle/PaddleDetection/blob/develop/ppdet/modeling/backbones/hgnet_v2.py
    """

    def __init__(self, c1, cm, c2, k=3, n=6, lightconv=False, shortcut=False, act=nn.ReLU()):
        """Initializes a CSP Bottleneck with 1 convolution using specified input and output channels."""
        super().__init__()
        block = LightConv if lightconv else Conv
        self.m = nn.ModuleList(block(c1 if i == 0 else cm, cm, k=k, act=act) for i in range(n))
        self.sc = Conv(c1 + n * cm, c2 // 2, 1, 1, act=act)  # squeeze conv
        self.ec = Conv(c2 // 2, c2, 1, 1, act=act)  # excitation conv
        self.add = shortcut and c1 == c2
        self.cv = FCAttention(c2)
        
    def forward(self, x):
        """Forward pass of a PPHGNetV2 backbone layer."""
        y = [x]
        y.extend(m(y[-1]) for m in self.m)
        y = self.cv(self.ec(self.sc(torch.cat(y, 1))))
        return y + x if self.add else y

四、创新模块

4.1 改进点1

模块改进方法 1️⃣:直接加入 FCAttention模块
FCAttention模块 添加后如下:

在这里插入图片描述

注意❗:需要声明的模块名称为: FCAttention

4.2 改进点2⭐

模块改进方法 2️⃣:基于 FCAttention模块 HGBlock

第二种改进方法是对 RT-DETR 中的 HGBlock模块 进行改进,将 FCAttention注意力模块 加入后,能够充分利用全局和局部通道信息, 使得网络能够更准确地强调有用特征,抑制不太有用的特征。并且其中的自适应融合策略,避免了局部和全局信息之间冗余的交叉相关操作,进一步促进了它们的交互,能够更精确地进行特征分配权重。

改进代码如下:

class HGBlock_FCAttention(nn.Module):
    """
    HG_Block of PPHGNetV2 with 2 convolutions and LightConv.

    https://github.com/PaddlePaddle/PaddleDetection/blob/develop/ppdet/modeling/backbones/hgnet_v2.py
    """

    def __init__(self, c1, cm, c2, k=3, n=6, lightconv=False, shortcut=False, act=nn.ReLU()):
        """Initializes a CSP Bottleneck with 1 convolution using specified input and output channels."""
        super().__init__()
        block = LightConv if lightconv else Conv
        self.m = nn.ModuleList(block(c1 if i == 0 else cm, cm, k=k, act=act) for i in range(n))
        self.sc = Conv(c1 + n * cm, c2 // 2, 1, 1, act=act)  # squeeze conv
        self.ec = Conv(c2 // 2, c2, 1, 1, act=act)  # excitation conv
        self.add = shortcut and c1 == c2
        self.cv = FCAttention(c2)
        
    def forward(self, x):
        """Forward pass of a PPHGNetV2 backbone layer."""
        y = [x]
        y.extend(m(y[-1]) for m in self.m)
        y = self.cv(self.ec(self.sc(torch.cat(y, 1))))
        return y + x if self.add else y

在这里插入图片描述

注意❗:需要声明的模块名称为: HGBlock_FCAttention


五、添加步骤

5.1 修改一

① 在 ultralytics/nn/ 目录下新建 AddModules 文件夹用于存放模块代码

② 在 AddModules 文件夹下新建 FCAttention.py ,将 第三节 中的代码粘贴到此处

在这里插入图片描述

5.2 修改二

AddModules 文件夹下新建 __init__.py (已有则不用新建),在文件内导入模块: from .FCAttention import *

在这里插入图片描述

5.3 修改三

ultralytics/nn/modules/tasks.py 文件中,需要在两处位置添加各模块类名称。

首先:导入模块

在这里插入图片描述

其次:在 parse_model函数 中注册 FCAttention HGBlock_FCAttention 模块

在这里插入图片描述
在这里插入图片描述
在这里插入图片描述


六、yaml模型文件

6.1 模型改进版本一

在代码配置完成后,配置模型的YAML文件。

此处以 ultralytics/cfg/models/rt-detr/rtdetr-l.yaml 为例,在同目录下创建一个用于自己数据集训练的模型文件 rtdetr-l-FCAttention.yaml

rtdetr-l.yaml 中的内容复制到 rtdetr-l-FCAttention.yaml 文件下,修改 nc 数量等于自己数据中目标的数量。
在骨干网络的深层添加 FCAttention模块 只需要填入一个参数,通道数

# Ultralytics YOLO 🚀, AGPL-3.0 license
# RT-DETR-l object detection model with P3-P5 outputs. For details see https://docs.ultralytics.com/models/rtdetr

# Parameters
nc: 1 # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n-cls.yaml' will call yolov8-cls.yaml with scale 'n'
  # [depth, width, max_channels]
  l: [1.00, 1.00, 1024]

backbone:
  # [from, repeats, module, args]
  - [-1, 1, HGStem, [32, 48]] # 0-P2/4
  - [-1, 6, HGBlock, [48, 128, 3]] # stage 1

  - [-1, 1, DWConv, [128, 3, 2, 1, False]] # 2-P3/8
  - [-1, 6, HGBlock, [96, 512, 3]] # stage 2

  - [-1, 1, DWConv, [512, 3, 2, 1, False]] # 4-P4/16
  - [-1, 6, HGBlock, [192, 1024, 5, True, False]] # cm, c2, k, light, shortcut
  - [-1, 6, HGBlock, [192, 1024, 5, True, True]]
  - [-1, 6, HGBlock, [192, 1024, 5, True, True]] # stage 3

  - [-1, 1, DWConv, [1024, 3, 2, 1, False]] # 8-P5/32
  - [-1, 1, FCAttention, [1024]] # stage 4
  - [-1, 6, HGBlock, [384, 2048, 5, True, False]] # stage 4

head:
  - [-1, 1, Conv, [256, 1, 1, None, 1, 1, False]] # 10 input_proj.2
  - [-1, 1, AIFI, [1024, 8]]
  - [-1, 1, Conv, [256, 1, 1]] # 12, Y5, lateral_convs.0

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [7, 1, Conv, [256, 1, 1, None, 1, 1, False]] # 14 input_proj.1
  - [[-2, -1], 1, Concat, [1]]
  - [-1, 3, RepC3, [256]] # 16, fpn_blocks.0
  - [-1, 1, Conv, [256, 1, 1]] # 17, Y4, lateral_convs.1

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [3, 1, Conv, [256, 1, 1, None, 1, 1, False]] # 19 input_proj.0
  - [[-2, -1], 1, Concat, [1]] # cat backbone P4
  - [-1, 3, RepC3, [256]] # X3 (21), fpn_blocks.1

  - [-1, 1, Conv, [256, 3, 2]] # 22, downsample_convs.0
  - [[-1, 18], 1, Concat, [1]] # cat Y4
  - [-1, 3, RepC3, [256]] # F4 (24), pan_blocks.0

  - [-1, 1, Conv, [256, 3, 2]] # 25, downsample_convs.1
  - [[-1, 13], 1, Concat, [1]] # cat Y5
  - [-1, 3, RepC3, [256]] # F5 (27), pan_blocks.1

  - [[22, 25, 28], 1, RTDETRDecoder, [nc]] # Detect(P3, P4, P5)

6.2 模型改进版本二⭐

此处同样以 ultralytics/cfg/models/rt-detr/rtdetr-l.yaml 为例,在同目录下创建一个用于自己数据集训练的模型文件 rtdetr-l-HGBlock_FCAttention.yaml

rtdetr-l.yaml 中的内容复制到 rtdetr-l-HGBlock_FCAttention.yaml 文件下,修改 nc 数量等于自己数据中目标的数量。

📌 模型的修改方法是将 骨干网络 中的部分 HGBlock模块 替换成 HGBlock_FCAttention模块

# Ultralytics YOLO 🚀, AGPL-3.0 license
# RT-DETR-l object detection model with P3-P5 outputs. For details see https://docs.ultralytics.com/models/rtdetr

# Parameters
nc: 1 # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n-cls.yaml' will call yolov8-cls.yaml with scale 'n'
  # [depth, width, max_channels]
  l: [1.00, 1.00, 1024]

backbone:
  # [from, repeats, module, args]
  - [-1, 1, HGStem, [32, 48]] # 0-P2/4
  - [-1, 6, HGBlock, [48, 128, 3]] # stage 1

  - [-1, 1, DWConv, [128, 3, 2, 1, False]] # 2-P3/8
  - [-1, 6, HGBlock, [96, 512, 3]] # stage 2

  - [-1, 1, DWConv, [512, 3, 2, 1, False]] # 4-P4/16
  - [-1, 6, HGBlock_FCAttention, [192, 1024, 5, True, False]] # cm, c2, k, light, shortcut
  - [-1, 6, HGBlock_FCAttention, [192, 1024, 5, True, True]]
  - [-1, 6, HGBlock_FCAttention, [192, 1024, 5, True, True]] # stage 3

  - [-1, 1, DWConv, [1024, 3, 2, 1, False]] # 8-P5/32
  - [-1, 6, HGBlock, [384, 2048, 5, True, False]] # stage 4

head:
  - [-1, 1, Conv, [256, 1, 1, None, 1, 1, False]] # 10 input_proj.2
  - [-1, 1, AIFI, [1024, 8]]
  - [-1, 1, Conv, [256, 1, 1]] # 12, Y5, lateral_convs.0

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [7, 1, Conv, [256, 1, 1, None, 1, 1, False]] # 14 input_proj.1
  - [[-2, -1], 1, Concat, [1]]
  - [-1, 3, RepC3, [256]] # 16, fpn_blocks.0
  - [-1, 1, Conv, [256, 1, 1]] # 17, Y4, lateral_convs.1

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [3, 1, Conv, [256, 1, 1, None, 1, 1, False]] # 19 input_proj.0
  - [[-2, -1], 1, Concat, [1]] # cat backbone P4
  - [-1, 3, RepC3, [256]] # X3 (21), fpn_blocks.1

  - [-1, 1, Conv, [256, 3, 2]] # 22, downsample_convs.0
  - [[-1, 17], 1, Concat, [1]] # cat Y4
  - [-1, 3, RepC3, [256]] # F4 (24), pan_blocks.0

  - [-1, 1, Conv, [256, 3, 2]] # 25, downsample_convs.1
  - [[-1, 12], 1, Concat, [1]] # cat Y5
  - [-1, 3, RepC3, [256]] # F5 (27), pan_blocks.1

  - [[21, 24, 27], 1, RTDETRDecoder, [nc]] # Detect(P3, P4, P5)


七、成功运行结果

分别打印网络模型可以看到 FCAttention 模块 HGBlock_FCAttention 已经加入到模型中,并可以进行训练了。

rtdetr-l-FCAttention

rtdetr-l-FCAttention summary: 688 layers, 33,858,249 parameters, 33,858,249 gradients, 108.0 GFLOPs

                   from  n    params  module                                       arguments                     
  0                  -1  1     25248  ultralytics.nn.modules.block.HGStem          [3, 32, 48]                   
  1                  -1  6    155072  ultralytics.nn.modules.block.HGBlock         [48, 48, 128, 3, 6]           
  2                  -1  1      1408  ultralytics.nn.modules.conv.DWConv           [128, 128, 3, 2, 1, False]    
  3                  -1  6    839296  ultralytics.nn.modules.block.HGBlock         [128, 96, 512, 3, 6]          
  4                  -1  1      5632  ultralytics.nn.modules.conv.DWConv           [512, 512, 3, 2, 1, False]    
  5                  -1  6   1695360  ultralytics.nn.modules.block.HGBlock         [512, 192, 1024, 5, 6, True, False]
  6                  -1  6   2055808  ultralytics.nn.modules.block.HGBlock         [1024, 192, 1024, 5, 6, True, True]
  7                  -1  6   2055808  ultralytics.nn.modules.block.HGBlock         [1024, 192, 1024, 5, 6, True, True]
  8                  -1  1     11264  ultralytics.nn.modules.conv.DWConv           [1024, 1024, 3, 2, 1, False]  
  9                  -1  1   1050118  ultralytics.nn.AddModules.FCAttention.FCAttention[1024, 1024]                  
 10                  -1  6   6708480  ultralytics.nn.modules.block.HGBlock         [1024, 384, 2048, 5, 6, True, False]
 11                  -1  1    524800  ultralytics.nn.modules.conv.Conv             [2048, 256, 1, 1, None, 1, 1, False]
 12                  -1  1    789760  ultralytics.nn.modules.transformer.AIFI      [256, 1024, 8]                
 13                  -1  1     66048  ultralytics.nn.modules.conv.Conv             [256, 256, 1, 1]              
 14                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']          
 15                   7  1    262656  ultralytics.nn.modules.conv.Conv             [1024, 256, 1, 1, None, 1, 1, False]
 16            [-2, -1]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 17                  -1  3   2232320  ultralytics.nn.modules.block.RepC3           [512, 256, 3]                 
 18                  -1  1     66048  ultralytics.nn.modules.conv.Conv             [256, 256, 1, 1]              
 19                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']          
 20                   3  1    131584  ultralytics.nn.modules.conv.Conv             [512, 256, 1, 1, None, 1, 1, False]
 21            [-2, -1]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 22                  -1  3   2232320  ultralytics.nn.modules.block.RepC3           [512, 256, 3]                 
 23                  -1  1    590336  ultralytics.nn.modules.conv.Conv             [256, 256, 3, 2]              
 24            [-1, 18]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 25                  -1  3   2232320  ultralytics.nn.modules.block.RepC3           [512, 256, 3]                 
 26                  -1  1    590336  ultralytics.nn.modules.conv.Conv             [256, 256, 3, 2]              
 27            [-1, 13]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 28                  -1  3   2232320  ultralytics.nn.modules.block.RepC3           [512, 256, 3]                 
 29        [22, 25, 28]  1   7303907  ultralytics.nn.modules.head.RTDETRDecoder    [1, [256, 256, 256]]          
rtdetr-l-FCAttention summary: 688 layers, 33,858,249 parameters, 33,858,249 gradients, 108.0 GFLOPs

rtdetr-l-HGBlock_FCAttention

rtdetr-l-HGBlock_FCAttention summary: 703 layers, 35,956,949 parameters, 35,956,949 gradients, 108.0 GFLOPs

                   from  n    params  module                                       arguments                     
  0                  -1  1     25248  ultralytics.nn.modules.block.HGStem          [3, 32, 48]                   
  1                  -1  6    155072  ultralytics.nn.modules.block.HGBlock         [48, 48, 128, 3, 6]           
  2                  -1  1      1408  ultralytics.nn.modules.conv.DWConv           [128, 128, 3, 2, 1, False]    
  3                  -1  6    839296  ultralytics.nn.modules.block.HGBlock         [128, 96, 512, 3, 6]          
  4                  -1  1      5632  ultralytics.nn.modules.conv.DWConv           [512, 512, 3, 2, 1, False]    
  5                  -1  6   2744966  ultralytics.nn.AddModules.FCAttention.HGBlock_FCAttention[512, 192, 1024, 5, 6, True, False]
  6                  -1  6   3105414  ultralytics.nn.AddModules.FCAttention.HGBlock_FCAttention[1024, 192, 1024, 5, 6, True, True]
  7                  -1  6   3105414  ultralytics.nn.AddModules.FCAttention.HGBlock_FCAttention[1024, 192, 1024, 5, 6, True, True]
  8                  -1  1     11264  ultralytics.nn.modules.conv.DWConv           [1024, 1024, 3, 2, 1, False]  
  9                  -1  6   6708480  ultralytics.nn.modules.block.HGBlock         [1024, 384, 2048, 5, 6, True, False]
 10                  -1  1    524800  ultralytics.nn.modules.conv.Conv             [2048, 256, 1, 1, None, 1, 1, False]
 11                  -1  1    789760  ultralytics.nn.modules.transformer.AIFI      [256, 1024, 8]                
 12                  -1  1     66048  ultralytics.nn.modules.conv.Conv             [256, 256, 1, 1]              
 13                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']          
 14                   7  1    262656  ultralytics.nn.modules.conv.Conv             [1024, 256, 1, 1, None, 1, 1, False]
 15            [-2, -1]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 16                  -1  3   2232320  ultralytics.nn.modules.block.RepC3           [512, 256, 3]                 
 17                  -1  1     66048  ultralytics.nn.modules.conv.Conv             [256, 256, 1, 1]              
 18                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']          
 19                   3  1    131584  ultralytics.nn.modules.conv.Conv             [512, 256, 1, 1, None, 1, 1, False]
 20            [-2, -1]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 21                  -1  3   2232320  ultralytics.nn.modules.block.RepC3           [512, 256, 3]                 
 22                  -1  1    590336  ultralytics.nn.modules.conv.Conv             [256, 256, 3, 2]              
 23            [-1, 17]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 24                  -1  3   2232320  ultralytics.nn.modules.block.RepC3           [512, 256, 3]                 
 25                  -1  1    590336  ultralytics.nn.modules.conv.Conv             [256, 256, 3, 2]              
 26            [-1, 12]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 27                  -1  3   2232320  ultralytics.nn.modules.block.RepC3           [512, 256, 3]                 
 28        [21, 24, 27]  1   7303907  ultralytics.nn.modules.head.RTDETRDecoder    [1, [256, 256, 256]]          
rtdetr-l-HGBlock_FCAttention summary: 703 layers, 35,956,949 parameters, 35,956,949 gradients, 108.0 GFLOPs