
RT-DETR Improvement Strategies [Convolution Layers] | ICCV 2023 LSK (Large Selective Kernel) Module, with an Original ResNetLayer Variant

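1. Introduction

This article documents how to use the Large Selective Kernel (LSK) module to improve the RT-DETR object detection network. Detecting small objects in large images has long been difficult: appearance alone is often not enough to recognize them, so broad contextual information is needed as support. Different objects, however, need different ranges of context. To model this, the article uses the LSK module to build a modified ResNetLayer, so that the model produces multiple features with different large receptive fields and adjusts its behavior dynamically to the input, letting the network adapt to the detection needs of different objects in an image.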



2. The Large Selective Kernel (LSK) Module

Large Selective Kernel Network for Remote Sensing Object Detection

The LSK module is the core module of the Large Selective Kernel Network (LSKNet). Its motivation, principle, structure, and advantages are explained below.

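2.1 Motivation

  • Exploiting the characteristics of remote sensing images: Remote sensing images are captured at high resolution from a bird's-eye view. Objects in them can be small and hard to recognize from appearance alone, so accurate detection needs broad contextual information, and different objects need different ranges of context. The LSK module was proposed to model these characteristics.
  • Combining large kernels with a selection mechanism: Large-kernel convolutions have been shown in several studies to enlarge the receptive field effectively, while a selection mechanism lets the model adjust its behavior dynamically to the input. Combining the two lets the network adapt to the detection needs of different objects in remote sensing images.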

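2.2 Principle

2.2.1 Large-kernel decomposition

  • Based on an analysis of remote sensing images, and in order to select and model multiple long-range contexts adaptively, the large-kernel convolution is explicitly decomposed into a sequence of depth-wise convolutions with progressively larger kernels and dilation rates.
  • For the $i$-th depth-wise convolution, the kernel size $k_i$, dilation rate $d_i$, and receptive field $RF_i$ satisfy a set of defining relations that make the receptive field grow quickly, while an upper bound on the dilation rate prevents gaps between feature maps.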
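The exact relations are not spelled out here, so the following is only a sketch. It assumes the standard rule for stacking stride-1 convolutions, where each layer adds d * (k - 1) to the receptive field, and applies it to the (kernel, dilation) pairs used by the code in Section 3: (5, 1) followed by (7, 3). The helper names are made up for illustration.

```python
def receptive_fields(kernels):
    """Return the cumulative receptive field after each depth-wise conv.

    kernels: sequence of (kernel_size, dilation) pairs applied in order,
             all with stride 1 (an assumption of this sketch).
    """
    fields = []
    rf = 1  # a single pixel before any convolution
    for k, d in kernels:
        rf += d * (k - 1)  # each stride-1 conv widens the field by d * (k - 1)
        fields.append(rf)
    return fields


def depthwise_weights_per_channel(kernels):
    """Number of kernel weights per channel for a stack of depth-wise convs."""
    return sum(k * k for k, _ in kernels)


lsk_pair = [(5, 1), (7, 3)]  # kernel/dilation pairs taken from the Section 3 code
print(receptive_fields(lsk_pair))                 # [5, 23]
print(depthwise_weights_per_channel(lsk_pair))    # 74
print(depthwise_weights_per_channel([(23, 1)]))   # 529
```

Under these assumptions the two depth-wise convolutions reach a 23x23 receptive field with 74 weights per channel, against 529 for a single dense 23x23 depth-wise kernel.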

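2.2.2 Spatial kernel selection

  • The features produced by kernels with different receptive fields are concatenated, then channel-wise average pooling and max pooling are applied to extract spatial relationships, giving an average-pooled and a max-pooled spatial feature descriptor.
  • These two descriptors are concatenated and passed through a convolution layer that turns them into $N$ spatial attention maps.
  • A sigmoid is applied to each spatial attention map to obtain a spatial selection mask for each decomposed large kernel. Each mask weights its corresponding feature map, and the weighted maps are fused into the attention feature.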
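The three steps above can be sketched at a single spatial location. This is a toy version in plain Python rather than the module's actual implementation: select_at_pixel and its arguments are hypothetical, and a small linear map over the two pooled descriptors stands in for the convolution that produces the N attention maps.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def select_at_pixel(feats, mix_weights, mix_bias):
    """Toy spatial kernel selection at one pixel.

    feats:       N per-kernel feature vectors (lists of floats) at this pixel.
    mix_weights: N rows of [w_avg, w_max]; a stand-in (assumption) for the
                 convolution that maps the pooled descriptors to N logits.
    mix_bias:    N biases, one per kernel.
    """
    # Step 1: concatenate all kernels' features along the channel axis,
    # then pool over channels to get two scalar descriptors.
    concat = [v for f in feats for v in f]
    avg_desc = sum(concat) / len(concat)
    max_desc = max(concat)
    # Step 2: map the two descriptors to one attention logit per kernel.
    logits = [w_avg * avg_desc + w_max * max_desc + b
              for (w_avg, w_max), b in zip(mix_weights, mix_bias)]
    # Step 3: sigmoid gives one selection mask per kernel; weight and fuse.
    masks = [sigmoid(z) for z in logits]
    channels = len(feats[0])
    fused = [sum(m * f[c] for m, f in zip(masks, feats)) for c in range(channels)]
    return masks, fused


# Two kernels, two channels each; zero weights mimic an untrained mixer.
masks, fused = select_at_pixel(
    feats=[[1.0, 3.0], [2.0, 2.0]],
    mix_weights=[[0.0, 0.0], [0.0, 0.0]],
    mix_bias=[0.0, 0.0],
)
print(masks)  # [0.5, 0.5]
print(fused)  # [1.5, 2.5]
```

With all-zero mixing weights both masks are 0.5, so the fused feature is simply the average of the two branches; learned weights would shift that balance per location.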


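2.3 Structure

  • Embedded in the LK Selection sub-block: The LSK module sits inside the Large Kernel Selection (LK Selection) sub-block of LSKNet.
  • Convolutions plus a selection mechanism: It consists of a sequence of large-kernel convolutions and a spatial kernel selection mechanism.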

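2.4 Advantages

  • Multiple receptive fields: Decomposing the large-kernel convolution explicitly produces multiple features with different large receptive fields. This helps the subsequent kernel selection and lets the model meet the differing context-range needs of different objects.
  • Better efficiency: Sequential decomposition is more efficient than applying a single larger kernel directly. For the same theoretical receptive field, the decomposed design needs far fewer parameters.
  • Focus on relevant spatial context: The spatial selection mechanism strengthens the network's ability to focus on the spatial context regions most relevant to the target. This helps detection performance, and the experiments show it suits remote sensing object detection better than channel attention.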

Paper: https://openaccess.thecvf.com/content/ICCV2023/papers/Li_Large_Selective_Kernel_Network_for_Remote_Sensing_Object_Detection_ICCV_2023_paper.pdf
Source code: https://github.com/zcablii/Large-Selective-Kernel-Network

3. LSK Implementation Code

The LSK module is implemented as follows:

import torch
import torch.nn as nn
import torch.nn.functional as F

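class LSKblock(nn.Module):
    """Large Selective Kernel block: two stacked depth-wise convs plus spatial kernel selection."""

    def __init__(self, dim):
        super().__init__()
        # 5x5 depth-wise conv (small receptive field)
        self.conv0 = nn.Conv2d(dim, dim, 5, padding=2, groups=dim)
        # 7x7 depth-wise conv with dilation 3; padding=9 keeps the spatial size
        self.conv_spatial = nn.Conv2d(dim, dim, 7, stride=1,
                                      padding=9, groups=dim, dilation=3)
        # 1x1 convs that halve the channels of each branch
        self.conv1 = nn.Conv2d(dim, dim // 2, 1)
        self.conv2 = nn.Conv2d(dim, dim // 2, 1)
        # maps the [avg, max] descriptors to two spatial attention maps
        self.conv_squeeze = nn.Conv2d(2, 2, 7, padding=3)
        # 1x1 conv that restores the channel count to dim
        self.conv = nn.Conv2d(dim // 2, dim, 1)

    def forward(self, x):
        attn1 = self.conv0(x)             # small-receptive-field features
        attn2 = self.conv_spatial(attn1)  # large-receptive-field features, stacked on conv0

        attn1 = self.conv1(attn1)
        attn2 = self.conv2(attn2)

        # channel-wise average and max pooling over the concatenated branches
        attn = torch.cat([attn1, attn2], dim=1)
        avg_attn = torch.mean(attn, dim=1, keepdim=True)
        max_attn, _ = torch.max(attn, dim=1, keepdim=True)
        agg = torch.cat([avg_attn, max_attn], dim=1)
        # one sigmoid mask per branch
        sig = self.conv_squeeze(agg).sigmoid()
        # weight each branch by its mask and fuse
        attn = attn1 * sig[:, 0, :, :].unsqueeze(1) + \
               attn2 * sig[:, 1, :, :].unsqueeze(1)
        attn = self.conv(attn)
        # the fused attention modulates the input
        return x * attn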

def autopad(k, p=None, d=1):
    """
    Pads kernel to 'same' output shape, adjusting for optional dilation; returns padding size.
    `k`: kernel, `p`: padding, `d`: dilation.
    """
    if d > 1:
        k = d * (k - 1) + 1 if isinstance(k, int) else [d * (x - 1) + 1 for x in k]  # actual kernel-size
    if p is None:
        p = k // 2 if isinstance(k, int) else [x // 2 for x in k]  # auto-pad
    return p

class Conv(nn.Module):
    # Standard convolution with args(ch_in, ch_out, kernel, stride, padding, groups, dilation, activation)
    default_act = nn.SiLU()  # default activation
 
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, d=1, act=True):
        """Initializes a standard convolution layer with optional batch normalization and activation."""
        super().__init__()
        self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p, d), groups=g, dilation=d, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        self.act = self.default_act if act is True else act if isinstance(act, nn.Module) else nn.Identity()
 
    def forward(self, x):
        """Applies a convolution followed by batch normalization and an activation function to the input tensor `x`."""
        return self.act(self.bn(self.conv(x)))
 
    def forward_fuse(self, x):
        """Applies a fused convolution and activation function to the input tensor `x`."""
        return self.act(self.conv(x))

class ResNetBlock(nn.Module):
    """ResNet block with standard convolution layers."""

    def __init__(self, c1, c2, s=1, e=4):
        """Initialize convolution with given parameters."""
        super().__init__()
        c3 = e * c2
        self.cv1 = Conv(c1, c2, k=1, s=1, act=True)
        self.cv2 = Conv(c2, c2, k=3, s=s, p=1, act=True)
        self.cv3 = Conv(c2, c3, k=1, act=False)
        self.cv4 = LSKblock(c2)
        self.shortcut = nn.Sequential(Conv(c1, c3, k=1, s=s, act=False)) if s != 1 or c1 != c3 else nn.Identity()

    def forward(self, x):
        """Forward pass through the ResNet block."""
        return F.relu(self.cv3(self.cv4(self.cv2(self.cv1(x)))) + self.shortcut(x))

class ResNetLayer_LSKblock(nn.Module):
    """ResNet layer with multiple ResNet blocks."""

    def __init__(self, c1, c2, s=1, is_first=False, n=1, e=4):
        """Initializes the ResNetLayer given arguments."""
        super().__init__()
        self.is_first = is_first

        if self.is_first:
            self.layer = nn.Sequential(
                Conv(c1, c2, k=7, s=2, p=3, act=True), nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
            )
        else:
            blocks = [ResNetBlock(c1, c2, s, e=e)]
            blocks.extend([ResNetBlock(e * c2, c2, 1, e=e) for _ in range(n - 1)])
            self.layer = nn.Sequential(*blocks)

    def forward(self, x):
        """Forward pass through the ResNet layer."""
        return self.layer(x)
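As a rough cross-check of the code above, the sketch below counts LSKblock's parameters by hand and confirms that padding=9 is the size-preserving padding for a 7x7 kernel with dilation 3. The helper functions are illustrative and not part of Ultralytics or LSKNet; they assume square kernels and bias terms enabled, which is the nn.Conv2d default used in LSKblock.

```python
def conv2d_params(c_in, c_out, k, groups=1, bias=True):
    """Parameter count of an nn.Conv2d-style layer with a square k x k kernel."""
    return (c_in // groups) * c_out * k * k + (c_out if bias else 0)


def lskblock_params(dim):
    """Parameter count of one LSKblock(dim), layer by layer."""
    half = dim // 2
    return (
        conv2d_params(dim, dim, 5, groups=dim)    # conv0 (depth-wise 5x5)
        + conv2d_params(dim, dim, 7, groups=dim)  # conv_spatial (depth-wise 7x7)
        + conv2d_params(dim, half, 1)             # conv1
        + conv2d_params(dim, half, 1)             # conv2
        + conv2d_params(2, 2, 7)                  # conv_squeeze
        + conv2d_params(half, dim, 1)             # conv
    )


def same_padding(k, d=1):
    """Padding that preserves spatial size for an odd kernel at stride 1."""
    return d * (k - 1) // 2


print(lskblock_params(512))      # 433350
print(6 * lskblock_params(512))  # 2600100
print(same_padding(5))           # 2
print(same_padding(7, d=3))      # 9
```

Six stacked blocks at dim=512 give 2,600,100 parameters, which agrees with the figure reported for each LSKblock row of the model summary in Section 7.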


4. Improved Modules

4.1 Improvement 1⭐

Approach: add LSKblock directly (Section 5 explains the steps).

[Figure: the network after LSKblock is added]

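4.2 Improvement 2⭐

Approach: modify ResNetLayer using the LSKblock module (Section 5 explains the steps).

The second approach modifies ResNetLayer. Once LSKblock is inserted into ResNetLayer, the model produces multiple features with different large receptive fields and adjusts its behavior dynamically to the input, so the network adapts better to the detection needs of different objects in an image.

The modified code is as follows.

First, add LSKblock to ResNetBlock as cv4, applied between cv2 and cv3: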

class ResNetBlock(nn.Module):
    """ResNet block with standard convolution layers."""

    def __init__(self, c1, c2, s=1, e=4):
        """Initialize convolution with given parameters."""
        super().__init__()
        c3 = e * c2
        self.cv1 = Conv(c1, c2, k=1, s=1, act=True)
        self.cv2 = Conv(c2, c2, k=3, s=s, p=1, act=True)
        self.cv3 = Conv(c2, c3, k=1, act=False)
        self.cv4 = LSKblock(c2)
        self.shortcut = nn.Sequential(Conv(c1, c3, k=1, s=s, act=False)) if s != 1 or c1 != c3 else nn.Identity()

    def forward(self, x):
        """Forward pass through the ResNet block."""
        return F.relu(self.cv3(self.cv4(self.cv2(self.cv1(x)))) + self.shortcut(x))


Then rename ResNetLayer to ResNetLayer_LSKblock:

class ResNetLayer_LSKblock(nn.Module):
    """ResNet layer with multiple ResNet blocks."""

    def __init__(self, c1, c2, s=1, is_first=False, n=1, e=4):
        """Initializes the ResNetLayer given arguments."""
        super().__init__()
        self.is_first = is_first

        if self.is_first:
            self.layer = nn.Sequential(
                Conv(c1, c2, k=7, s=2, p=3, act=True), nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
            )
        else:
            blocks = [ResNetBlock(c1, c2, s, e=e)]
            blocks.extend([ResNetBlock(e * c2, c2, 1, e=e) for _ in range(n - 1)])
            self.layer = nn.Sequential(*blocks)

    def forward(self, x):
        """Forward pass through the ResNet layer."""
        return self.layer(x)

Note❗: The module names to declare in Section 5 are LSKblock and ResNetLayer_LSKblock.


5. Adding the Modules

5.1 Change 1

① Create an AddModules folder under ultralytics/nn/ to hold the module code.

② Create LSKblock.py inside the AddModules folder and paste the code from Section 3 into it.

5.2 Change 2

Create __init__.py inside the AddModules folder (skip this if it already exists) and import the module in it: from .LSKblock import *

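5.3 Change 3

In ultralytics/nn/tasks.py, the module class names need to be added at specific places.

First, import the modules, for example with from ultralytics.nn.AddModules import *.

Next, register LSKblock and ResNetLayer_LSKblock in the parse_model function by adding the following code: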

elif m in {LSKblock}:
    args = [ch[f], *args]


elif m in {ResNetLayer, ResNetLayer_LSKblock}:
    c2 = args[1] if args[3] else args[1] * 4
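
To make the effect of these two branches concrete, here is a standalone sketch of the channel bookkeeping they perform. resolve_channels is a hypothetical helper written for this illustration, not an Ultralytics function, and it takes module names as strings so that it runs without the library installed.

```python
def resolve_channels(module_name, args, c_in):
    """Return (constructor_args, out_channels) for one YAML layer entry.

    Hypothetical stand-in for the two parse_model branches; c_in plays the
    role of ch[f], the channel count of the layer feeding this one.
    """
    if module_name == "LSKblock":
        # LSKblock(dim) needs only the incoming width, and its final 1x1 conv
        # maps back to dim, so the output width equals the input width.
        return [c_in, *args], c_in
    if module_name in {"ResNetLayer", "ResNetLayer_LSKblock"}:
        # args = [c1, c2, s, is_first, n]; the stem outputs c2 channels,
        # the bottleneck stages expand to 4 * c2.
        c2 = args[1] if args[3] else args[1] * 4
        return list(args), c2
    raise ValueError(f"unsupported module: {module_name}")


print(resolve_channels("LSKblock", [], 512))
# ([512], 512)
print(resolve_channels("ResNetLayer_LSKblock", [3, 64, 1, True, 1], 3))
# ([3, 64, 1, True, 1], 64)
print(resolve_channels("ResNetLayer_LSKblock", [64, 64, 1, False, 3], 64))
# ([64, 64, 1, False, 3], 256)
```

The first result matches the [512] argument list printed for LSKblock in the model summary in Section 7, and the last two show why the second ResNet stage must be declared with 256 input channels after a stage whose c2 is 64.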



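6. YAML Model Files

6.1 Improved Model, Version 1⭐

Taking ultralytics/cfg/models/rt-detr/rtdetr-l.yaml as the starting point, create a model file named rtdetr-l-LSKblock.yaml in the same directory for training on your own dataset.

Copy the contents of rtdetr-l.yaml into rtdetr-l-LSKblock.yaml and set nc to the number of classes in your dataset.

📌 The change replaces the stage 3 HGBlock modules in the backbone with LSKblock modules.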

# Ultralytics YOLO 🚀, AGPL-3.0 license
# RT-DETR-l object detection model with P3-P5 outputs. For details see https://docs.ultralytics.com/models/rtdetr

# Parameters
nc: 1 # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n-cls.yaml' will call yolov8-cls.yaml with scale 'n'
  # [depth, width, max_channels]
  l: [1.00, 1.00, 1024]

backbone:
  # [from, repeats, module, args]
  - [-1, 1, HGStem, [32, 48]] # 0-P2/4
  - [-1, 6, HGBlock, [48, 128, 3]] # stage 1

  - [-1, 1, DWConv, [128, 3, 2, 1, False]] # 2-P3/8
  - [-1, 6, HGBlock, [96, 512, 3]] # stage 2

  - [-1, 1, DWConv, [512, 3, 2, 1, False]] # 4-P4/16
  - [-1, 6, LSKblock, []] # no YAML args; parse_model prepends the input channels
  - [-1, 6, LSKblock, []]
  - [-1, 6, LSKblock, []] # stage 3

  - [-1, 1, DWConv, [1024, 3, 2, 1, False]] # 8-P5/32
  - [-1, 6, HGBlock, [384, 2048, 5, True, False]] # stage 4

head:
  - [-1, 1, Conv, [256, 1, 1, None, 1, 1, False]] # 10 input_proj.2
  - [-1, 1, AIFI, [1024, 8]]
  - [-1, 1, Conv, [256, 1, 1]] # 12, Y5, lateral_convs.0

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [7, 1, Conv, [256, 1, 1, None, 1, 1, False]] # 14 input_proj.1
  - [[-2, -1], 1, Concat, [1]]
  - [-1, 3, RepC3, [256]] # 16, fpn_blocks.0
  - [-1, 1, Conv, [256, 1, 1]] # 17, Y4, lateral_convs.1

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [3, 1, Conv, [256, 1, 1, None, 1, 1, False]] # 19 input_proj.0
  - [[-2, -1], 1, Concat, [1]] # cat backbone P4
  - [-1, 3, RepC3, [256]] # X3 (21), fpn_blocks.1

  - [-1, 1, Conv, [256, 3, 2]] # 22, downsample_convs.0
  - [[-1, 17], 1, Concat, [1]] # cat Y4
  - [-1, 3, RepC3, [256]] # F4 (24), pan_blocks.0

  - [-1, 1, Conv, [256, 3, 2]] # 25, downsample_convs.1
  - [[-1, 12], 1, Concat, [1]] # cat Y5
  - [-1, 3, RepC3, [256]] # F5 (27), pan_blocks.1

  - [[21, 24, 27], 1, RTDETRDecoder, [nc]] # Detect(P3, P4, P5)

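6.2 Improved Model, Version 2⭐

Taking ultralytics/cfg/models/rt-detr/rtdetr-resnet50.yaml as the starting point, create a model file named rtdetr-ResNetLayer_LSKblock.yaml in the same directory for training on your own dataset.

Copy the contents of rtdetr-resnet50.yaml into rtdetr-ResNetLayer_LSKblock.yaml and set nc to the number of classes in your dataset.

📌 The change replaces every ResNetLayer module in the backbone with a ResNetLayer_LSKblock module.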

# Ultralytics YOLO 🚀, AGPL-3.0 license
# RT-DETR-ResNet50 object detection model with P3-P5 outputs.

# Parameters
nc: 1 # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n-cls.yaml' will call yolov8-cls.yaml with scale 'n'
  # [depth, width, max_channels]
  l: [1.00, 1.00, 1024]

backbone:
  # [from, repeats, module, args]
  - [-1, 1, ResNetLayer_LSKblock, [3, 64, 1, True, 1]] # 0
  - [-1, 1, ResNetLayer_LSKblock, [64, 64, 1, False, 3]] # 1
  - [-1, 1, ResNetLayer_LSKblock, [256, 128, 2, False, 4]] # 2
  - [-1, 1, ResNetLayer_LSKblock, [512, 256, 2, False, 6]] # 3
  - [-1, 1, ResNetLayer_LSKblock, [1024, 512, 2, False, 3]] # 4

head:
  - [-1, 1, Conv, [256, 1, 1, None, 1, 1, False]] # 5
  - [-1, 1, AIFI, [1024, 8]]
  - [-1, 1, Conv, [256, 1, 1]] # 7

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [3, 1, Conv, [256, 1, 1, None, 1, 1, False]] # 9
  - [[-2, -1], 1, Concat, [1]]
  - [-1, 3, RepC3, [256]] # 11
  - [-1, 1, Conv, [256, 1, 1]] # 12

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [2, 1, Conv, [256, 1, 1, None, 1, 1, False]] # 14
  - [[-2, -1], 1, Concat, [1]] # cat backbone P4
  - [-1, 3, RepC3, [256]] # X3 (16), fpn_blocks.1

  - [-1, 1, Conv, [256, 3, 2]] # 17, downsample_convs.0
  - [[-1, 12], 1, Concat, [1]] # cat Y4
  - [-1, 3, RepC3, [256]] # F4 (19), pan_blocks.0

  - [-1, 1, Conv, [256, 3, 2]] # 20, downsample_convs.1
  - [[-1, 7], 1, Concat, [1]] # cat Y5
  - [-1, 3, RepC3, [256]] # F5 (22), pan_blocks.1

  - [[16, 19, 22], 1, RTDETRDecoder, [nc]] # Detect(P3, P4, P5)
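
A quick way to check a backbone like this before building it is to trace channels and strides through the ResNetLayer_LSKblock entries. The sketch below is an illustration under stated assumptions: trace_backbone is a made-up helper, it hard-codes the expansion factor e=4 from the class default, and it treats the stem as stride 4 (a stride-2 convolution followed by a stride-2 max pool, as in the code in Section 3).

```python
# [c1, c2, s, is_first, n] for each backbone entry, copied from the YAML above
BACKBONE = [
    [3, 64, 1, True, 1],
    [64, 64, 1, False, 3],
    [256, 128, 2, False, 4],
    [512, 256, 2, False, 6],
    [1024, 512, 2, False, 3],
]


def trace_backbone(layers, in_channels=3, expansion=4):
    """Return [(out_channels, total_stride), ...] and validate each c1."""
    channels, stride, trace = in_channels, 1, []
    for index, (c1, c2, s, is_first, _n) in enumerate(layers):
        if c1 != channels:
            raise ValueError(
                f"layer {index}: declared c1={c1}, but previous layer outputs {channels}"
            )
        # The stem keeps c2 channels; bottleneck stages expand to expansion * c2.
        channels = c2 if is_first else expansion * c2
        # The stem downsamples by 4 (conv stride 2, then max pool stride 2).
        stride *= 4 if is_first else s
        trace.append((channels, stride))
    return trace


print(trace_backbone(BACKBONE))
# [(64, 4), (256, 4), (512, 8), (1024, 16), (2048, 32)]
```

Layers 2, 3, and 4 therefore provide 512, 1024, and 2048 channels at strides 8, 16, and 32, which is what the head's 1x1 input projections (taken from layers 2, 3, and the last backbone layer) consume.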


7. Successful Run Output

Printing the network shows that LSKblock and ResNetLayer_LSKblock have been added to the model, which is now ready for training.

rtdetr-l-LSKblock

rtdetr-l-LSKblock summary: 642 layers, 34,670,383 parameters, 34,670,383 gradients, 113.8 GFLOPs

                  from  n    params  module                                       arguments                     
  0                  -1  1     25248  ultralytics.nn.modules.block.HGStem          [3, 32, 48]                   
  1                  -1  6    155072  ultralytics.nn.modules.block.HGBlock         [48, 48, 128, 3, 6]           
  2                  -1  1      1408  ultralytics.nn.modules.conv.DWConv           [128, 128, 3, 2, 1, False]    
  3                  -1  6    839296  ultralytics.nn.modules.block.HGBlock         [128, 96, 512, 3, 6]          
  4                  -1  1      5632  ultralytics.nn.modules.conv.DWConv           [512, 512, 3, 2, 1, False]    
  5                  -1  6   2600100  ultralytics.nn.AddModules.LSKblock.LSKblock  [512]                         
  6                  -1  6   2600100  ultralytics.nn.AddModules.LSKblock.LSKblock  [512]                         
  7                  -1  6   2600100  ultralytics.nn.AddModules.LSKblock.LSKblock  [512]                         
  8                  -1  1     11264  ultralytics.nn.modules.conv.DWConv           [512, 1024, 3, 2, 1, False]   
  9                  -1  6   6708480  ultralytics.nn.modules.block.HGBlock         [1024, 384, 2048, 5, 6, True, False]
 10                  -1  1    524800  ultralytics.nn.modules.conv.Conv             [2048, 256, 1, 1, None, 1, 1, False]
 11                  -1  1    789760  ultralytics.nn.modules.transformer.AIFI      [256, 1024, 8]                
 12                  -1  1     66048  ultralytics.nn.modules.conv.Conv             [256, 256, 1, 1]              
 13                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']          
 14                   7  1    131584  ultralytics.nn.modules.conv.Conv             [512, 256, 1, 1, None, 1, 1, False]
 15            [-2, -1]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 16                  -1  3   2232320  ultralytics.nn.modules.block.RepC3           [512, 256, 3]                 
 17                  -1  1     66048  ultralytics.nn.modules.conv.Conv             [256, 256, 1, 1]              
 18                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']          
 19                   3  1    131584  ultralytics.nn.modules.conv.Conv             [512, 256, 1, 1, None, 1, 1, False]
 20            [-2, -1]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 21                  -1  3   2232320  ultralytics.nn.modules.block.RepC3           [512, 256, 3]                 
 22                  -1  1    590336  ultralytics.nn.modules.conv.Conv             [256, 256, 3, 2]              
 23            [-1, 17]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 24                  -1  3   2232320  ultralytics.nn.modules.block.RepC3           [512, 256, 3]                 
 25                  -1  1    590336  ultralytics.nn.modules.conv.Conv             [256, 256, 3, 2]              
 26            [-1, 12]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 27                  -1  3   2232320  ultralytics.nn.modules.block.RepC3           [512, 256, 3]                 
 28        [21, 24, 27]  1   7303907  ultralytics.nn.modules.head.RTDETRDecoder    [1, [256, 256, 256]]          

rtdetr-ResNetLayer_LSKblock

rtdetr-ResNetLayer_LSKblock summary: 705 layers, 44,946,691 parameters, 44,946,691 gradients, 137.2 GFLOPs

                   from  n    params  module                                       arguments                     
  0                  -1  1      9536  ultralytics.nn.AddModules.LSKblock.ResNetLayer_LSKblock  [3, 64, 1, True, 1]
  1                  -1  1    249810  ultralytics.nn.AddModules.LSKblock.ResNetLayer_LSKblock  [64, 64, 1, False, 3]
  2                  -1  1   1358616  ultralytics.nn.AddModules.LSKblock.ResNetLayer_LSKblock  [256, 128, 2, False, 4]
  3                  -1  1   7809188  ultralytics.nn.AddModules.LSKblock.ResNetLayer_LSKblock  [512, 256, 2, False, 6]
  4                  -1  1  16264786  ultralytics.nn.AddModules.LSKblock.ResNetLayer_LSKblock  [1024, 512, 2, False, 3]
  5                  -1  1    524800  ultralytics.nn.modules.conv.Conv             [2048, 256, 1, 1, None, 1, 1, False]
  6                  -1  1    789760  ultralytics.nn.modules.transformer.AIFI      [256, 1024, 8]                
  7                  -1  1     66048  ultralytics.nn.modules.conv.Conv             [256, 256, 1, 1]              
  8                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']          
  9                   3  1    262656  ultralytics.nn.modules.conv.Conv             [1024, 256, 1, 1, None, 1, 1, False]
 10            [-2, -1]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 11                  -1  3   2232320  ultralytics.nn.modules.block.RepC3           [512, 256, 3]                 
 12                  -1  1     66048  ultralytics.nn.modules.conv.Conv             [256, 256, 1, 1]              
 13                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']          
 14                   2  1    131584  ultralytics.nn.modules.conv.Conv             [512, 256, 1, 1, None, 1, 1, False]
 15            [-2, -1]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 16                  -1  3   2232320  ultralytics.nn.modules.block.RepC3           [512, 256, 3]                 
 17                  -1  1    590336  ultralytics.nn.modules.conv.Conv             [256, 256, 3, 2]              
 18            [-1, 12]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 19                  -1  3   2232320  ultralytics.nn.modules.block.RepC3           [512, 256, 3]                 
 20                  -1  1    590336  ultralytics.nn.modules.conv.Conv             [256, 256, 3, 2]              
 21             [-1, 7]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 22                  -1  3   2232320  ultralytics.nn.modules.block.RepC3           [512, 256, 3]                 
 23        [16, 19, 22]  1   7303907  ultralytics.nn.modules.head.RTDETRDecoder    [1, [256, 256, 256]]          