# BEVFusion: Comparing the Two Implementations

## Overview of the Two BEVFusion Versions

### 1. MIT-BEVFusion (this project)

- **Source**: [MIT Han Lab](https://github.com/mit-han-lab/bevfusion)
- **Paper**: "BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation" (ICRA 2023)
- **Focus**: **fusion in BEV space** + **an efficient BEV pooling operator**
- **Highlight**: a BEV pooling operator with a reported 40x speedup

### 2. ADLab-BEVFusion

- **Source**: [ADLab-AutoDrive](https://github.com/ADLab-AutoDrive/BEVFusion)
- **Paper**: "BEVFusion: A Simple and Robust LiDAR-Camera Fusion Framework" (a different paper with the same short name)
- **Focus**: **robustness** + **LiDAR failure scenarios**
- **Highlight**: evaluates performance when the LiDAR input is degraded or fails

---

## Core Differences

| Item | MIT-BEVFusion (current) | ADLab-BEVFusion |
|------|-------------------------|-----------------|
| **Paper focus** | Unified BEV representation + efficiency | A simple, robust fusion framework |
| **Core contribution** | Efficient BEV pooling (40x speedup) | Robust fusion strategy |
| **Fusion method** | Fusion in BEV space | Fusion in BEV space |
| **Codebase** | Based on an early mmdet3d release | Based on mmdet3d 0.11.0 |
| **mmdet version** | 2.20.0 | 2.11.0 |
| **PyTorch version** | 1.9-1.10.2 | 1.7.0 |
| **Distinctive feature** | Multi-task (detection + segmentation) | LiDAR failure testing |
| **BEV pooling** | Custom optimized CUDA operator | Standard implementation |
| **Training flow** | End-to-end | Staged (Camera → LiDAR → Fusion) |

---

## Directory Structure Comparison

### MIT-BEVFusion (this project)

```
bevfusion/
├── configs/                  configuration files
│   ├── default.yaml
│   └── nuscenes/
│       ├── det/              detection configs
│       └── seg/              segmentation configs
├── mmdet3d/                  core code
│   ├── models/
│   │   ├── fusers/           fusion modules
│   │   │   ├── conv.py       ConvFuser
│   │   │   └── add.py        AddFuser
│   │   ├── vtransforms/      view transforms
│   │   │   ├── lss.py
│   │   │   └── depth_lss.py
│   │   └── fusion_models/
│   │       └── bevfusion.py
│   ├── ops/                  CUDA operators
│   │   ├── bev_pool/         ★ efficient BEV pooling
│   │   ├── spconv/           sparse convolution
│   │   └── voxel/            voxelization
│   └── ...
├── tools/
│   ├── train.py              training script
│   └── test.py               test script
└── docker/                   Docker setup
```

### ADLab-BEVFusion

```
BEVFusion/
├── configs/                  configuration files
│   ├── bevfusion/            BEVFusion configs
│   │   ├── cam_stream/       camera-stream training
│   │   ├── lidar_stream/     LiDAR-stream training
│   │   ├── drop_fov/         limited-FOV tests
│   │   └── drop_bbox/        object-failure tests
├── mmdet3d/                  core code
│   └── models/
│       ├── detectors/        detectors
│       └── ...
├── mmcv_custom/              custom mmcv components
├── mmdetection-2.11.0/       vendored mmdetection
├── requirements/             dependency management
├── tests/                    test code
└── tools/
    ├── dist_train.sh         distributed training script
    └── dist_test.sh          distributed test script
```

---

## Key Technical Differences

### 1. BEV Pooling Implementation

#### MIT version (current)
```
Efficient BEV pooling operator:
  Location: mmdet3d/ops/bev_pool/
  Implementation: custom CUDA kernel
  Performance: 40x faster than the original LSS pooling

Code (entry point as named in this project):
  from mmdet3d.ops import bev_pool_v2
  output = bev_pool_v2(depth, features, ranks, ...)

Advantages:
  ✅ Heavily optimized CUDA implementation
  ✅ Efficient in both memory and speed
  ✅ Supports FP16
```

#### ADLab version
```
Standard BEV pooling:
  Built from standard PyTorch operations
  Performance: baseline speed

Advantages:
  ✅ Simple, readable code
  ✅ Easy to modify and extend
  ✅ Few dependencies
```

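To make the comparison above concrete, here is a minimal, dependency-free sketch of what BEV pooling computes: every point feature is summed into the BEV cell it falls in. It is not the code of either repository. The function name `bev_pool_reference`, the `(row, col)` coordinate layout, and the toy data are all assumptions made for illustration. The sort-then-reduce structure is the part worth noticing: once points are sorted by cell rank, each cell's points form one contiguous run, which is the property an optimized kernel can exploit to reduce runs in parallel.

```python
from itertools import groupby


def bev_pool_reference(feats, coords, H, W):
    """Sum the feature vectors of all points that fall into the same BEV cell.

    feats:  list of N feature vectors (lists of floats, all the same length)
    coords: list of N (row, col) integer cell indices (hypothetical layout)
    Returns an H x W grid of summed feature vectors.
    """
    channels = len(feats[0]) if feats else 0
    bev = [[[0.0] * channels for _ in range(W)] for _ in range(H)]
    # 1) Give every point a scalar rank that identifies its BEV cell.
    ranked = sorted(
        ((r * W + c, f) for (r, c), f in zip(coords, feats)),
        key=lambda item: item[0],
    )
    # 2) After sorting, the points of one cell are contiguous: reduce each run.
    for rank, run in groupby(ranked, key=lambda item: item[0]):
        r, c = divmod(rank, W)
        for _, f in run:
            for k in range(channels):
                bev[r][c][k] += f[k]
    return bev


# Toy input: the first and third points share cell (0, 0).
feats = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
coords = [(0, 0), (1, 1), (0, 0)]
bev = bev_pool_reference(feats, coords, H=2, W=2)
```

Plain lists are used only to keep the sketch self-contained; a real implementation would operate on tensors.
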
### 2. Training Strategy

#### MIT version (current)
```
End-to-end training:
  1. Train the camera and LiDAR encoders jointly
  2. Initialize from a pretrained LiDAR-only model

Command:
  torchpack dist-run -np 8 python tools/train.py config.yaml \
    --model.encoders.camera.backbone.init_cfg.checkpoint camera_pretrain.pth \
    --load_from lidar-only-det.pth

Characteristics:
  - Fusion training completes in a single run
  - Simple and direct
```

#### ADLab version
```
Staged training (recommended flow):
  Stage 1: train the camera stream (nuImages dataset)
    ./tools/dist_train.sh configs/bevfusion/cam_stream/mask_rcnn_*.py 8

  Stage 2: train the camera BEV branch
    ./tools/dist_train.sh configs/bevfusion/cam_stream/bevf_pp_*_cam.py 8

  Stage 3: train the LiDAR stream
    ./tools/dist_train.sh configs/bevfusion/lidar_stream/hv_pointpillars_*.py 8

  Stage 4: fusion training
    ./tools/dist_train.sh configs/bevfusion/bevf_pp_*.py 8

Characteristics:
  - More stable
  - Each stage can be debugged on its own
  - Well suited to industrial use
```

### 3. Configuration System

#### MIT version (current)
```yaml
# Uses the torchpack configuration system
# YAML format with variable substitution

model:
  encoders:
    camera: ${camera_config}
    lidar: ${lidar_config}
  fuser:
    type: ConvFuser
  heads:
    object: ${detection_config}
    map: ${segmentation_config}

# Variables are referenced with the ${} syntax
point_cloud_range: ${point_cloud_range}
```

#### ADLab version
```python
# Uses the mmdetection configuration system
# Config files are Python modules

_base_ = [
    '../_base_/models/bevfusion.py',
    '../_base_/datasets/nus-3d.py',
    '../_base_/schedules/schedule_2x.py',
]

# Settings are written directly as Python code ("..." marks elided fields)
model = dict(
    type='BEVFusion',
    pts_voxel_layer=dict(...),
    pts_bbox_head=dict(...),
)
```

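The `${...}` substitution used by the YAML configs can be pictured with a small resolver. This is a sketch under stated assumptions, not the loader the project actually uses: the function `resolve`, the variable table, and the numeric values are hypothetical and exist only to show the idea of replacing placeholders in a nested config.

```python
import re

# Matches a ${name} placeholder and captures the name.
_VAR = re.compile(r"\$\{([^}]+)\}")


def resolve(node, variables):
    """Recursively replace ${name} placeholders in a nested config."""
    if isinstance(node, dict):
        return {k: resolve(v, variables) for k, v in node.items()}
    if isinstance(node, list):
        return [resolve(v, variables) for v in node]
    if isinstance(node, str):
        whole = _VAR.fullmatch(node)
        if whole:
            # The value is a single placeholder: keep the variable's type.
            return variables[whole.group(1)]
        # Placeholder embedded in a longer string: substitute as text.
        return _VAR.sub(lambda m: str(variables[m.group(1)]), node)
    return node


# Illustrative values only; they are not taken from either repository.
variables = {
    "point_cloud_range": [-54.0, -54.0, -5.0, 54.0, 54.0, 3.0],
    "voxel": 0.075,
}
config = {
    "point_cloud_range": "${point_cloud_range}",
    "model": {"fuser": {"type": "ConvFuser"}, "name": "voxel-${voxel}"},
}
resolved = resolve(config, variables)
```

An unknown variable name raises `KeyError` here, which is usually preferable to silently leaving a placeholder in the config.
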
### 4. Distinctive Features

#### MIT version (current)
```
1. Multi-task support
   - 3D detection and BEV segmentation in one model
   - Shared backbone

2. Efficient operators
   - Optimized BEV pooling (CUDA)
   - Optimized sparse convolution

3. Flexible configuration
   - Multiple backbones (ResNet, SwinTransformer, VoVNet)
   - Multiple view transforms (LSS, DepthLSS, BEVDepth)

4. Benchmark results
   Reported first place on the Waymo leaderboard at release time
   Reported first place on nuScenes for both detection and segmentation at release time
```

#### ADLab version
```
1. Robustness tests
   - Limited LiDAR FOV
   - LiDAR object failure
   - Evaluates how robust the fusion framework is

2. Practical design
   - Simple training flow
   - Industrial-grade implementation

3. Multiple configurations
   - PointPillars variant
   - CenterPoint variant
   - TransFusion variant

4. Experimental settings
   Limited FOV: (-π/3, π/3) or (-π/2, π/2)
   Object failure: randomly drop 50% of foreground objects
```

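The two failure settings listed above can be sketched as simple filters. This is a hypothetical illustration, not the ADLab implementation: the helper names `limit_fov` and `drop_objects` are made up, points are plain `(x, y, z)` tuples, azimuth is assumed to be measured from the +x axis, and objects are represented by string ids rather than by the points inside their boxes.

```python
import math
import random


def limit_fov(points, half_angle):
    """Keep points whose azimuth lies in (-half_angle, half_angle).

    Assumption: azimuth is atan2(y, x), i.e. 0 rad points along +x.
    """
    return [p for p in points if abs(math.atan2(p[1], p[0])) < half_angle]


def drop_objects(objects, drop_ratio=0.5, seed=0):
    """Randomly discard a fraction of the foreground objects."""
    rng = random.Random(seed)  # seeded so the experiment is repeatable
    keep = len(objects) - int(round(len(objects) * drop_ratio))
    return rng.sample(objects, keep)


# Four toy points at azimuth 0, pi/4, pi/2 and pi.
points = [(10.0, 0.0, 0.0), (5.0, 5.0, 0.0), (0.0, 10.0, 0.0), (-10.0, 0.0, 0.0)]
front = limit_fov(points, math.pi / 3)  # the (-pi/3, pi/3) setting

objects = ["car_0", "car_1", "ped_0", "ped_1"]
kept = drop_objects(objects)  # half of the objects survive
```
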
---

## Performance Comparison

### MIT-BEVFusion (results for this project)

**nuScenes Validation**:

| Model | Modality | mAP | NDS |
|-------|----------|-----|-----|
| BEVFusion | C+L | 68.52 | 71.38 |
| Camera-Only | C | 35.56 | 41.21 |
| LiDAR-Only | L | 64.68 | 69.28 |

**BEV segmentation**:

| Model | mIoU |
|-------|------|
| BEVFusion | 62.95% |

### ADLab-BEVFusion (figures taken from the project website)

**nuScenes Validation**:

| Model | Head | 3D Backbone | 2D Backbone | mAP | NDS |
|-------|------|-------------|-------------|-----|-----|
| BEVFusion | PointPillars | PointPillars | Dual-Swin-T | 52.9 | 61.6 |
| BEVFusion | CenterPoint | VoxelNet | Dual-Swin-T | 60.9 | 67.5 |
| BEVFusion* | TransFusion-L | VoxelNet | Dual-Swin-T | **69.6** | **72.1** |

*With BEV-space data augmentation

**LiDAR failure scenarios**:

| Scenario | mAP | NDS |
|----------|-----|-----|
| Limited FOV (-π/3, π/3) | 41.5 | 50.8 |
| Limited FOV (-π/2, π/2) | 46.4 | 55.8 |
| 50% of objects dropped | 50.3 | 57.6 |

---

## Code Implementation Differences

### 1. Location of the Model Architecture Files

#### MIT version (current)
```
mmdet3d/models/
├── fusion_models/
│   ├── base.py
│   └── bevfusion.py          ← main model
├── fusers/
│   ├── conv.py               ← ConvFuser
│   └── add.py                ← AddFuser
├── vtransforms/
│   ├── lss.py
│   ├── depth_lss.py
│   └── aware_bevdepth.py
```

#### ADLab version
```
mmdet3d/models/
├── detectors/
│   └── bevfusion.py          ← inherits from mvx_two_stage
├── fusion_layers/
│   └── point_fusion.py       ← point-cloud fusion
├── backbones/
│   └── dual_swin.py          ← Dual-Swin backbone
```

### 2. Organization of the Config Files

#### MIT version (current)
```
configs/
├── default.yaml                      global settings
└── nuscenes/
    ├── det/                          detection configs
    │   ├── centerhead/
    │   └── transfusion/
    │       └── secfpn/
    │           ├── camera/           camera only
    │           ├── lidar/            LiDAR only
    │           └── camera+lidar/     fusion
    └── seg/                          segmentation configs
        ├── camera-bev256d2.yaml
        └── fusion-bev256d2-lss.yaml

Characteristics:
  - YAML format
  - Variable substitution (${variable})
  - Organized by task
```

#### ADLab version
```
configs/
├── _base_/                       base configs
│   ├── models/
│   ├── datasets/
│   └── schedules/
└── bevfusion/
    ├── cam_stream/               camera training configs
    │   ├── mask_rcnn_*.py        2D detection
    │   └── bevf_pp_*_cam.py      BEV camera
    ├── lidar_stream/             LiDAR training configs
    │   └── hv_pointpillars_*.py
    ├── bevf_pp_*.py              fusion config (PointPillars)
    ├── bevf_cp_*.py              fusion config (CenterPoint)
    ├── bevf_tf_*.py              fusion config (TransFusion)
    ├── drop_fov/                 limited-FOV tests
    └── drop_bbox/                object-failure tests

Characteristics:
  - Python format
  - Standard mmdetection config inheritance
  - Organized by training stage
```

---

## Training Flow Comparison

### MIT version (current) - end-to-end

```bash
# One step (assumes both pretrained checkpoints are already available)
torchpack dist-run -np 8 python tools/train.py \
  configs/nuscenes/det/transfusion/secfpn/camera+lidar/swint_v0p075/convfuser.yaml \
  --model.encoders.camera.backbone.init_cfg.checkpoint pretrained/swint-nuimages-pretrained.pth \
  --load_from pretrained/lidar-only-det.pth
```

Pros:
- ✅ Simple: a single command
- ✅ Quick to get started

Cons:
- ⚠️ If the run fails, the whole process starts over

### ADLab version - progressive

```bash
# Step 1: train the 2D detection backbone (nuImages dataset)
./tools/dist_train.sh \
  configs/bevfusion/cam_stream/mask_rcnn_dbswin-t_fpn_3x_nuim_cocopre.py 8

# Step 2: train the camera BEV branch
./tools/dist_train.sh \
  configs/bevfusion/cam_stream/bevf_pp_4x8_2x_nusc_cam.py 8

# Step 3: train the LiDAR branch
./tools/dist_train.sh \
  configs/bevfusion/lidar_stream/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d.py 8

# Step 4: fusion training
./tools/dist_train.sh \
  configs/bevfusion/bevf_pp_2x8_1x_nusc.py 8
```

Pros:
- ✅ Stages are independent and easy to debug
- ✅ More stable
- ✅ Each stage can be tuned on its own

Cons:
- ⚠️ Requires several steps
- ⚠️ Longer total training time

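The practical benefit of the staged flow is that a failure only costs the stage that failed. The sketch below shows one way to drive the four commands above in order and resume from a named stage. The driver itself (`run_stages`, the stage labels, the `start_at` argument) is hypothetical and not part of either repository; only the script path and config paths come from the commands above. Nothing is executed until `run_stages(STAGES)` is called, for example `run_stages(STAGES, start_at="lidar")` to resume after a failed LiDAR stage.

```python
import subprocess

TRAIN = "./tools/dist_train.sh"
CFG = "configs/bevfusion"

# (label, command) pairs; the labels are made up for this sketch.
STAGES = [
    ("2d-backbone", [TRAIN, f"{CFG}/cam_stream/mask_rcnn_dbswin-t_fpn_3x_nuim_cocopre.py", "8"]),
    ("camera-bev", [TRAIN, f"{CFG}/cam_stream/bevf_pp_4x8_2x_nusc_cam.py", "8"]),
    ("lidar", [TRAIN, f"{CFG}/lidar_stream/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d.py", "8"]),
    ("fusion", [TRAIN, f"{CFG}/bevf_pp_2x8_1x_nusc.py", "8"]),
]


def run_stages(stages, start_at=None, runner=subprocess.run):
    """Run stages in order, optionally resuming at `start_at`.

    Stops at the first stage whose command exits with a non-zero code.
    `runner` is injectable so the driver can be dry-run without training.
    """
    names = [name for name, _ in stages]
    first = names.index(start_at) if start_at else 0
    done = []
    for name, cmd in stages[first:]:
        result = runner(cmd)
        if result.returncode != 0:
            raise RuntimeError(f"stage {name!r} failed; rerun with start_at={name!r}")
        done.append(name)
    return done
```
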
---

## Key Code Differences

### BEV Pooling Operator

#### MIT version (current) - heavily optimized
```python
# mmdet3d/ops/bev_pool/bev_pool.py
# Names follow this project. Upstream mit-han-lab code exposes
# bev_pool(feats, coords, B, D, H, W), so verify against your checkout.
from . import bev_pool_v2  # compiled CUDA extension

def bev_pool(depth, feat, ranks_depth, ranks_feat, *args):
    # *args stands for the remaining arguments elided in the original.
    # Dispatch to the optimized CUDA kernel.
    output = bev_pool_v2(depth, feat, ranks_depth, ranks_feat, *args)
    return output
```

Performance:
- Speed: 40x faster than the original LSS pooling
- Memory: optimized memory management
- Implementation: C++/CUDA

#### ADLab version - standard implementation
```python
# Illustrative sketch using standard PyTorch ops (not copied from the ADLab source)
import torch

def bev_pool(depth, feat, geometry, num_cells):
    # depth: (D, P) depth probabilities; feat: (P, C) image features
    # geometry: (D, P) long tensor, flat BEV cell index of each (depth bin, pixel)
    bev_feat = feat.new_zeros(num_cells, feat.shape[1])
    for d in range(depth.shape[0]):
        # Weight features by depth probability, then scatter-add into BEV cells
        bev_feat.index_add_(0, geometry[d], depth[d].unsqueeze(1) * feat)
    return bev_feat
```

Performance:
- Speed: standard PyTorch speed
- Memory: standard
- Implementation: pure Python/PyTorch

### Fuser Implementation

#### MIT version (current) - modular
```python
# mmdet3d/models/fusers/conv.py
import torch
from torch import nn

from mmdet3d.models.builder import FUSERS  # registry import path assumed; check your checkout


@FUSERS.register_module()
class ConvFuser(nn.Sequential):
    def __init__(self, in_channels, out_channels):
        super().__init__(
            nn.Conv2d(sum(in_channels), out_channels, 3, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(True),
        )

    def forward(self, inputs):
        # Concatenate the per-sensor BEV features along the channel axis
        return super().forward(torch.cat(inputs, dim=1))

# Usage (YAML config):
#   model:
#     fuser:
#       type: ConvFuser  # or AddFuser
#       in_channels: [80, 256]
#       out_channels: 256
```

#### ADLab version - integrated into the model
```python
# Schematic: the fusion logic lives directly inside the BEVFusion model class
class BEVFusion(MVXTwoStageDetector):
    def forward(self, *args, **kwargs):
        # camera features
        cam_bev = self.extract_cam_feat(...)
        # LiDAR features
        lidar_bev = self.extract_pts_feat(...)
        # fusion
        fused = self.fusion_layer(cam_bev, lidar_bev)
        ...
```

Characteristics:
- The fusion logic is coupled to the model
- Switching fusion strategies is harder

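The modularity argument rests on the registry pattern: a config names a class by its `type` string and a builder looks it up. The toy registry below imitates that idea in plain Python so it can run without mmcv or PyTorch. Everything in it is hypothetical (`register_fuser`, `build_fuser`, `ToyConcatFuser`, `ToyAddFuser`), and the toy fusers work on flat lists of floats rather than BEV tensors; they are stand-ins, not the real `ConvFuser` or `AddFuser`. Switching strategy only requires changing the `type` value in the config, which is the property the modular design provides.

```python
# Minimal stand-in for a registry: maps class names to classes.
FUSERS = {}


def register_fuser(cls):
    """Class decorator: make the class selectable by name from a config."""
    FUSERS[cls.__name__] = cls
    return cls


@register_fuser
class ToyConcatFuser:
    """Concatenates per-sensor features along the channel axis."""

    def __init__(self, in_channels):
        self.out_channels = sum(in_channels)

    def __call__(self, inputs):
        return [v for feat in inputs for v in feat]


@register_fuser
class ToyAddFuser:
    """Adds per-sensor features element-wise."""

    def __init__(self, in_channels):
        if len(set(in_channels)) != 1:
            raise ValueError("element-wise addition needs equal channel counts")
        self.out_channels = in_channels[0]

    def __call__(self, inputs):
        return [sum(vals) for vals in zip(*inputs)]


def build_fuser(cfg):
    """Instantiate the class named by cfg['type'] with the remaining keys."""
    cfg = dict(cfg)  # copy so the caller's config is not mutated
    return FUSERS[cfg.pop("type")](**cfg)


camera_feat, lidar_feat = [1.0, 2.0], [10.0, 20.0]
concat = build_fuser({"type": "ToyConcatFuser", "in_channels": [2, 2]})
added = build_fuser({"type": "ToyAddFuser", "in_channels": [2, 2]})
```
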
---

## Feature Comparison

### MIT version (current) - multi-task

```yaml
# Train detection and segmentation together
model:
  heads:
    object:
      type: TransFusionHead
      # detection settings
    map:
      type: BEVSegmentationHead
      # segmentation settings
```

A single model outputs:
- 3D detection boxes
- BEV segmentation maps

### ADLab version - robustness tests

```
# drop_fov/: limited LiDAR FOV tests
configs/bevfusion/drop_fov/fov60_bevf_tf_*.py
configs/bevfusion/drop_fov/fov90_bevf_tf_*.py

# drop_bbox/: foreground object failure tests
configs/bevfusion/drop_bbox/halfbox_bevf_tf_*.py

Test scenarios:
  1. LiDAR FOV reduced from 360° to ±60° (-π/3, π/3) or ±90° (-π/2, π/2)
  2. Randomly drop the points of 50% of the foreground objects
  3. Evaluate the robustness of the fusion framework

Results:
  - FOV ±60°: mAP 41.5 (vs. 69.6 with full LiDAR input)
  - 50% of objects dropped: mAP 50.3 (vs. 69.6 with full LiDAR input)

The retained accuracy suggests the camera branch can partly compensate for LiDAR failures
```

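To put the robustness numbers in proportion, the short calculation below expresses each failure scenario as a relative mAP loss against the 69.6 baseline quoted above. The mAP values come from this document's tables; the helper `relative_drop` is only an illustration. Note that this measures degradation of the fusion model alone. Judging how much the camera compensates would also need LiDAR-only results under the same failures, which are not listed here.

```python
BASELINE_MAP = 69.6  # fusion model with full LiDAR input


def relative_drop(degraded, baseline=BASELINE_MAP):
    """Fraction of the baseline mAP lost under a failure scenario."""
    return (baseline - degraded) / baseline


scenarios = {
    "FOV (-pi/3, pi/3)": 41.5,
    "FOV (-pi/2, pi/2)": 46.4,
    "50% objects dropped": 50.3,
}

# Relative loss in percent, rounded to one decimal place.
drops = {name: round(100 * relative_drop(m), 1) for name, m in scenarios.items()}

for name, pct in drops.items():
    print(f"{name}: {pct}% of baseline mAP lost")
```
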
---

## Dependency Differences

### MIT version (current)
```
Python >= 3.8, < 3.9
PyTorch >= 1.9, <= 1.10.2
mmcv = 1.4.0
mmdet = 2.20.0
torchpack (required)
CUDA 11.3

Notes:
  - Newer PyTorch versions
  - Requires torchpack
```

### ADLab version
```
Python = 3.8.3
PyTorch = 1.7.0
mmcv = 1.4.0
mmdet = 2.11.0 (vendored in the repository)
torchpack not required
CUDA 10.2/11.0

Notes:
  - Older but stable PyTorch version
  - mmdetection is vendored, so dependency management is simpler
```

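Because the two repositories pin different versions, a quick pre-flight check can save a failed build. The sketch below checks a set of version strings against the MIT constraints listed above. The helper names (`parse`, `in_range`, `check_mit_env`) and the example version strings are hypothetical; in practice the strings would come from `platform.python_version()` and each package's `__version__`.

```python
def parse(version):
    """'1.10.2' -> (1, 10, 2); ignores local suffixes such as '+cu113'."""
    return tuple(int(p) for p in version.split("+")[0].split(".") if p.isdigit())


def in_range(version, low, high, high_inclusive=True):
    """True if low <= version <= high (or < high when not inclusive)."""
    v = parse(version)
    upper_ok = v <= parse(high) if high_inclusive else v < parse(high)
    return parse(low) <= v and upper_ok


def check_mit_env(python, torch, mmcv, mmdet):
    """Compare installed versions with the MIT constraints from this document."""
    return {
        "python": in_range(python, "3.8", "3.9", high_inclusive=False),
        "torch": in_range(torch, "1.9", "1.10.2"),
        "mmcv": parse(mmcv) == parse("1.4.0"),
        "mmdet": parse(mmdet) == parse("2.20.0"),
    }


# Example: an environment that still has the ADLab mmdet version installed.
report = check_mit_env(python="3.8.10", torch="1.10.1+cu113", mmcv="1.4.0", mmdet="2.11.0")
```
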
---

## When to Use Which

### MIT-BEVFusion (current) is a good fit for:

1. **Research and paper reproduction**
   - Chasing state-of-the-art accuracy
   - Multi-task support is needed
   - Inference speed matters

2. **Edge deployment**
   - Benefits from the optimized BEV pooling
   - Real-time performance is required
   - TensorRT deployment

3. **Fast experimentation**
   - End-to-end training
   - Quick iteration
   - Simple configuration

### ADLab-BEVFusion is a good fit for:

1. **Industrial applications**
   - Stability and robustness are required
   - Multi-stage training is easier to control
   - Sensor-failure scenarios must be tested

2. **Sensor-failure research**
   - Studying LiDAR failure scenarios
   - Evaluating fusion robustness
   - Safety-critical systems

3. **Teaching and learning**
   - More conventional code structure (follows mmdet3d conventions)
   - Easy to understand and modify
   - No torchpack dependency

---

## Detailed Directory Comparison

### Directories unique to this project (MIT)

```
mmdet3d/ops/bev_pool/             ★ efficient BEV pooling CUDA operator
mmdet3d/models/vtransforms/       view-transform modules (several variants)
mmdet3d/models/fusers/            standalone fusion modules
mmdet3d/models/fusion_models/     fusion model base classes
```

### Directories unique to the ADLab version

```
.dev_scripts/                     development scripts
mmcv_custom/                      custom mmcv components
mmdetection-2.11.0/               vendored mmdetection
requirements/                     detailed dependency management
├── build.txt
├── optional.txt
├── runtime.txt
└── tests.txt
tests/                            full test suite
demo/                             demo scripts
docs/                             full documentation
configs/bevfusion/
├── drop_fov/                     ★ limited-FOV tests
└── drop_bbox/                    ★ object-failure tests
```

---

## Training Script Differences

### MIT version (current)
```bash
# With torchpack
torchpack dist-run -np 8 python tools/train.py config.yaml

# With plain Python
python tools/train.py config.yaml
```

### ADLab version
```bash
# With the standard bash script
./tools/dist_train.sh config.py 8

# which internally calls
python -m torch.distributed.launch \
  --nproc_per_node=8 \
  tools/train.py config.py
```