927 lines
26 KiB
Markdown
927 lines
26 KiB
Markdown
# nuScenes数据格式与实车部署标注指南
|
||
|
||
## 📋 目录
|
||
1. [nuScenes GT数据格式详解](#1-nuscenes-gt数据格式详解)
|
||
2. [BEVFusion输入输出格式](#2-bevfusion输入输出格式)
|
||
3. [实车相机配置变动方案](#3-实车相机配置变动方案)
|
||
4. [数据采集与标注流程](#4-数据采集与标注流程)
|
||
5. [工具与代码示例](#5-工具与代码示例)
|
||
|
||
---
|
||
|
||
## 1. nuScenes GT数据格式详解
|
||
|
||
### 1.1 数据组织结构
|
||
|
||
nuScenes数据集按照以下层级组织:
|
||
|
||
```
|
||
nuscenes/
|
||
├── v1.0-trainval/ # 元数据(JSON格式)
|
||
│ ├── sample.json # 关键帧样本
|
||
│ ├── sample_data.json # 传感器数据
|
||
│ ├── scene.json # 场景信息
|
||
│ ├── calibrated_sensor.json # 传感器标定
|
||
│ ├── ego_pose.json # 车辆位姿
|
||
│ ├── instance.json # 目标实例
|
||
│ ├── category.json # 目标类别
|
||
│ └── attribute.json # 目标属性
|
||
├── samples/ # 关键帧数据
|
||
│ ├── CAM_FRONT/
|
||
│ ├── CAM_FRONT_LEFT/
|
||
│ ├── CAM_FRONT_RIGHT/
|
||
│ ├── CAM_BACK/
|
||
│ ├── CAM_BACK_LEFT/
|
||
│ ├── CAM_BACK_RIGHT/
|
||
│ ├── LIDAR_TOP/
|
||
│ └── RADAR_FRONT/
|
||
├── sweeps/ # 中间帧数据
|
||
└── maps/ # HD地图数据
|
||
├── expansion/
|
||
└── basemap/
|
||
```
|
||
|
||
### 1.2 单个Sample的数据结构
|
||
|
||
BEVFusion使用的处理后的info格式 (pkl文件):
|
||
|
||
```python
|
||
{
|
||
'token': 'ca9a282c9e77460f8360f564131a8af5', # 样本唯一ID
|
||
'timestamp': 1532402927647951, # 时间戳(微秒)
|
||
'scene_token': 'scene-0001', # 场景ID
|
||
|
||
# ========== LiDAR数据 ==========
|
||
'lidar_path': 'samples/LIDAR_TOP/n015-2018-07-24-11-22-45+0800__LIDAR_TOP__1532402927647951.pcd.bin',
|
||
'sweeps': [ # 多帧LiDAR数据
|
||
{
|
||
'data_path': 'sweeps/LIDAR_TOP/...',
|
||
'sensor2lidar_rotation': [[...], [...], [...]], # 3x3旋转矩阵
|
||
'sensor2lidar_translation': [x, y, z],
|
||
'timestamp': 1532402927547951
|
||
},
|
||
# ... 共9帧sweep
|
||
],
|
||
|
||
# ========== LiDAR位姿 ==========
|
||
'lidar2ego_translation': [x, y, z], # LiDAR到车体坐标系
|
||
'lidar2ego_rotation': [w, x, y, z], # 四元数
|
||
'ego2global_translation': [x, y, z], # 车体到全局坐标系
|
||
'ego2global_rotation': [w, x, y, z],
|
||
|
||
# ========== 相机数据 (6个相机) ==========
|
||
'cams': {
|
||
'CAM_FRONT': {
|
||
'data_path': 'samples/CAM_FRONT/...',
|
||
'cam_intrinsic': [ # 3x3内参矩阵
|
||
[fx, 0, cx],
|
||
[0, fy, cy],
|
||
[0, 0, 1]
|
||
],
|
||
'sensor2ego_translation': [x, y, z], # 相机到车体
|
||
'sensor2ego_rotation': [w, x, y, z],
|
||
'sensor2lidar_rotation': [[...], [...], [...]], # 相机到LiDAR
|
||
'sensor2lidar_translation': [x, y, z],
|
||
'timestamp': 1532402927612460
|
||
},
|
||
'CAM_FRONT_RIGHT': { ... },
|
||
'CAM_FRONT_LEFT': { ... },
|
||
'CAM_BACK': { ... },
|
||
'CAM_BACK_LEFT': { ... },
|
||
'CAM_BACK_RIGHT': { ... }
|
||
},
|
||
|
||
# ========== 3D目标检测标注 ==========
|
||
'gt_boxes': [ # N x 9数组
|
||
# [x, y, z, w, l, h, yaw, vx, vy]
|
||
# x,y,z: 中心点(全局坐标)
|
||
# w,l,h: 宽度,长度,高度(米)
|
||
# yaw: 偏航角(弧度)
|
||
# vx,vy: 速度(m/s)
|
||
],
|
||
'gt_names': ['car', 'pedestrian', ...], # 类别名称
|
||
'gt_velocity': [[vx, vy], ...], # 速度(m/s)
|
||
'num_lidar_pts': [34, 12, ...], # 每个box内的点数
|
||
'num_radar_pts': [5, 2, ...], # 雷达点数
|
||
'valid_flag': [True, True, ...], # 是否有效
|
||
|
||
# ========== 目标类别(10类) ==========
|
||
# car, truck, construction_vehicle, bus, trailer,
|
||
# barrier, motorcycle, bicycle, pedestrian, traffic_cone
|
||
}
|
||
```
|
||
|
||
### 1.3 BEV语义分割标注
|
||
|
||
BEV分割标注从nuScenes地图数据动态生成:
|
||
|
||
```python
|
||
# 在LoadBEVSegmentation中实时生成
|
||
gt_masks_bev: np.ndarray # (num_classes, H, W)
|
||
# H=200, W=200 (对应100m x 100m, 分辨率0.5m)
|
||
# num_classes=6:
|
||
# [0] drivable_area - 可行驶区域
|
||
# [1] ped_crossing - 人行横道
|
||
# [2] walkway - 人行道
|
||
# [3] stop_line - 停止线
|
||
# [4] carpark_area - 停车区
|
||
# [5] divider - 分隔线
|
||
# 值: 0/1 (二值mask)
|
||
```
|
||
|
||
**生成过程**:
|
||
1. 从nuScenes Map API获取当前位置的地图数据
|
||
2. 根据LiDAR位姿确定BEV范围 (默认50m x 50m)
|
||
3. 渲染6个语义类别到200x200网格
|
||
4. 进行坐标变换和旋转对齐
|
||
|
||
|
||
---
|
||
|
||
## 5. 工具与代码示例
|
||
|
||
### 5.1 数据转换工具集
|
||
|
||
#### 5.1.1 自定义数据转nuScenes格式
|
||
|
||
```python
|
||
#!/usr/bin/env python3
|
||
# tools/data_converter/custom_to_nuscenes.py
|
||
|
||
import os
|
||
import pickle
|
||
import numpy as np
|
||
from tqdm import tqdm
|
||
from pyquaternion import Quaternion
|
||
from pathlib import Path
|
||
import json
|
||
|
||
class CustomDataConverter:
|
||
"""将实车采集数据转换为nuScenes格式"""
|
||
|
||
def __init__(
|
||
self,
|
||
custom_data_root: str,
|
||
output_root: str,
|
||
camera_names: list = None,
|
||
):
|
||
self.custom_root = Path(custom_data_root)
|
||
self.output_root = Path(output_root)
|
||
|
||
# 默认相机配置
|
||
self.camera_names = camera_names or [
|
||
'CAM_FRONT',
|
||
'CAM_FRONT_LEFT',
|
||
'CAM_FRONT_RIGHT',
|
||
'CAM_BACK'
|
||
]
|
||
|
||
# 加载标定数据
|
||
self.calib = self.load_calibration()
|
||
|
||
def load_calibration(self):
|
||
"""加载相机和LiDAR标定参数"""
|
||
calib_file = self.custom_root / 'calibration' / 'sensors.json'
|
||
with open(calib_file, 'r') as f:
|
||
calib = json.load(f)
|
||
return calib
|
||
|
||
def convert_sample(self, sample_id: str):
|
||
"""转换单个样本"""
|
||
|
||
# 1. 加载LiDAR数据
|
||
lidar_path = self.custom_root / 'lidar' / f'{sample_id}.pcd'
|
||
|
||
# 2. 加载相机图像
|
||
cam_data = {}
|
||
for cam_name in self.camera_names:
|
||
img_path = self.custom_root / 'camera' / cam_name / f'{sample_id}.jpg'
|
||
|
||
# 获取该相机的标定参数
|
||
cam_calib = self.calib['cameras'][cam_name]
|
||
|
||
# 计算变换矩阵
|
||
cam2ego = self._build_transform_matrix(
|
||
cam_calib['translation'],
|
||
cam_calib['rotation']
|
||
)
|
||
|
||
ego2lidar_inv = self._build_transform_matrix(
|
||
self.calib['lidar']['translation'],
|
||
self.calib['lidar']['rotation']
|
||
)
|
||
ego2lidar = np.linalg.inv(ego2lidar_inv)
|
||
|
||
cam2lidar = ego2lidar @ cam2ego
|
||
|
||
cam_data[cam_name] = {
|
||
'data_path': str(img_path.relative_to(self.output_root)),
|
||
'cam_intrinsic': np.array(cam_calib['intrinsic']),
|
||
'sensor2ego_translation': cam_calib['translation'],
|
||
'sensor2ego_rotation': cam_calib['rotation'],
|
||
'sensor2lidar_rotation': cam2lidar[:3, :3],
|
||
'sensor2lidar_translation': cam2lidar[:3, 3],
|
||
'timestamp': self._get_timestamp(sample_id)
|
||
}
|
||
|
||
# 3. 加载标注 (如果存在)
|
||
anno_file = self.custom_root / 'annotations' / f'{sample_id}.json'
|
||
if anno_file.exists():
|
||
with open(anno_file, 'r') as f:
|
||
annos = json.load(f)
|
||
|
||
gt_boxes, gt_names, gt_velocity = self._parse_annotations(annos)
|
||
else:
|
||
gt_boxes = np.zeros((0, 7))
|
||
gt_names = []
|
||
gt_velocity = np.zeros((0, 2))
|
||
|
||
# 4. 构建info字典
|
||
info = {
|
||
'token': sample_id,
|
||
'timestamp': self._get_timestamp(sample_id),
|
||
'lidar_path': str(lidar_path.relative_to(self.output_root)),
|
||
'sweeps': [], # 如果有多帧sweep,在这里添加
|
||
'cams': cam_data,
|
||
|
||
# LiDAR位姿
|
||
'lidar2ego_translation': self.calib['lidar']['translation'],
|
||
'lidar2ego_rotation': self.calib['lidar']['rotation'],
|
||
'ego2global_translation': [0, 0, 0], # 如果有GPS数据
|
||
'ego2global_rotation': [1, 0, 0, 0],
|
||
|
||
# 标注
|
||
'gt_boxes': gt_boxes,
|
||
'gt_names': gt_names,
|
||
'gt_velocity': gt_velocity,
|
||
'num_lidar_pts': np.array([0] * len(gt_boxes)), # 需要统计
|
||
'valid_flag': np.array([True] * len(gt_boxes)),
|
||
}
|
||
|
||
return info
|
||
|
||
def _build_transform_matrix(self, translation, rotation):
|
||
"""构建4x4变换矩阵"""
|
||
T = np.eye(4)
|
||
T[:3, :3] = Quaternion(rotation).rotation_matrix
|
||
T[:3, 3] = translation
|
||
return T
|
||
|
||
def _parse_annotations(self, annos):
|
||
"""解析标注数据"""
|
||
gt_boxes = []
|
||
gt_names = []
|
||
gt_velocity = []
|
||
|
||
for anno in annos['annotations']:
|
||
# nuScenes格式: [x, y, z, w, l, h, yaw]
|
||
box = [
|
||
anno['translation'][0],
|
||
anno['translation'][1],
|
||
anno['translation'][2],
|
||
anno['size'][0], # width
|
||
anno['size'][1], # length
|
||
anno['size'][2], # height
|
||
self._quaternion_to_yaw(anno['rotation'])
|
||
]
|
||
gt_boxes.append(box)
|
||
gt_names.append(anno['category_name'])
|
||
gt_velocity.append(anno.get('velocity', [0, 0]))
|
||
|
||
return (
|
||
np.array(gt_boxes),
|
||
np.array(gt_names),
|
||
np.array(gt_velocity)
|
||
)
|
||
|
||
def _quaternion_to_yaw(self, q):
|
||
"""四元数转偏航角"""
|
||
return Quaternion(q).yaw_pitch_roll[0]
|
||
|
||
def _get_timestamp(self, sample_id):
|
||
"""获取时间戳 (微秒)"""
|
||
# 从文件名或元数据中提取
|
||
# 示例: sample_id = "2024_01_15_10_30_25_123456"
|
||
# 返回Unix时间戳(微秒)
|
||
return int(sample_id.split('_')[-1])
|
||
|
||
def convert_all(self, split='train'):
|
||
"""转换所有数据"""
|
||
# 获取样本列表
|
||
sample_list_file = self.custom_root / f'{split}_samples.txt'
|
||
with open(sample_list_file, 'r') as f:
|
||
sample_ids = [line.strip() for line in f]
|
||
|
||
infos = []
|
||
for sample_id in tqdm(sample_ids, desc=f"Converting {split} set"):
|
||
info = self.convert_sample(sample_id)
|
||
infos.append(info)
|
||
|
||
# 保存为pkl
|
||
output_file = self.output_root / f'custom_infos_{split}.pkl'
|
||
data = {
|
||
'infos': infos,
|
||
'metadata': {
|
||
'version': 'custom-v1.0',
|
||
'num_cameras': len(self.camera_names),
|
||
'camera_names': self.camera_names
|
||
}
|
||
}
|
||
|
||
with open(output_file, 'wb') as f:
|
||
pickle.dump(data, f)
|
||
|
||
print(f"✅ 已转换 {len(infos)} 个样本到 {output_file}")
|
||
return output_file
|
||
|
||
|
||
# ========== 使用示例 ==========
|
||
if __name__ == '__main__':
|
||
converter = CustomDataConverter(
|
||
custom_data_root='/data/custom_vehicle_data',
|
||
output_root='/data/custom_nuscenes_format',
|
||
camera_names=['CAM_FRONT', 'CAM_LEFT', 'CAM_RIGHT', 'CAM_BACK']
|
||
)
|
||
|
||
# 转换训练集和验证集
|
||
converter.convert_all(split='train')
|
||
converter.convert_all(split='val')
|
||
```
|
||
|
||
#### 5.1.2 标定验证工具
|
||
|
||
```python
|
||
#!/usr/bin/env python3
|
||
# tools/verify_calibration.py
|
||
|
||
import numpy as np
|
||
import cv2
|
||
import open3d as o3d
|
||
from pathlib import Path
|
||
|
||
def verify_camera_lidar_calibration(
|
||
lidar_file: str,
|
||
image_file: str,
|
||
calib_file: str,
|
||
output_file: str = None
|
||
):
|
||
"""
|
||
验证相机-LiDAR外参标定精度
|
||
将LiDAR点投影到图像上,目视检查对齐情况
|
||
"""
|
||
|
||
# 1. 加载点云
|
||
pcd = o3d.io.read_point_cloud(lidar_file)
|
||
points = np.asarray(pcd.points) # (N, 3)
|
||
|
||
# 2. 加载图像
|
||
img = cv2.imread(image_file)
|
||
h, w = img.shape[:2]
|
||
|
||
# 3. 加载标定参数
|
||
import json
|
||
with open(calib_file, 'r') as f:
|
||
calib = json.load(f)
|
||
|
||
intrinsic = np.array(calib['intrinsic']) # (3, 3)
|
||
lidar2cam = np.array(calib['lidar2cam']) # (4, 4)
|
||
|
||
# 4. 投影到相机坐标系
|
||
points_homo = np.hstack([points, np.ones((len(points), 1))]) # (N, 4)
|
||
points_cam = points_homo @ lidar2cam.T # (N, 4)
|
||
points_cam = points_cam[:, :3] # (N, 3)
|
||
|
||
# 5. 过滤相机后方的点
|
||
mask = points_cam[:, 2] > 0
|
||
points_cam = points_cam[mask]
|
||
|
||
# 6. 投影到图像平面
|
||
points_img = points_cam @ intrinsic.T # (N, 3)
|
||
points_img = points_img[:, :2] / points_img[:, 2:] # (N, 2)
|
||
|
||
# 7. 过滤图像外的点
|
||
mask = (
|
||
(points_img[:, 0] >= 0) & (points_img[:, 0] < w) &
|
||
(points_img[:, 1] >= 0) & (points_img[:, 1] < h)
|
||
)
|
||
points_img = points_img[mask]
|
||
depths = points_cam[mask, 2]
|
||
|
||
# 8. 可视化
|
||
img_with_points = img.copy()
|
||
|
||
# 根据深度着色
|
||
depths_norm = (depths - depths.min()) / (depths.max() - depths.min())
|
||
colors = cv2.applyColorMap(
|
||
(depths_norm * 255).astype(np.uint8),
|
||
cv2.COLORMAP_JET
|
||
)
|
||
|
||
for i, (u, v) in enumerate(points_img.astype(int)):
|
||
color = tuple(map(int, colors[i][0]))
|
||
cv2.circle(img_with_points, (u, v), 2, color, -1)
|
||
|
||
# 9. 保存结果
|
||
if output_file:
|
||
cv2.imwrite(output_file, img_with_points)
|
||
print(f"✅ 验证结果保存至: {output_file}")
|
||
|
||
# 10. 计算重投影统计
|
||
print(f"\n标定验证统计:")
|
||
print(f" 投影点数: {len(points_img)}")
|
||
print(f" 深度范围: {depths.min():.2f}m - {depths.max():.2f}m")
|
||
print(f" 图像尺寸: {w}x{h}")
|
||
|
||
return img_with_points
|
||
|
||
|
||
# ========== 批量验证 ==========
|
||
def batch_verify_calibration(data_root: str, num_samples: int = 10):
|
||
"""批量验证多个样本的标定"""
|
||
|
||
data_path = Path(data_root)
|
||
calib_file = data_path / 'calibration' / 'CAM_FRONT.json'
|
||
|
||
# 随机选择样本
|
||
import random
|
||
sample_ids = list((data_path / 'lidar').glob('*.pcd'))
|
||
sample_ids = random.sample(sample_ids, min(num_samples, len(sample_ids)))
|
||
|
||
output_dir = data_path / 'calibration_verification'
|
||
output_dir.mkdir(exist_ok=True)
|
||
|
||
for lidar_file in sample_ids:
|
||
sample_id = lidar_file.stem
|
||
image_file = data_path / 'camera' / 'CAM_FRONT' / f'{sample_id}.jpg'
|
||
output_file = output_dir / f'{sample_id}_verification.jpg'
|
||
|
||
verify_camera_lidar_calibration(
|
||
str(lidar_file),
|
||
str(image_file),
|
||
str(calib_file),
|
||
str(output_file)
|
||
)
|
||
|
||
print(f"\n✅ 已验证 {num_samples} 个样本")
|
||
print(f" 结果保存在: {output_dir}")
|
||
|
||
|
||
if __name__ == '__main__':
|
||
import argparse
|
||
parser = argparse.ArgumentParser()
|
||
parser.add_argument('--data_root', required=True)
|
||
parser.add_argument('--num_samples', type=int, default=10)
|
||
args = parser.parse_args()
|
||
|
||
batch_verify_calibration(args.data_root, args.num_samples)
|
||
```
|
||
|
||
#### 5.1.3 BEV分割标注工具
|
||
|
||
```python
|
||
#!/usr/bin/env python3
|
||
# tools/annotate_bev_segmentation.py
|
||
|
||
import numpy as np
|
||
import cv2
|
||
from pathlib import Path
|
||
import json
|
||
|
||
class BEVSegmentationAnnotator:
|
||
"""BEV语义分割标注工具"""
|
||
|
||
def __init__(
|
||
self,
|
||
xbound=(-50, 50, 0.5),
|
||
ybound=(-50, 50, 0.5),
|
||
classes=None
|
||
):
|
||
self.xbound = xbound
|
||
self.ybound = ybound
|
||
self.classes = classes or [
|
||
'drivable_area',
|
||
'ped_crossing',
|
||
'walkway',
|
||
'stop_line',
|
||
'carpark_area',
|
||
'divider'
|
||
]
|
||
|
||
# 计算BEV网格大小
|
||
self.bev_h = int((ybound[1] - ybound[0]) / ybound[2]) # 200
|
||
self.bev_w = int((xbound[1] - xbound[0]) / xbound[2]) # 200
|
||
|
||
def render_bev_from_lidar(self, lidar_file: str, output_file: str):
|
||
"""从LiDAR点云渲染BEV俯视图"""
|
||
import open3d as o3d
|
||
|
||
# 加载点云
|
||
pcd = o3d.io.read_point_cloud(lidar_file)
|
||
points = np.asarray(pcd.points)
|
||
|
||
# 创建BEV图像
|
||
bev_img = np.zeros((self.bev_h, self.bev_w, 3), dtype=np.uint8)
|
||
|
||
# 将点投影到BEV网格
|
||
for point in points:
|
||
x, y, z = point
|
||
|
||
# 过滤范围外的点
|
||
if not (self.xbound[0] <= x < self.xbound[1] and
|
||
self.ybound[0] <= y < self.ybound[1]):
|
||
continue
|
||
|
||
# 转换为网格坐标
|
||
i = int((y - self.ybound[0]) / self.ybound[2])
|
||
j = int((x - self.xbound[0]) / self.xbound[2])
|
||
|
||
# 根据高度着色
|
||
height_norm = np.clip((z + 2) / 4, 0, 1) # -2m到2m
|
||
color = int(height_norm * 255)
|
||
bev_img[i, j] = [color, color, color]
|
||
|
||
# 翻转Y轴 (图像坐标系)
|
||
bev_img = np.flip(bev_img, axis=0).copy()
|
||
|
||
# 保存
|
||
cv2.imwrite(output_file, bev_img)
|
||
print(f"✅ BEV图像保存至: {output_file}")
|
||
|
||
return bev_img
|
||
|
||
def polygon_to_mask(
|
||
self,
|
||
polygon_json: str,
|
||
class_name: str,
|
||
output_file: str = None
|
||
):
|
||
"""将labelme标注的多边形转换为mask"""
|
||
|
||
# 加载labelme的JSON
|
||
with open(polygon_json, 'r') as f:
|
||
data = json.load(f)
|
||
|
||
# 创建mask
|
||
mask = np.zeros((self.bev_h, self.bev_w), dtype=np.uint8)
|
||
|
||
# 绘制多边形
|
||
for shape in data['shapes']:
|
||
if shape['label'] == class_name:
|
||
points = np.array(shape['points'], dtype=np.int32)
|
||
cv2.fillPoly(mask, [points], 1)
|
||
|
||
if output_file:
|
||
np.save(output_file, mask)
|
||
print(f"✅ Mask保存至: {output_file}")
|
||
|
||
return mask
|
||
|
||
def merge_all_classes(self, polygon_dir: str, output_file: str):
|
||
"""合并所有类别的mask"""
|
||
|
||
polygon_dir = Path(polygon_dir)
|
||
masks = np.zeros((len(self.classes), self.bev_h, self.bev_w), dtype=np.uint8)
|
||
|
||
for i, class_name in enumerate(self.classes):
|
||
json_file = polygon_dir / f'{class_name}.json'
|
||
if json_file.exists():
|
||
mask = self.polygon_to_mask(str(json_file), class_name)
|
||
masks[i] = mask
|
||
else:
|
||
print(f"⚠️ 未找到类别 {class_name} 的标注")
|
||
|
||
# 保存
|
||
np.save(output_file, masks)
|
||
print(f"✅ 完整BEV mask保存至: {output_file}")
|
||
print(f" 形状: {masks.shape}")
|
||
|
||
return masks
|
||
|
||
|
||
# ========== 使用流程 ==========
|
||
if __name__ == '__main__':
|
||
annotator = BEVSegmentationAnnotator()
|
||
|
||
# 步骤1: 渲染BEV图像
|
||
bev_img = annotator.render_bev_from_lidar(
|
||
'data/lidar/sample_001.pcd',
|
||
'data/bev_images/sample_001.png'
|
||
)
|
||
|
||
# 步骤2: 使用labelme标注多边形
|
||
# labelme data/bev_images/sample_001.png
|
||
# 为每个类别绘制多边形,保存为JSON
|
||
|
||
# 步骤3: 转换为mask
|
||
masks = annotator.merge_all_classes(
|
||
polygon_dir='data/bev_annotations/sample_001',
|
||
output_file='data/bev_masks/sample_001.npy'
|
||
)
|
||
|
||
print("✅ BEV分割标注完成!")
|
||
```
|
||
|
||
---
|
||
|
||
## 6. 实车部署完整流程
|
||
|
||
### 6.1 数据采集
|
||
|
||
```bash
|
||
# 1. 相机数据采集
|
||
roslaunch vehicle_perception camera_record.launch \
|
||
cameras:="front,left,right,back" \
|
||
fps:=10 \
|
||
output:=/data/raw/camera
|
||
|
||
# 2. LiDAR数据采集
|
||
roslaunch vehicle_perception lidar_record.launch \
|
||
topic:=/velodyne_points \
|
||
output:=/data/raw/lidar
|
||
|
||
# 3. 同步检查
|
||
python tools/check_sync.py \
|
||
--camera_dir /data/raw/camera \
|
||
--lidar_dir /data/raw/lidar \
|
||
--max_time_diff 0.05 # 50ms
|
||
```
|
||
|
||
### 6.2 标定
|
||
|
||
```bash
|
||
# 相机内参标定
|
||
python tools/calibrate_camera.py \
|
||
--images /data/calibration/checkerboard \
|
||
--pattern 9x6 \
|
||
--square_size 0.05 # 5cm
|
||
|
||
# LiDAR-相机外参标定
|
||
python tools/calibrate_lidar_camera.py \
|
||
--lidar /data/calibration/lidar.pcd \
|
||
--image /data/calibration/cam_front.jpg \
|
||
--target_type checkerboard \
|
||
--output calib/lidar2cam.json
|
||
|
||
# 验证标定
|
||
python tools/verify_calibration.py \
|
||
--data_root /data/raw \
|
||
--num_samples 10
|
||
```
|
||
|
||
### 6.3 数据转换
|
||
|
||
```bash
|
||
# 转换为nuScenes格式
|
||
python tools/data_converter/custom_to_nuscenes.py \
|
||
--custom_root /data/raw \
|
||
--output /data/nuscenes_format \
|
||
--camera_names CAM_FRONT CAM_LEFT CAM_RIGHT CAM_BACK
|
||
|
||
# 验证转换结果
|
||
python tools/check_data_integrity.py \
|
||
--data_root /data/nuscenes_format \
|
||
--check_images \
|
||
--check_lidar \
|
||
--check_annotations
|
||
```
|
||
|
||
### 6.4 标注
|
||
|
||
```bash
|
||
# 启动CVAT标注服务
|
||
cd cvat
|
||
docker-compose up -d
|
||
|
||
# 导入数据
|
||
python tools/import_to_cvat.py \
|
||
--data_root /data/nuscenes_format \
|
||
--task_name "vehicle_annotation_batch_001"
|
||
|
||
# 标注完成后导出
|
||
python tools/export_from_cvat.py \
|
||
--task_id 123 \
|
||
--output /data/annotations/batch_001.json
|
||
|
||
# 转换为训练格式
|
||
python tools/cvat_to_nuscenes_anno.py \
|
||
--cvat_json /data/annotations/batch_001.json \
|
||
--output /data/nuscenes_format/annotations
|
||
```
|
||
|
||
### 6.5 训练配置调整
|
||
|
||
```yaml
|
||
# configs/custom_vehicle/custom_4cam.yaml
|
||
|
||
# 继承基础配置
|
||
_base_:
|
||
- ../nuscenes/det/transfusion/secfpn/camera+lidar/swint_v0p075/multitask_enhanced_phase1_HIGHRES.yaml
|
||
|
||
# 修改相机数量
|
||
model:
|
||
encoders:
|
||
camera:
|
||
backbone:
|
||
# 使用预训练权重
|
||
init_cfg:
|
||
checkpoint: runs/enhanced_from_epoch19/epoch_10.pth
|
||
vtransform:
|
||
# 4相机配置
|
||
num_cameras: 4
|
||
|
||
# 数据配置
|
||
data:
|
||
train:
|
||
type: NuScenesDataset
|
||
dataset_root: /data/nuscenes_format
|
||
ann_file: /data/nuscenes_format/custom_infos_train.pkl
|
||
pipeline: ${train_pipeline}
|
||
|
||
val:
|
||
type: NuScenesDataset
|
||
dataset_root: /data/nuscenes_format
|
||
ann_file: /data/nuscenes_format/custom_infos_val.pkl
|
||
pipeline: ${val_pipeline}
|
||
|
||
# 训练参数 (迁移学习)
|
||
train:
|
||
max_epochs: 20 # 微调20个epoch
|
||
lr:
|
||
warmup_iters: 500
|
||
max_lr: 0.0001 # 降低学习率
|
||
```
|
||
|
||
### 6.6 微调训练
|
||
|
||
```bash
|
||
# 启动训练
|
||
torchpack dist-run -np 6 python tools/train.py \
|
||
configs/custom_vehicle/custom_4cam.yaml \
|
||
--model.encoders.camera.backbone.init_cfg.checkpoint=runs/enhanced_from_epoch19/epoch_10.pth \
|
||
--run-dir runs/custom_vehicle_finetune
|
||
|
||
# 监控训练
|
||
tensorboard --logdir runs/custom_vehicle_finetune
|
||
```
|
||
|
||
---
|
||
|
||
## 7. 常见问题与解决方案
|
||
|
||
### Q1: 相机数量不同如何处理?
|
||
|
||
```python
|
||
# 方案1: 修改模型输入层
|
||
# mmdet3d/models/encoders/camera_encoder.py
|
||
|
||
class CameraEncoder:
|
||
def __init__(self, num_cameras=6, ...): # 改为4
|
||
self.num_cameras = num_cameras
|
||
# 模型自动适配
|
||
|
||
# 方案2: 使用虚拟相机填充
|
||
# 如果模型期望6相机,实际只有4相机
|
||
# 可以复制前相机或使用黑图填充
|
||
```
|
||
|
||
### Q2: 分辨率不同如何处理?
|
||
|
||
```yaml
|
||
# 配置中调整
|
||
model:
|
||
encoders:
|
||
camera:
|
||
vtransform:
|
||
image_size: [384, 1024] # 从[256, 704]改为实际分辨率
|
||
feature_size: [48, 128] # 相应调整
|
||
```
|
||
|
||
### Q3: FOV差异如何处理?
|
||
|
||
```python
|
||
# 需要重新标定,获取准确内参
|
||
# 如果FOV差异>30度,建议重新训练
|
||
# 如果<30度,可以通过fine-tuning适配
|
||
```
|
||
|
||
### Q4: 没有BEV分割标注怎么办?
|
||
|
||
```python
|
||
# 方案1: 仅训练3D检测,关闭分割
|
||
model:
|
||
heads:
|
||
object: ... # 保留
|
||
map: null # 关闭
|
||
|
||
# 方案2: 使用预训练的分割头冻结
|
||
model:
|
||
heads:
|
||
map:
|
||
freeze: true # 冻结分割头权重
|
||
```
|
||
|
||
---
|
||
|
||
## 8. 性能优化建议
|
||
|
||
### 8.1 数据增强
|
||
|
||
```yaml
|
||
# 针对实车场景调整增强策略
|
||
augment2d:
|
||
resize: [[0.4, 0.6], [0.5, 0.5]] # 更大的resize范围
|
||
rotate: [-10, 10] # 更大的旋转
|
||
brightness: 0.2 # 亮度增强
|
||
contrast: 0.2 # 对比度增强
|
||
```
|
||
|
||
### 8.2 困难样本挖掘
|
||
|
||
```python
|
||
# 工具: 识别模型表现差的场景
|
||
python tools/find_hard_samples.py \
|
||
--checkpoint runs/custom_vehicle_finetune/epoch_10.pth \
|
||
--data_root /data/nuscenes_format \
|
||
--output hard_samples.txt
|
||
|
||
# 针对性增加这些场景的标注
|
||
```
|
||
|
||
### 8.3 评估指标
|
||
|
||
```bash
|
||
# 在实车数据上评估
|
||
python tools/test.py \
|
||
configs/custom_vehicle/custom_4cam.yaml \
|
||
runs/custom_vehicle_finetune/epoch_10.pth \
|
||
--eval bbox segm
|
||
|
||
# 分析性能
|
||
python tools/analyze_results.py \
|
||
--results results.pkl \
|
||
--by_scene \
|
||
--by_distance \
|
||
--by_weather
|
||
```
|
||
|
||
---
|
||
|
||
## 9. 总结
|
||
|
||
### 关键步骤检查清单
|
||
|
||
```
|
||
□ 数据采集
|
||
□ 相机数据完整 (4-6路)
|
||
□ LiDAR数据完整
|
||
□ 时间戳同步 (<50ms)
|
||
□ 场景覆盖充分
|
||
|
||
□ 传感器标定
|
||
□ 相机内参精度 (重投影误差<1px)
|
||
□ LiDAR-相机外参精度 (旋转<0.5°, 平移<2cm)
|
||
□ 标定验证通过
|
||
|
||
□ 数据转换
|
||
□ 转换为nuScenes格式
|
||
□ 数据完整性检查
|
||
□ 可视化验证
|
||
|
||
□ 标注
|
||
□ 3D box标注 (IoU>0.7)
|
||
□ BEV分割标注 (可选)
|
||
□ 质量审核通过
|
||
□ 数据集规模 (>5000帧)
|
||
|
||
□ 训练
|
||
□ 配置文件调整
|
||
□ 迁移学习/微调
|
||
□ 性能验证
|
||
□ 实车测试
|
||
```
|
||
|
||
### 最小可行部署
|
||
|
||
```
|
||
最小数据集: 5000帧
|
||
最小标注: 仅3D检测
|
||
训练时间: 2-3天 (6xV100)
|
||
性能预期: NDS 0.55-0.65
|
||
```
|
||
|
||
### 完整部署
|
||
|
||
```
|
||
推荐数据集: 10000-15000帧
|
||
完整标注: 3D检测 + BEV分割
|
||
训练时间: 5-7天
|
||
性能预期: NDS 0.65-0.75
|
||
```
|
||
|
||
---
|
||
|
||
**文档版本**: v1.0
|
||
**最后更新**: 2025-10-25
|
||
**适用于**: BEVFusion + 自定义传感器配置
|
||
|