
nuScenes Data Format and Real-Vehicle Deployment Annotation Guide

📋 Table of Contents

  1. nuScenes GT Data Format in Detail
  2. BEVFusion Input and Output Formats
  3. Real-Vehicle Camera Configuration Changes
  4. Data Collection and Annotation Workflow
  5. Tools and Code Examples
  6. Complete Real-Vehicle Deployment Workflow
  7. Common Issues and Solutions
  8. Performance Optimization Suggestions
  9. Summary

1. nuScenes GT Data Format in Detail

1.1 Data Organization

The nuScenes dataset is organized in the following hierarchy:

nuscenes/
├── v1.0-trainval/          # Metadata (JSON)
│   ├── sample.json         # Keyframe samples
│   ├── sample_data.json    # Sensor data records
│   ├── scene.json          # Scene information
│   ├── calibrated_sensor.json  # Sensor calibration
│   ├── ego_pose.json       # Ego-vehicle poses
│   ├── instance.json       # Object instances
│   ├── category.json       # Object categories
│   └── attribute.json      # Object attributes
├── samples/                # Keyframe data
│   ├── CAM_FRONT/
│   ├── CAM_FRONT_LEFT/
│   ├── CAM_FRONT_RIGHT/
│   ├── CAM_BACK/
│   ├── CAM_BACK_LEFT/
│   ├── CAM_BACK_RIGHT/
│   ├── LIDAR_TOP/
│   └── RADAR_FRONT/
├── sweeps/                 # Intermediate (non-keyframe) frames
└── maps/                   # HD map data
    ├── expansion/
    └── basemap/
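
The JSON tables above are linked by tokens. A minimal sketch using the official nuscenes-devkit shows how a sample, its front-camera record, and the corresponding calibration and pose entries chain together (assumes the devkit is installed and /data/nuscenes is a placeholder data root):

# pip install nuscenes-devkit
from nuscenes.nuscenes import NuScenes

nusc = NuScenes(version='v1.0-trainval', dataroot='/data/nuscenes')  # assumed data root

sample = nusc.sample[0]                                    # one entry of sample.json
sd = nusc.get('sample_data', sample['data']['CAM_FRONT'])  # sample_data.json record
cs = nusc.get('calibrated_sensor', sd['calibrated_sensor_token'])  # calibrated_sensor.json entry
pose = nusc.get('ego_pose', sd['ego_pose_token'])           # ego_pose.json record

print(sd['filename'])          # e.g. samples/CAM_FRONT/....jpg
print(cs['camera_intrinsic'])  # 3x3 intrinsic matrix
print(pose['translation'])     # ego position in the global frame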

1.2 Data Structure of a Single Sample

The processed info format consumed by BEVFusion (pkl file):

{
    'token': 'ca9a282c9e77460f8360f564131a8af5',  # unique sample ID
    'timestamp': 1532402927647951,                  # timestamp (microseconds)
    'scene_token': 'scene-0001',                    # scene ID
    
    # ========== LiDAR data ==========
    'lidar_path': 'samples/LIDAR_TOP/n015-2018-07-24-11-22-45+0800__LIDAR_TOP__1532402927647951.pcd.bin',
    'sweeps': [                                     # multi-frame LiDAR sweeps
        {
            'data_path': 'sweeps/LIDAR_TOP/...',
            'sensor2lidar_rotation': [[...], [...], [...]],  # 3x3 rotation matrix
            'sensor2lidar_translation': [x, y, z],
            'timestamp': 1532402927547951
        },
        # ... 9 sweeps in total
    ],
    
    # ========== LiDAR pose ==========
    'lidar2ego_translation': [x, y, z],             # LiDAR -> ego-vehicle frame
    'lidar2ego_rotation': [w, x, y, z],             # quaternion
    'ego2global_translation': [x, y, z],            # ego-vehicle -> global frame
    'ego2global_rotation': [w, x, y, z],
    
    # ========== Camera data (6 cameras) ==========
    'cams': {
        'CAM_FRONT': {
            'data_path': 'samples/CAM_FRONT/...',
            'cam_intrinsic': [                      # 3x3 intrinsic matrix
                [fx, 0,  cx],
                [0,  fy, cy],
                [0,  0,  1]
            ],
            'sensor2ego_translation': [x, y, z],    # camera -> ego-vehicle frame
            'sensor2ego_rotation': [w, x, y, z],
            'sensor2lidar_rotation': [[...], [...], [...]],  # camera -> LiDAR
            'sensor2lidar_translation': [x, y, z],
            'timestamp': 1532402927612460
        },
        'CAM_FRONT_RIGHT': { ... },
        'CAM_FRONT_LEFT': { ... },
        'CAM_BACK': { ... },
        'CAM_BACK_LEFT': { ... },
        'CAM_BACK_RIGHT': { ... }
    },
    
    # ========== 3D detection annotations ==========
    'gt_boxes': [                                   # N x 7 array
        # [x, y, z, w, l, h, yaw]
        # x, y, z: box center (in the LiDAR frame of the processed infos)
        # w, l, h: width, length, height (meters)
        # yaw: heading angle (radians)
        # velocity (vx, vy) is stored separately in gt_velocity and is
        # concatenated onto the boxes (9-dim) when the dataset is loaded
    ],
    'gt_names': ['car', 'pedestrian', ...],         # class names
    'gt_velocity': [[vx, vy], ...],                 # velocities (m/s)
    'num_lidar_pts': [34, 12, ...],                 # LiDAR points inside each box
    'num_radar_pts': [5, 2, ...],                   # radar points inside each box
    'valid_flag': [True, True, ...],                # whether the annotation is valid
    
    # ========== Object classes (10 classes) ==========
    # car, truck, construction_vehicle, bus, trailer,
    # barrier, motorcycle, bicycle, pedestrian, traffic_cone
}
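
How these fields fit together at load time: the per-camera LiDAR-to-image projection used for camera-LiDAR fusion can be assembled from cam_intrinsic and the sensor2lidar transform. A minimal sketch, assuming an info pkl with the structure above (the file path is a placeholder):

import pickle
import numpy as np

with open('data/nuscenes/nuscenes_infos_train.pkl', 'rb') as f:  # placeholder path
    info = pickle.load(f)['infos'][0]

cam = info['cams']['CAM_FRONT']

# camera -> LiDAR as a 4x4 homogeneous transform
cam2lidar = np.eye(4)
cam2lidar[:3, :3] = np.array(cam['sensor2lidar_rotation'])
cam2lidar[:3, 3] = np.array(cam['sensor2lidar_translation'])

# LiDAR -> camera, then camera -> pixels
lidar2cam = np.linalg.inv(cam2lidar)
K = np.eye(4)
K[:3, :3] = np.array(cam['cam_intrinsic'])
lidar2img = K @ lidar2cam  # projects homogeneous LiDAR-frame points onto the image plane

pt = np.array([10.0, 0.0, 0.0, 1.0])   # example: a LiDAR-frame point 10 m ahead
uvw = lidar2img @ pt
u, v = uvw[0] / uvw[2], uvw[1] / uvw[2]
print(u, v)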

1.3 BEV Semantic Segmentation Annotations

BEV segmentation labels are generated on the fly from the nuScenes map data:

# Generated on the fly in LoadBEVSegmentation
gt_masks_bev: np.ndarray              # (num_classes, H, W)
    # H=200, W=200 (covers 100m x 100m at 0.5m resolution)
    # num_classes=6:
    #   [0] drivable_area   - drivable area
    #   [1] ped_crossing    - pedestrian crossing
    #   [2] walkway         - walkway
    #   [3] stop_line       - stop line
    #   [4] carpark_area    - parking area
    #   [5] divider         - lane/road divider
    # values: 0/1 (binary masks)

Generation process:

  1. Query the nuScenes Map API for the map data around the current position
  2. Determine the BEV patch from the LiDAR pose (default ±50m in x and y, i.e. 100m x 100m)
  3. Rasterize the 6 semantic classes onto the 200x200 grid
  4. Apply the coordinate transform and rotation alignment (see the sketch below)
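
A minimal sketch of this rasterization using the nuscenes-devkit map API (the map name, ego pose, and patch size are placeholder values; this illustrates the approach rather than reproducing BEVFusion's LoadBEVSegmentation, which additionally merges road_divider and lane_divider into the divider class as done here):

import numpy as np
from nuscenes.map_expansion.map_api import NuScenesMap

nusc_map = NuScenesMap(dataroot='/data/nuscenes', map_name='singapore-onenorth')

# Patch centered on the ego position (placeholder values), 100m x 100m, aligned with the heading
ego_x, ego_y, ego_yaw_deg = 600.0, 1600.0, 30.0
patch_box = (ego_x, ego_y, 100.0, 100.0)          # (center_x, center_y, height, width) in meters
layers = ['drivable_area', 'ped_crossing', 'walkway', 'stop_line', 'carpark_area']

# (num_layers, 200, 200) binary masks rotated into the ego heading
masks = nusc_map.get_map_mask(patch_box, ego_yaw_deg, layers, (200, 200))

# The divider class is the union of the two divider layers
divider = nusc_map.get_map_mask(patch_box, ego_yaw_deg,
                                ['road_divider', 'lane_divider'], (200, 200))
gt_masks_bev = np.concatenate([masks, divider.any(axis=0, keepdims=True)], axis=0)  # (6, 200, 200)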

5. Tools and Code Examples

5.1 Data Conversion Toolkit

5.1.1 Converting Custom Data to the nuScenes Format

#!/usr/bin/env python3
# tools/data_converter/custom_to_nuscenes.py

import os
import pickle
import numpy as np
from tqdm import tqdm
from pyquaternion import Quaternion
from pathlib import Path
import json

class CustomDataConverter:
    """将实车采集数据转换为nuScenes格式"""
    
    def __init__(
        self,
        custom_data_root: str,
        output_root: str,
        camera_names: list = None,
    ):
        self.custom_root = Path(custom_data_root)
        self.output_root = Path(output_root)
        
        # Default camera configuration
        self.camera_names = camera_names or [
            'CAM_FRONT',
            'CAM_FRONT_LEFT', 
            'CAM_FRONT_RIGHT',
            'CAM_BACK'
        ]
        
        # Load calibration data
        self.calib = self.load_calibration()
        
    def load_calibration(self):
        """加载相机和LiDAR标定参数"""
        calib_file = self.custom_root / 'calibration' / 'sensors.json'
        with open(calib_file, 'r') as f:
            calib = json.load(f)
        return calib
    
    def convert_sample(self, sample_id: str):
        """转换单个样本"""
        
        # 1. LiDAR data path
        lidar_path = self.custom_root / 'lidar' / f'{sample_id}.pcd'
        
        # 2. Load the camera images
        cam_data = {}
        for cam_name in self.camera_names:
            img_path = self.custom_root / 'camera' / cam_name / f'{sample_id}.jpg'
            
            # Calibration parameters of this camera
            cam_calib = self.calib['cameras'][cam_name]
            
            # Build the transform matrices
            cam2ego = self._build_transform_matrix(
                cam_calib['translation'],
                cam_calib['rotation']
            )
            
            # lidar2ego maps LiDAR coordinates into the ego frame; its inverse maps ego -> LiDAR
            lidar2ego = self._build_transform_matrix(
                self.calib['lidar']['translation'],
                self.calib['lidar']['rotation']
            )
            ego2lidar = np.linalg.inv(lidar2ego)
            
            cam2lidar = ego2lidar @ cam2ego
            
            cam_data[cam_name] = {
                # paths are stored relative to the raw-data root; copy or symlink the
                # data under the output root so these relative paths stay valid
                'data_path': str(img_path.relative_to(self.custom_root)),
                'cam_intrinsic': np.array(cam_calib['intrinsic']),
                'sensor2ego_translation': cam_calib['translation'],
                'sensor2ego_rotation': cam_calib['rotation'],
                'sensor2lidar_rotation': cam2lidar[:3, :3],
                'sensor2lidar_translation': cam2lidar[:3, 3],
                'timestamp': self._get_timestamp(sample_id)
            }
        
        # 3. Load annotations (if any)
        anno_file = self.custom_root / 'annotations' / f'{sample_id}.json'
        if anno_file.exists():
            with open(anno_file, 'r') as f:
                annos = json.load(f)
            
            gt_boxes, gt_names, gt_velocity = self._parse_annotations(annos)
        else:
            gt_boxes = np.zeros((0, 7))
            gt_names = []
            gt_velocity = np.zeros((0, 2))
        
        # 4. Build the info dict
        info = {
            'token': sample_id,
            'timestamp': self._get_timestamp(sample_id),
            'lidar_path': str(lidar_path.relative_to(self.custom_root)),
            'sweeps': [],  # add multi-frame sweeps here if available
            'cams': cam_data,
            
            # LiDAR pose
            'lidar2ego_translation': self.calib['lidar']['translation'],
            'lidar2ego_rotation': self.calib['lidar']['rotation'],
            'ego2global_translation': [0, 0, 0],  # fill in from GPS/INS if available
            'ego2global_rotation': [1, 0, 0, 0],
            
            # Annotations
            'gt_boxes': gt_boxes,
            'gt_names': gt_names,
            'gt_velocity': gt_velocity,
            'num_lidar_pts': np.array([0] * len(gt_boxes)),  # should be counted from the point cloud
            'valid_flag': np.array([True] * len(gt_boxes)),
        }
        
        return info
    
    def _build_transform_matrix(self, translation, rotation):
        """构建4x4变换矩阵"""
        T = np.eye(4)
        T[:3, :3] = Quaternion(rotation).rotation_matrix
        T[:3, 3] = translation
        return T
    
    def _parse_annotations(self, annos):
        """解析标注数据"""
        gt_boxes = []
        gt_names = []
        gt_velocity = []
        
        for anno in annos['annotations']:
            # nuScenes-style box: [x, y, z, w, l, h, yaw]
            box = [
                anno['translation'][0],
                anno['translation'][1],
                anno['translation'][2],
                anno['size'][0],  # width
                anno['size'][1],  # length
                anno['size'][2],  # height
                self._quaternion_to_yaw(anno['rotation'])
            ]
            gt_boxes.append(box)
            gt_names.append(anno['category_name'])
            gt_velocity.append(anno.get('velocity', [0, 0]))
        
        return (
            np.array(gt_boxes),
            np.array(gt_names),
            np.array(gt_velocity)
        )
    
    def _quaternion_to_yaw(self, q):
        """四元数转偏航角"""
        return Quaternion(q).yaw_pitch_roll[0]
    
    def _get_timestamp(self, sample_id):
        """Get the timestamp in microseconds"""
        # Extracted from the file name (or from metadata if available).
        # Assumed format: sample_id = "2024_01_15_10_30_25_123456"
        # (year_month_day_hour_minute_second_microseconds) -> Unix timestamp in microseconds
        from datetime import datetime
        parts = sample_id.split('_')
        dt = datetime(*map(int, parts[:6]))
        return int(dt.timestamp() * 1e6) + int(parts[6])
    
    def convert_all(self, split='train'):
        """转换所有数据"""
        # 获取样本列表
        sample_list_file = self.custom_root / f'{split}_samples.txt'
        with open(sample_list_file, 'r') as f:
            sample_ids = [line.strip() for line in f]
        
        infos = []
        for sample_id in tqdm(sample_ids, desc=f"Converting {split} set"):
            info = self.convert_sample(sample_id)
            infos.append(info)
        
        # Save as pkl
        output_file = self.output_root / f'custom_infos_{split}.pkl'
        data = {
            'infos': infos,
            'metadata': {
                'version': 'custom-v1.0',
                'num_cameras': len(self.camera_names),
                'camera_names': self.camera_names
            }
        }
        
        with open(output_file, 'wb') as f:
            pickle.dump(data, f)
        
        print(f"✅ 已转换 {len(infos)} 个样本到 {output_file}")
        return output_file


# ========== Usage example ==========
if __name__ == '__main__':
    converter = CustomDataConverter(
        custom_data_root='/data/custom_vehicle_data',
        output_root='/data/custom_nuscenes_format',
        camera_names=['CAM_FRONT', 'CAM_LEFT', 'CAM_RIGHT', 'CAM_BACK']
    )
    
    # Convert the training and validation splits
    converter.convert_all(split='train')
    converter.convert_all(split='val')

5.1.2 Calibration Verification Tool

#!/usr/bin/env python3
# tools/verify_calibration.py

import numpy as np
import cv2
import open3d as o3d
from pathlib import Path

def verify_camera_lidar_calibration(
    lidar_file: str,
    image_file: str,
    calib_file: str,
    output_file: str = None
):
    """
    验证相机-LiDAR外参标定精度
    将LiDAR点投影到图像上目视检查对齐情况
    """
    
    # 1. 加载点云
    pcd = o3d.io.read_point_cloud(lidar_file)
    points = np.asarray(pcd.points)  # (N, 3)
    
    # 2. 加载图像
    img = cv2.imread(image_file)
    h, w = img.shape[:2]
    
    # 3. 加载标定参数
    import json
    with open(calib_file, 'r') as f:
        calib = json.load(f)
    
    intrinsic = np.array(calib['intrinsic'])  # (3, 3)
    lidar2cam = np.array(calib['lidar2cam'])  # (4, 4)
    
    # 4. 投影到相机坐标系
    points_homo = np.hstack([points, np.ones((len(points), 1))])  # (N, 4)
    points_cam = points_homo @ lidar2cam.T  # (N, 4)
    points_cam = points_cam[:, :3]  # (N, 3)
    
    # 5. 过滤相机后方的点
    mask = points_cam[:, 2] > 0
    points_cam = points_cam[mask]
    
    # 6. 投影到图像平面
    points_img = points_cam @ intrinsic.T  # (N, 3)
    points_img = points_img[:, :2] / points_img[:, 2:]  # (N, 2)
    
    # 7. 过滤图像外的点
    mask = (
        (points_img[:, 0] >= 0) & (points_img[:, 0] < w) &
        (points_img[:, 1] >= 0) & (points_img[:, 1] < h)
    )
    points_img = points_img[mask]
    depths = points_cam[mask, 2]
    
    # 8. Visualize
    img_with_points = img.copy()
    
    # Color points by depth
    depths_norm = (depths - depths.min()) / (depths.max() - depths.min())
    colors = cv2.applyColorMap(
        (depths_norm * 255).astype(np.uint8).reshape(-1, 1),
        cv2.COLORMAP_JET
    )
    
    for i, (u, v) in enumerate(points_img.astype(int)):
        color = tuple(map(int, colors[i][0]))
        cv2.circle(img_with_points, (u, v), 2, color, -1)
    
    # 9. Save the result
    if output_file:
        cv2.imwrite(output_file, img_with_points)
        print(f"✅ Verification result saved to: {output_file}")
    
    # 10. Print projection statistics
    print(f"\nCalibration verification statistics:")
    print(f"  Projected points: {len(points_img)}")
    print(f"  Depth range: {depths.min():.2f}m - {depths.max():.2f}m")
    print(f"  Image size: {w}x{h}")
    
    return img_with_points


# ========== Batch verification ==========
def batch_verify_calibration(data_root: str, num_samples: int = 10):
    """Verify the calibration on multiple samples"""
    
    data_path = Path(data_root)
    calib_file = data_path / 'calibration' / 'CAM_FRONT.json'
    
    # Randomly pick samples
    import random
    sample_ids = list((data_path / 'lidar').glob('*.pcd'))
    sample_ids = random.sample(sample_ids, min(num_samples, len(sample_ids)))
    
    output_dir = data_path / 'calibration_verification'
    output_dir.mkdir(exist_ok=True)
    
    for lidar_file in sample_ids:
        sample_id = lidar_file.stem
        image_file = data_path / 'camera' / 'CAM_FRONT' / f'{sample_id}.jpg'
        output_file = output_dir / f'{sample_id}_verification.jpg'
        
        verify_camera_lidar_calibration(
            str(lidar_file),
            str(image_file),
            str(calib_file),
            str(output_file)
        )
    
    print(f"\n✅ 已验证 {num_samples} 个样本")
    print(f"   结果保存在: {output_dir}")


if __name__ == '__main__':
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument('--data_root', required=True)
    parser.add_argument('--num_samples', type=int, default=10)
    args = parser.parse_args()
    
    batch_verify_calibration(args.data_root, args.num_samples)

5.1.3 BEV Segmentation Annotation Tool

#!/usr/bin/env python3
# tools/annotate_bev_segmentation.py

import numpy as np
import cv2
from pathlib import Path
import json

class BEVSegmentationAnnotator:
    """BEV语义分割标注工具"""
    
    def __init__(
        self,
        xbound=(-50, 50, 0.5),
        ybound=(-50, 50, 0.5),
        classes=None
    ):
        self.xbound = xbound
        self.ybound = ybound
        self.classes = classes or [
            'drivable_area',
            'ped_crossing',
            'walkway',
            'stop_line',
            'carpark_area',
            'divider'
        ]
        
        # Compute the BEV grid size
        self.bev_h = int((ybound[1] - ybound[0]) / ybound[2])  # 200
        self.bev_w = int((xbound[1] - xbound[0]) / xbound[2])  # 200
        
    def render_bev_from_lidar(self, lidar_file: str, output_file: str):
        """从LiDAR点云渲染BEV俯视图"""
        import open3d as o3d
        
        # 加载点云
        pcd = o3d.io.read_point_cloud(lidar_file)
        points = np.asarray(pcd.points)
        
        # 创建BEV图像
        bev_img = np.zeros((self.bev_h, self.bev_w, 3), dtype=np.uint8)
        
        # 将点投影到BEV网格
        for point in points:
            x, y, z = point
            
            # 过滤范围外的点
            if not (self.xbound[0] <= x < self.xbound[1] and
                    self.ybound[0] <= y < self.ybound[1]):
                continue
            
            # 转换为网格坐标
            i = int((y - self.ybound[0]) / self.ybound[2])
            j = int((x - self.xbound[0]) / self.xbound[2])
            
            # 根据高度着色
            height_norm = np.clip((z + 2) / 4, 0, 1)  # -2m到2m
            color = int(height_norm * 255)
            bev_img[i, j] = [color, color, color]
        
        # 翻转Y轴 (图像坐标系)
        bev_img = np.flip(bev_img, axis=0).copy()
        
        # 保存
        cv2.imwrite(output_file, bev_img)
        print(f"✅ BEV图像保存至: {output_file}")
        
        return bev_img
    
    def polygon_to_mask(
        self,
        polygon_json: str,
        class_name: str,
        output_file: str = None
    ):
        """将labelme标注的多边形转换为mask"""
        
        # 加载labelme的JSON
        with open(polygon_json, 'r') as f:
            data = json.load(f)
        
        # 创建mask
        mask = np.zeros((self.bev_h, self.bev_w), dtype=np.uint8)
        
        # 绘制多边形
        for shape in data['shapes']:
            if shape['label'] == class_name:
                points = np.array(shape['points'], dtype=np.int32)
                cv2.fillPoly(mask, [points], 1)
        
        if output_file:
            np.save(output_file, mask)
            print(f"✅ Mask保存至: {output_file}")
        
        return mask
    
    def merge_all_classes(self, polygon_dir: str, output_file: str):
        """合并所有类别的mask"""
        
        polygon_dir = Path(polygon_dir)
        masks = np.zeros((len(self.classes), self.bev_h, self.bev_w), dtype=np.uint8)
        
        for i, class_name in enumerate(self.classes):
            json_file = polygon_dir / f'{class_name}.json'
            if json_file.exists():
                mask = self.polygon_to_mask(str(json_file), class_name)
                masks[i] = mask
            else:
                print(f"⚠️  未找到类别 {class_name} 的标注")
        
        # 保存
        np.save(output_file, masks)
        print(f"✅ 完整BEV mask保存至: {output_file}")
        print(f"   形状: {masks.shape}")
        
        return masks


# ========== Usage workflow ==========
if __name__ == '__main__':
    annotator = BEVSegmentationAnnotator()
    
    # Step 1: render the BEV image
    bev_img = annotator.render_bev_from_lidar(
        'data/lidar/sample_001.pcd',
        'data/bev_images/sample_001.png'
    )
    
    # Step 2: annotate polygons with labelme
    # labelme data/bev_images/sample_001.png
    # Draw polygons for each class and save them as JSON
    
    # Step 3: convert to masks
    masks = annotator.merge_all_classes(
        polygon_dir='data/bev_annotations/sample_001',
        output_file='data/bev_masks/sample_001.npy'
    )
    
    print("✅ BEV分割标注完成!")

6. Complete Real-Vehicle Deployment Workflow

6.1 Data Collection

# 1. Camera data recording
roslaunch vehicle_perception camera_record.launch \
    cameras:="front,left,right,back" \
    fps:=10 \
    output:=/data/raw/camera

# 2. LiDAR data recording
roslaunch vehicle_perception lidar_record.launch \
    topic:=/velodyne_points \
    output:=/data/raw/lidar

# 3. Synchronization check
python tools/check_sync.py \
    --camera_dir /data/raw/camera \
    --lidar_dir /data/raw/lidar \
    --max_time_diff 0.05  # 50ms
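
The sync check matches each LiDAR frame to the nearest camera frame by timestamp and flags pairs that differ by more than the threshold. A minimal sketch of that logic (check_sync.py itself is not included in this document, and the filename-to-timestamp parsing below is an assumption):

from pathlib import Path

def check_sync(camera_dir, lidar_dir, max_time_diff=0.05):
    """Flag LiDAR frames without a camera frame within max_time_diff seconds."""
    # Assumes file stems encode the timestamp in microseconds, e.g. 1532402927647951.jpg
    cam_ts = sorted(int(p.stem) / 1e6 for p in Path(camera_dir).glob('*.jpg'))
    bad = []
    for p in Path(lidar_dir).glob('*.pcd'):
        t = int(p.stem) / 1e6
        nearest = min(cam_ts, key=lambda c: abs(c - t))
        if abs(nearest - t) > max_time_diff:
            bad.append((p.name, abs(nearest - t)))
    print(f"{len(bad)} unsynchronized LiDAR frames (> {max_time_diff * 1000:.0f} ms)")
    return bad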

6.2 Calibration

# Camera intrinsic calibration
python tools/calibrate_camera.py \
    --images /data/calibration/checkerboard \
    --pattern 9x6 \
    --square_size 0.05  # 5cm

# LiDAR-camera extrinsic calibration
python tools/calibrate_lidar_camera.py \
    --lidar /data/calibration/lidar.pcd \
    --image /data/calibration/cam_front.jpg \
    --target_type checkerboard \
    --output calib/lidar2cam.json

# Verify the calibration
python tools/verify_calibration.py \
    --data_root /data/raw \
    --num_samples 10
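
A minimal sketch of what the intrinsic-calibration step does under the hood, using OpenCV's standard checkerboard routine (calibrate_camera.py is referenced above but not shown here, so this is only an illustration; the 9x6 pattern and 5 cm square size match the command above):

import glob
import cv2
import numpy as np

pattern = (9, 6)          # inner corners per row / column
square_size = 0.05        # meters

# 3D corner positions on the board plane (z = 0)
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square_size

obj_points, img_points = [], []
for path in glob.glob('/data/calibration/checkerboard/*.jpg'):
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# rms is the mean reprojection error in pixels (target: < 1 px)
rms, K, dist, _, _ = cv2.calibrateCamera(obj_points, img_points, gray.shape[::-1], None, None)
print('reprojection error:', rms)
print('intrinsic matrix K:\n', K)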

6.3 Data Conversion

# Convert to the nuScenes format
python tools/data_converter/custom_to_nuscenes.py \
    --custom_root /data/raw \
    --output /data/nuscenes_format \
    --camera_names CAM_FRONT CAM_LEFT CAM_RIGHT CAM_BACK

# Verify the conversion result
python tools/check_data_integrity.py \
    --data_root /data/nuscenes_format \
    --check_images \
    --check_lidar \
    --check_annotations
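
A minimal sketch of the kind of integrity check intended here: load the generated pkl and confirm that every referenced image and point-cloud file actually exists (check_data_integrity.py itself is not included in this document; the pkl layout follows the converter above):

import pickle
from pathlib import Path

data_root = Path('/data/nuscenes_format')
with open(data_root / 'custom_infos_train.pkl', 'rb') as f:
    infos = pickle.load(f)['infos']

missing = []
for info in infos:
    paths = [info['lidar_path']] + [c['data_path'] for c in info['cams'].values()]
    missing += [p for p in paths if not (data_root / p).exists()]

print(f"{len(infos)} samples checked, {len(missing)} missing files")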

6.4 Annotation

# Start the CVAT annotation service
cd cvat
docker-compose up -d

# Import data
python tools/import_to_cvat.py \
    --data_root /data/nuscenes_format \
    --task_name "vehicle_annotation_batch_001"

# Export once annotation is finished
python tools/export_from_cvat.py \
    --task_id 123 \
    --output /data/annotations/batch_001.json

# Convert to the training format
python tools/cvat_to_nuscenes_anno.py \
    --cvat_json /data/annotations/batch_001.json \
    --output /data/nuscenes_format/annotations

6.5 Adjusting the Training Configuration

# configs/custom_vehicle/custom_4cam.yaml

# Inherit the base config
_base_:
  - ../nuscenes/det/transfusion/secfpn/camera+lidar/swint_v0p075/multitask_enhanced_phase1_HIGHRES.yaml

# Adjust for the new camera count
model:
  encoders:
    camera:
      backbone:
        # Use pretrained weights
        init_cfg:
          checkpoint: runs/enhanced_from_epoch19/epoch_10.pth
      vtransform:
        # 4-camera configuration
        num_cameras: 4

# Data configuration
data:
  train:
    type: NuScenesDataset
    dataset_root: /data/nuscenes_format
    ann_file: /data/nuscenes_format/custom_infos_train.pkl
    pipeline: ${train_pipeline}
  
  val:
    type: NuScenesDataset
    dataset_root: /data/nuscenes_format
    ann_file: /data/nuscenes_format/custom_infos_val.pkl
    pipeline: ${val_pipeline}

# Training parameters (transfer learning)
train:
  max_epochs: 20  # fine-tune for 20 epochs
  lr:
    warmup_iters: 500
    max_lr: 0.0001  # reduced learning rate

6.6 Fine-Tuning

# Launch training
torchpack dist-run -np 6 python tools/train.py \
    configs/custom_vehicle/custom_4cam.yaml \
    --model.encoders.camera.backbone.init_cfg.checkpoint=runs/enhanced_from_epoch19/epoch_10.pth \
    --run-dir runs/custom_vehicle_finetune

# Monitor training
tensorboard --logdir runs/custom_vehicle_finetune

7. Common Issues and Solutions

Q1: How do I handle a different number of cameras?

# Option 1: change the model's camera input configuration
# mmdet3d/models/encoders/camera_encoder.py

class CameraEncoder:
    def __init__(self, num_cameras=6, **kwargs):  # change 6 to 4
        self.num_cameras = num_cameras
        # the rest of the encoder adapts to the camera count automatically

# Option 2: pad with virtual cameras
# If the model expects 6 cameras but only 4 are available,
# duplicate the front camera or pad with black images (see the sketch below)
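
A minimal sketch of option 2, assuming images are batched as a (B, N, C, H, W) tensor with matching per-camera transform matrices (this layout is an assumption for illustration and must match your actual data pipeline):

import torch

def pad_cameras(imgs, cam2lidar, intrinsics, target_n=6):
    """Pad a 4-camera batch to the 6 cameras the pretrained model expects
    by duplicating the front camera (index 0)."""
    b, n = imgs.shape[:2]
    pad = target_n - n
    if pad <= 0:
        return imgs, cam2lidar, intrinsics
    imgs = torch.cat([imgs, imgs[:, :1].repeat(1, pad, 1, 1, 1)], dim=1)
    cam2lidar = torch.cat([cam2lidar, cam2lidar[:, :1].repeat(1, pad, 1, 1)], dim=1)
    intrinsics = torch.cat([intrinsics, intrinsics[:, :1].repeat(1, pad, 1, 1)], dim=1)
    return imgs, cam2lidar, intrinsics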

Q2: How do I handle a different image resolution?

# Adjust in the config
model:
  encoders:
    camera:
      vtransform:
        image_size: [384, 1024]  # change from [256, 704] to the actual input size
        feature_size: [48, 128]  # adjust accordingly
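
Note that whenever the input images are resized, the camera intrinsics fed to the view transform must be scaled by the same factors. In BEVFusion this is normally handled inside the image augmentation pipeline; the small helper below only makes the relationship explicit (the example intrinsic values are approximate nuScenes front-camera numbers, used purely for illustration):

import numpy as np

def scale_intrinsics(K, orig_hw, new_hw):
    """Scale fx, cx by the width ratio and fy, cy by the height ratio."""
    sy = new_hw[0] / orig_hw[0]
    sx = new_hw[1] / orig_hw[1]
    K = np.array(K, dtype=np.float64).copy()
    K[0, 0] *= sx  # fx
    K[0, 2] *= sx  # cx
    K[1, 1] *= sy  # fy
    K[1, 2] *= sy  # cy
    return K

# Example: a 900x1600 (H x W) image resized to 384x1024
K_scaled = scale_intrinsics([[1266.4, 0, 816.3], [0, 1266.4, 491.5], [0, 0, 1]],
                            orig_hw=(900, 1600), new_hw=(384, 1024))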

Q3: How do I handle FOV differences?

# Re-calibrate to obtain accurate intrinsics.
# If the FOV differs by more than ~30 degrees, retraining is recommended;
# for differences under ~30 degrees, fine-tuning is usually enough to adapt.
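
The horizontal FOV can be computed directly from the calibrated intrinsics and image width, which makes the comparison above concrete (a small helper for illustration):

import math

def horizontal_fov_deg(fx, image_width):
    """Horizontal field of view in degrees, from focal length (pixels) and image width."""
    return math.degrees(2 * math.atan(image_width / (2 * fx)))

# Example: fx ≈ 1266 px at 1600 px width -> ~64.5 degrees
print(horizontal_fov_deg(1266.4, 1600))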

Q4: What if there are no BEV segmentation labels?

# Option 1: train 3D detection only and disable the segmentation head
model:
  heads:
    object: ...  # keep
    map: null    # disable

# Option 2: keep the pretrained segmentation head and freeze it
model:
  heads:
    map:
      freeze: true  # freeze the segmentation-head weights
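
If the config does not expose such a freeze switch, the same effect can be obtained in the training script by disabling gradients on the map head. A minimal fragment, assuming the model exposes its heads as a module dict keyed 'map' as in the configs above (this structure is an assumption about the code):

# Freeze the BEV segmentation head so only the detection branch is fine-tuned
for param in model.heads['map'].parameters():
    param.requires_grad = False
model.heads['map'].eval()  # also fix BatchNorm statistics in the frozen head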

8. Performance Optimization Suggestions

8.1 Data Augmentation

# Adjust the augmentation strategy for the real-vehicle scenario
augment2d:
  resize: [[0.4, 0.6], [0.5, 0.5]]  # wider resize range
  rotate: [-10, 10]                  # larger rotation range
  brightness: 0.2                    # brightness augmentation
  contrast: 0.2                      # contrast augmentation

8.2 Hard Example Mining

# Tool: find the scenes where the model performs poorly
python tools/find_hard_samples.py \
    --checkpoint runs/custom_vehicle_finetune/epoch_10.pth \
    --data_root /data/nuscenes_format \
    --output hard_samples.txt

# Then collect and annotate more data for those scenes

8.3 Evaluation Metrics

# Evaluate on the real-vehicle data
python tools/test.py \
    configs/custom_vehicle/custom_4cam.yaml \
    runs/custom_vehicle_finetune/epoch_10.pth \
    --eval bbox segm

# Analyze the results
python tools/analyze_results.py \
    --results results.pkl \
    --by_scene \
    --by_distance \
    --by_weather

9. Summary

Key-Step Checklist

□ Data collection
  □ Camera data complete (4-6 views)
  □ LiDAR data complete
  □ Timestamps synchronized (<50ms)
  □ Sufficient scene coverage

□ Sensor calibration
  □ Camera intrinsic accuracy (reprojection error <1px)
  □ LiDAR-camera extrinsic accuracy (rotation <0.5°, translation <2cm)
  □ Calibration verification passed

□ Data conversion
  □ Converted to the nuScenes format
  □ Data integrity checked
  □ Visualization spot-check passed

□ Annotation
  □ 3D box annotation (IoU>0.7)
  □ BEV segmentation annotation (optional)
  □ Quality review passed
  □ Dataset size (>5000 frames)

□ Training
  □ Config files adjusted
  □ Transfer learning / fine-tuning
  □ Performance validated
  □ On-vehicle testing

Minimum Viable Deployment

Minimum dataset: 5000 frames
Minimum annotation: 3D detection only
Training time: 2-3 days (6x V100)
Expected performance: NDS 0.55-0.65

Full Deployment

Recommended dataset: 10000-15000 frames
Full annotation: 3D detection + BEV segmentation
Training time: 5-7 days
Expected performance: NDS 0.65-0.75

Document version: v1.0
Last updated: 2025-10-25
Applies to: BEVFusion + custom sensor configurations