# Hands-On Guide: Integrating MapTR into BEVFusion

**Based on code study of**: /workspace/MapTR

**Goal**: integrate MapTR's vectorized map prediction into BEVFusion

**Difficulty**: ⭐⭐⭐⭐ (medium-high)

---
## 🎯 Key Findings

### MapTR's code structure is clean

```
Core code: ~3,540 lines
├── MapTRHead: 35 KB (the core!)
├── Losses:    26 KB (Chamfer distance, etc.)
├── Decoder:    3 KB (a simple Transformer)
├── Assigner:   9 KB (Hungarian matching)
└── Encoder:   12 KB (optional; we reuse BEVFusion's BEV encoder)
```
### Key Parameter Configuration

```yaml
# Main MapTRHead parameters
num_vec: 20-50            # number of predicted vectors
num_pts_per_vec: 20       # points per vector
num_classes: 3            # classes (divider, boundary, crossing)
embed_dims: 256           # feature dimension
num_decoder_layers: 6     # number of decoder layers

# Total queries = num_vec × num_pts_per_vec
# e.g. 50 × 20 = 1000 queries
```
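
As a quick sanity check of the query bookkeeping above, the short sketch below (plain PyTorch, shapes and names are illustrative only) shows how the flat set of point-level queries is regrouped into vectors; this mirrors the reshape used by the head later in this guide.

```python
import torch

num_vec, num_pts_per_vec, embed_dims = 50, 20, 256
num_query = num_vec * num_pts_per_vec          # 50 x 20 = 1000 queries

# Decoder output: one embedding per point-level query
hs = torch.randn(1, num_query, embed_dims)     # (B, num_query, embed_dims)

# Regroup point queries into vectors: (B, num_vec, num_pts_per_vec, embed_dims)
hs_vec = hs.reshape(1, num_vec, num_pts_per_vec, embed_dims)

# Per-vector feature (mean over its points) feeds the classification branch
hs_vec_pooled = hs_vec.mean(dim=2)             # (B, num_vec, embed_dims)
print(num_query, hs_vec.shape, hs_vec_pooled.shape)
```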

---

## 🔧 Simplified Integration Plan (Recommended)

### Option comparison

| Option | Complexity | Time | Performance |
|--------|------------|------|-------------|
| Full MapTR port | ⭐⭐⭐⭐⭐ | 2 weeks | High |
| Simplified MapTRHead | ⭐⭐⭐ | 1 week | Medium-high |
| Minimal vector head | ⭐⭐ | 3 days | Medium |

**Recommendation**: the simplified MapTRHead option ⭐⭐⭐

---
## 💻 Simplified Implementation Code

### 1. Simplified MapTRHead
```python
# /workspace/bevfusion/mmdet3d/models/heads/vector_map/simple_maptr_head.py

import torch
import torch.nn as nn
import torch.nn.functional as F
from mmdet.models import HEADS


@HEADS.register_module()
class SimplifiedMapTRHead(nn.Module):
    """Simplified MapTRHead for BEVFusion.

    Simplifications:
    1. Consume the fused BEV feature directly; no multi-view encoder needed.
    2. Replace MapTR's custom transformer with the standard PyTorch decoder.
    3. Keep the core mechanism: query-based prediction + Chamfer loss.
    """

    def __init__(
        self,
        in_channels=256,
        num_vec=50,
        num_pts_per_vec=20,
        num_classes=3,
        embed_dims=256,
        num_decoder_layers=6,
        num_heads=8,
        pc_range=[-50, -50, -5, 50, 50, 3],
    ):
        super().__init__()

        self.num_vec = num_vec
        self.num_pts_per_vec = num_pts_per_vec
        self.num_classes = num_classes
        self.num_query = num_vec * num_pts_per_vec
        self.pc_range = pc_range

        # 1. Query embedding (one query per point)
        self.query_embed = nn.Embedding(self.num_query, embed_dims)

        # 2. BEV feature projection
        self.bev_proj = nn.Conv2d(in_channels, embed_dims, 1)

        # 3. Transformer decoder (standard PyTorch)
        decoder_layer = nn.TransformerDecoderLayer(
            d_model=embed_dims,
            nhead=num_heads,
            dim_feedforward=embed_dims * 4,
            dropout=0.1,
            batch_first=True,
        )
        self.decoder = nn.TransformerDecoder(
            decoder_layer,
            num_layers=num_decoder_layers,
        )

        # 4. Prediction heads
        # 4.1 Classification head (one class per vector)
        self.cls_head = nn.Linear(embed_dims, num_classes + 1)  # +1 for background

        # 4.2 Point-coordinate head
        self.pts_head = nn.Linear(embed_dims, 2)  # (x, y)

        # 5. Positional encoding
        self.pos_embed = self._make_position_encoding(embed_dims)

    def _make_position_encoding(self, dim):
        """2D sinusoidal positional encoding."""
        return PositionEmbeddingSine(dim // 2, normalize=True)

    def forward(self, bev_features, img_metas):
        """
        Args:
            bev_features: (B, 256, H, W) fused BEV feature
            img_metas: sample metadata

        Returns:
            cls_scores: (B, num_vec, num_classes+1)
            pts_preds: (B, num_vec, num_pts_per_vec, 2)
        """
        B, C, H, W = bev_features.shape

        # 1. BEV feature processing
        bev_feat = self.bev_proj(bev_features)  # (B, embed_dims, H, W)

        # Flatten to (B, H*W, embed_dims)
        bev_memory = bev_feat.flatten(2).permute(0, 2, 1)

        # Positional encoding, flattened the same way
        pos_embed = self.pos_embed(bev_feat)
        pos_embed = pos_embed.flatten(2).permute(0, 2, 1)

        # 2. Queries: (B, num_query, embed_dims)
        query = self.query_embed.weight.unsqueeze(0).repeat(B, 1, 1)

        # 3. Transformer decoder. nn.TransformerDecoder has no `pos` argument,
        #    so the positional encoding is added to the memory directly.
        hs = self.decoder(
            tgt=query,
            memory=bev_memory + pos_embed,
        )  # (B, num_query, embed_dims)

        # 4. Predictions
        # 4.1 Classification (per vector):
        #     regroup the num_query point queries into num_vec vectors
        hs_vec = hs.reshape(B, self.num_vec, self.num_pts_per_vec, -1)
        hs_vec_pooled = hs_vec.mean(dim=2)         # (B, num_vec, embed_dims)
        cls_scores = self.cls_head(hs_vec_pooled)  # (B, num_vec, num_classes+1)

        # 4.2 Point coordinates (per point), normalized to [0, 1]
        pts_preds = self.pts_head(hs)  # (B, num_query, 2)
        pts_preds = pts_preds.sigmoid()
        pts_preds = pts_preds.reshape(B, self.num_vec, self.num_pts_per_vec, 2)

        return cls_scores, pts_preds

    def loss(self, cls_scores, pts_preds, gt_vectors, gt_labels):
        """Compute losses.

        Args:
            cls_scores: (B, num_vec, num_classes+1)
            pts_preds: (B, num_vec, num_pts_per_vec, 2), normalized coordinates
            gt_vectors: list (per sample) of GT vector tensors, (num_gt, num_pts, 2)
            gt_labels: list (per sample) of GT label tensors, (num_gt,)
        """
        B = cls_scores.shape[0]
        device = cls_scores.device

        losses = {}

        # Accumulate per-sample losses
        total_loss_cls = 0
        total_loss_pts = 0
        num_matched = 0

        for b in range(B):
            if len(gt_vectors[b]) == 0:
                continue

            # 1. Hungarian matching between predictions and GT
            matched_pred_idx, matched_gt_idx = self.hungarian_matching(
                cls_scores[b],
                pts_preds[b],
                gt_vectors[b],
                gt_labels[b],
            )

            # 2. Classification loss: unmatched predictions are labelled background
            target_labels = torch.full(
                (self.num_vec,),
                self.num_classes,  # background index
                dtype=torch.long,
                device=device,
            )
            target_labels[matched_pred_idx] = gt_labels[b][matched_gt_idx]

            loss_cls = F.cross_entropy(
                cls_scores[b],
                target_labels,
                reduction='mean',
            )
            total_loss_cls += loss_cls

            # 3. Point-coordinate loss (Chamfer distance) on matched pairs
            if len(matched_pred_idx) > 0:
                pred_pts = pts_preds[b, matched_pred_idx]  # (N, num_pts, 2)
                gt_pts = gt_vectors[b][matched_gt_idx]     # (N, num_pts, 2)

                loss_pts = self.chamfer_distance(pred_pts, gt_pts)
                total_loss_pts += loss_pts
                num_matched += len(matched_pred_idx)

        # Average over the batch / matched vectors
        losses['loss_cls'] = total_loss_cls / B
        losses['loss_pts'] = total_loss_pts / max(num_matched, 1)

        return losses

    def chamfer_distance(self, pred_pts, gt_pts):
        """Chamfer distance loss.

        pred_pts: (N, num_pts_pred, 2)
        gt_pts: (N, num_pts_gt, 2)
        """
        # Pairwise distance matrix
        dist = torch.cdist(pred_pts, gt_pts)  # (N, num_pts_pred, num_pts_gt)

        # Bidirectional nearest-point distances
        loss_forward = dist.min(dim=-1)[0].mean()   # prediction -> GT
        loss_backward = dist.min(dim=-2)[0].mean()  # GT -> prediction

        cd_loss = loss_forward + loss_backward
        return cd_loss

    def hungarian_matching(self, cls_score, pts_pred, gt_vecs, gt_labels):
        """Hungarian matching.

        Simplified version: the cost only considers point-coordinate distance.
        """
        from scipy.optimize import linear_sum_assignment

        num_gt = len(gt_vecs)
        if num_gt == 0:
            return [], []

        # Cost matrix: L1 distance between predicted and GT point sets
        cost = torch.zeros(self.num_vec, num_gt).to(pts_pred.device)

        for i in range(self.num_vec):
            for j in range(num_gt):
                dist = torch.abs(pts_pred[i] - gt_vecs[j]).sum()
                cost[i, j] = dist

        # Hungarian assignment (detach first: the cost matrix carries gradients)
        pred_idx, gt_idx = linear_sum_assignment(cost.detach().cpu().numpy())

        return pred_idx, gt_idx

    def get_vectors(self, cls_scores, pts_preds, img_metas, score_thr=0.3):
        """Post-processing: extract vectors.

        Returns:
            list of dicts, one vector list per sample
        """
        B = cls_scores.shape[0]
        results = []

        for b in range(B):
            # Classification
            cls_pred = cls_scores[b].argmax(dim=-1)  # (num_vec,)
            cls_conf = cls_scores[b].softmax(dim=-1).max(dim=-1)[0]

            # Filter out background and low-confidence vectors
            valid = (cls_pred < self.num_classes) & (cls_conf > score_thr)
            valid_idx = valid.nonzero(as_tuple=True)[0]

            # De-normalize back to metric coordinates
            vectors = []
            for idx in valid_idx:
                pts_norm = pts_preds[b, idx].cpu().numpy()  # (num_pts, 2)
                pts_real = self.denormalize_pts(pts_norm)

                cls_id = cls_pred[idx].item()
                vectors.append({
                    'class': cls_id,
                    'class_name': ['divider', 'boundary', 'ped_crossing'][cls_id],
                    'points': pts_real.tolist(),
                    'score': cls_conf[idx].item(),
                })

            results.append({'vectors': vectors})

        return results

    def denormalize_pts(self, pts_norm):
        """Normalized coordinates [0, 1] -> real coordinates (meters)."""
        pts = pts_norm.copy()
        pts[:, 0] = pts[:, 0] * (self.pc_range[3] - self.pc_range[0]) + self.pc_range[0]
        pts[:, 1] = pts[:, 1] * (self.pc_range[4] - self.pc_range[1]) + self.pc_range[1]
        return pts


class PositionEmbeddingSine(nn.Module):
    """2D sinusoidal positional encoding."""

    def __init__(self, num_pos_feats=128, temperature=10000, normalize=True):
        super().__init__()
        self.num_pos_feats = num_pos_feats
        self.temperature = temperature
        self.normalize = normalize

    def forward(self, x):
        """
        Args:
            x: (B, C, H, W)
        Returns:
            pos: (B, 2*num_pos_feats, H, W)
        """
        B, C, H, W = x.shape
        device = x.device

        # Coordinate grids
        y_embed = torch.arange(H, dtype=torch.float32, device=device)
        x_embed = torch.arange(W, dtype=torch.float32, device=device)

        if self.normalize:
            y_embed = y_embed / H
            x_embed = x_embed / W

        # Sinusoidal frequencies
        dim_t = torch.arange(
            self.num_pos_feats, dtype=torch.float32, device=device
        )
        dim_t = self.temperature ** (2 * (dim_t // 2) / self.num_pos_feats)

        pos_x = x_embed[:, None] / dim_t  # (W, num_pos_feats)
        pos_y = y_embed[:, None] / dim_t  # (H, num_pos_feats)

        pos_x = torch.stack([
            pos_x[:, 0::2].sin(),
            pos_x[:, 1::2].cos(),
        ], dim=2).flatten(1)  # (W, num_pos_feats)

        pos_y = torch.stack([
            pos_y[:, 0::2].sin(),
            pos_y[:, 1::2].cos(),
        ], dim=2).flatten(1)  # (H, num_pos_feats)

        # Broadcast each to (B, num_pos_feats, H, W), then concatenate on channels
        pos = torch.cat([
            pos_y.transpose(0, 1)[None, :, :, None].repeat(B, 1, 1, W),
            pos_x.transpose(0, 1)[None, :, None, :].repeat(B, 1, H, 1),
        ], dim=1)  # (B, 2*num_pos_feats, H, W)

        return pos
```
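
Before wiring the head into BEVFusion, it is worth running it standalone on a dummy BEV tensor. The snippet below is a minimal smoke test, assuming the BEVFusion environment (mmdet installed) and that `SimplifiedMapTRHead` is importable from the file above; the batch size and BEV resolution are arbitrary and kept small so it runs quickly on CPU.

```python
import torch

from mmdet3d.models.heads.vector_map.simple_maptr_head import SimplifiedMapTRHead

head = SimplifiedMapTRHead(in_channels=256, num_vec=50, num_pts_per_vec=20)
head.eval()

bev = torch.randn(1, 256, 32, 32)  # dummy fused BEV feature (B, C, H, W)
with torch.no_grad():
    cls_scores, pts_preds = head(bev, img_metas=None)

print(cls_scores.shape)  # torch.Size([1, 50, 4])   -> num_classes + background
print(pts_preds.shape)   # torch.Size([1, 50, 20, 2])
```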

---

### 2. Data-Loading Pipeline
```python
# /workspace/bevfusion/mmdet3d/datasets/pipelines/loading_vector.py

import pickle

import numpy as np
from mmdet.datasets.pipelines import PIPELINES


@PIPELINES.register_module()
class LoadVectorMapAnnotation:
    """Load vectorized map annotations."""

    def __init__(
        self,
        vector_map_file='data/nuscenes/vector_maps.pkl',
        num_vec_classes=3,
        max_num_vecs=50,
        num_pts_per_vec=20,
        pc_range=[-50, -50, -5, 50, 50, 3],
    ):
        self.max_num_vecs = max_num_vecs
        self.num_pts_per_vec = num_pts_per_vec
        self.pc_range = pc_range

        # Load the pre-extracted vector map data
        with open(vector_map_file, 'rb') as f:
            self.vector_maps = pickle.load(f)

    def __call__(self, results):
        """Load the vector map for the current sample.

        Output format:
            gt_vectors: (max_num_vecs, num_pts_per_vec, 2)
            gt_vec_labels: (max_num_vecs,)
            gt_vec_masks: (max_num_vecs,) bool
        """
        sample_token = results['sample_idx']

        # Raw vectors for this sample
        vectors_raw = self.vector_maps.get(sample_token, {})

        # Initialize padded targets
        gt_vectors = np.zeros((self.max_num_vecs, self.num_pts_per_vec, 2))
        gt_labels = np.full((self.max_num_vecs,), -1, dtype=np.int64)
        gt_masks = np.zeros((self.max_num_vecs,), dtype=bool)

        vec_idx = 0

        # Class mapping: divider -> 0, boundary -> 1, ped_crossing -> 2
        for class_id, class_name in enumerate(['divider', 'boundary', 'ped_crossing']):
            for vec in vectors_raw.get(class_name, []):
                if vec_idx >= self.max_num_vecs:
                    break
                pts = self.resample_polyline(vec['points'], self.num_pts_per_vec)
                pts_norm = self.normalize_pts(pts)
                gt_vectors[vec_idx] = pts_norm
                gt_labels[vec_idx] = class_id
                gt_masks[vec_idx] = True
                vec_idx += 1

        results['gt_vectors'] = gt_vectors
        results['gt_vec_labels'] = gt_labels
        results['gt_vec_masks'] = gt_masks

        return results

    def resample_polyline(self, points, num_pts):
        """Resample a polyline to a fixed number of points."""
        from scipy.interpolate import interp1d

        points = np.array(points)
        if len(points) < 2:
            return np.zeros((num_pts, 2))

        # Cumulative arc length
        dists = np.sqrt(np.sum(np.diff(points, axis=0) ** 2, axis=1))
        cum_dists = np.concatenate([[0], np.cumsum(dists)])

        # Uniformly spaced sample positions along the polyline
        sample_dists = np.linspace(0, cum_dists[-1], num_pts)

        # Linear interpolation of x and y
        interp_x = interp1d(cum_dists, points[:, 0], kind='linear')
        interp_y = interp1d(cum_dists, points[:, 1], kind='linear')

        resampled = np.stack([
            interp_x(sample_dists),
            interp_y(sample_dists),
        ], axis=-1)

        return resampled

    def normalize_pts(self, pts):
        """Real coordinates (meters) -> [0, 1]."""
        pts_norm = pts.copy()
        pts_norm[:, 0] = (pts[:, 0] - self.pc_range[0]) / (self.pc_range[3] - self.pc_range[0])
        pts_norm[:, 1] = (pts[:, 1] - self.pc_range[1]) / (self.pc_range[4] - self.pc_range[1])
        return pts_norm
```
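
For reference, this is roughly where the transform would sit in the training pipeline. The snippet is a sketch in mmdet-style Python config syntax; the surrounding transforms and the exact keys passed to the collect/formatting step depend on the BEVFusion config you start from, and only `LoadVectorMapAnnotation` and its arguments come from the code above.

```python
# Sketch: training pipeline with the new transform added (placement is illustrative).
train_pipeline = [
    # ... existing BEVFusion loading / augmentation transforms ...
    dict(
        type='LoadVectorMapAnnotation',
        vector_map_file='data/nuscenes/vector_maps.pkl',
        max_num_vecs=50,
        num_pts_per_vec=20,
        pc_range=[-50, -50, -5, 50, 50, 3],
    ),
    # ... the formatting / collect step must also pass through
    #     'gt_vectors', 'gt_vec_labels', 'gt_vec_masks' ...
]
```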

---

### 3. Modifying the BEVFusion Model
```python
# /workspace/bevfusion/mmdet3d/models/fusion_models/bevfusion.py

def forward_single(self, ..., gt_vectors=None, gt_vec_labels=None, ...):
    """
    New vector-map arguments added to the signature.
    """
    # ... code above is unchanged (encoder, fuser, decoder) ...

    x = self.decoder["backbone"](x)
    x = self.decoder["neck"](x)

    if self.training:
        outputs = {}

        # Task 1: detection
        if 'object' in self.heads:
            ...  # existing code unchanged

        # Task 2: segmentation
        if 'map' in self.heads:
            ...  # existing code unchanged

        # Task 3: vectorized map 🆕
        if 'vector_map' in self.heads:
            cls_scores, pts_preds = self.heads['vector_map'](x, metas)
            vec_losses = self.heads['vector_map'].loss(
                cls_scores, pts_preds,
                gt_vectors, gt_vec_labels
            )
            for name, val in vec_losses.items():
                outputs[f"loss/vector_map/{name}"] = val * self.loss_scale.get('vector_map', 1.0)

        return outputs
    else:
        # Inference mode
        outputs = [{} for _ in range(batch_size)]

        if 'vector_map' in self.heads:
            cls_scores, pts_preds = self.heads['vector_map'](x, metas)
            vectors = self.heads['vector_map'].get_vectors(
                cls_scores, pts_preds, metas
            )
            for k, vec in enumerate(vectors):
                outputs[k]['vector_map'] = vec

        return outputs
```
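
The branches above assume a `vector_map` entry in `self.heads` and a matching `loss_scale` weight. A sketch of what those additions could look like is below, written as an mmdet-style Python dict; the real BEVFusion configs are YAML files, and the key layout shown here simply mirrors the existing `object`/`map` entries referenced in the forward code rather than being copied from the repository.

```python
# Sketch of the model config additions (key layout is illustrative).
model = dict(
    heads=dict(
        # object=dict(...),   # existing detection head, unchanged
        # map=dict(...),      # existing BEV segmentation head, unchanged
        vector_map=dict(
            type='SimplifiedMapTRHead',
            in_channels=256,
            num_vec=50,
            num_pts_per_vec=20,
            num_classes=3,
            embed_dims=256,
            num_decoder_layers=6,
            pc_range=[-50, -50, -5, 50, 50, 3],
        ),
    ),
    loss_scale=dict(
        object=1.0,
        map=1.0,
        vector_map=1.0,  # weight applied in forward_single above
    ),
)
```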

---

## 📦 Full Implementation Package

### Files to create
```bash
/workspace/bevfusion/
├── mmdet3d/models/heads/vector_map/
│   ├── __init__.py
│   └── simple_maptr_head.py              ★ new (code above)
│
├── mmdet3d/datasets/pipelines/
│   └── loading_vector.py                 ★ new (code above)
│
├── tools/data_converter/
│   └── extract_vector_map_bevfusion.py   ★ new
│
└── configs/nuscenes/three_tasks/
    ├── default.yaml
    └── bevfusion_det_seg_vec.yaml        ★ new
```
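
For the registry decorator to take effect, the new head also has to be exported. A minimal sketch of the new `vector_map/__init__.py` is below; the note about the parent `heads/__init__.py` is an assumption about the package layout of your BEVFusion checkout, not a verified path.

```python
# mmdet3d/models/heads/vector_map/__init__.py
from .simple_maptr_head import SimplifiedMapTRHead

__all__ = ['SimplifiedMapTRHead']

# In the parent heads package (location assumed), also add:
# from .vector_map import SimplifiedMapTRHead
```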

---

## ⏱️ Estimated Implementation Time

| Task | Time | Notes |
|------|------|-------|
| Study the MapTR code | ✅ done | finished today |
| Copy and adapt the code | 1 day | SimplifiedMapTRHead |
| Implement the data pipeline | 1 day | LoadVectorMapAnnotation |
| Extract vector map data | 0.5 day | run the extraction script |
| Small-scale test | 0.5 day | 100-sample test |
| Full training | 2-3 days | three-task training |
| Evaluation and tuning | 1 day | performance evaluation |
| **Total** | **6-7 days** | **about 1 week** |

---

## 🎯 Next Steps

### Can be done during the current training run (this week)
- [x] Study the MapTR code ✅
- [ ] Implement SimplifiedMapTRHead
- [ ] Implement LoadVectorMapAnnotation
- [ ] Prepare the data-extraction script

### After training finishes (starting 10-30)
- [ ] Extract vector map data (30 minutes)
- [ ] Small-scale test (4 hours)
- [ ] Full three-task training (2-3 days)
- [ ] Performance evaluation and tuning (1 day)
---

## 💡 Key Recommendations

### Simplification principles
1. **Reuse BEVFusion's BEV features**: MapTR's encoder is not needed
2. **Use the standard Transformer**: no custom geometric attention required
3. **Keep the core mechanism**: query-based prediction + Chamfer loss
4. **Simplify the data format**: a fixed number of points per vector keeps processing simple

### Risk control
1. **Small-scale test first**: verify feasibility on 100 samples
2. **Staged training**: train the vector head first, then train jointly (see the sketch after this list)
3. **Monitor performance**: make sure detection and segmentation do not regress
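
A minimal sketch of what "train the vector head first" could mean in practice: freeze everything except the new head for a warm-up stage, then unfreeze for joint three-task training. The attribute access (`model.heads['vector_map']`) follows the forward code above; it is an assumption, not verified against the BEVFusion implementation.

```python
def set_requires_grad(module, flag):
    """Enable or disable gradients for all parameters of a module."""
    for p in module.parameters():
        p.requires_grad = flag


def freeze_all_but_vector_head(model):
    # Stage 1: warm up only the new vector-map head.
    set_requires_grad(model, False)
    set_requires_grad(model.heads['vector_map'], True)


def unfreeze_for_joint_training(model):
    # Stage 2: joint training of detection, segmentation, and vector map.
    set_requires_grad(model, True)
```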

---

**MapTR study complete. Ready to start the integration.**