# MapTR Code Deep-Dive Research Report
**Research date**: 2025-10-22
**MapTR version**: main branch + maptrv2 branch
**Code location**: /workspace/MapTR
**Core code**: 3,540 lines of Python
---
## 📊 MapTR Project Overview
### Project Information
- **Paper**: ICLR 2023 Spotlight
- **Extended version**: MapTRv2 (IJCV 2024)
- **Function**: online vectorized HD map construction
- **Performance**: 50-73% mAP on nuScenes (depending on configuration)
- **Speed**: 14-35 FPS (RTX 3090)
### Core Innovations
1. **Unified point-set modeling**: map elements are represented as point sets
2. **Hierarchical query embedding**: flexibly encodes structured map information
3. **End-to-end learning**: predicts the vectorized map directly from images
---
## 📁 Code Structure
### Directory Layout
```
MapTR/
├── projects/mmdet3d_plugin/maptr/       # core code
│   ├── dense_heads/
│   │   └── maptr_head.py                ★ 35KB, core head implementation
│   ├── detectors/
│   │   └── maptr.py                     ★ 19KB, main model
│   ├── modules/
│   │   ├── decoder.py                   ★ 3KB,  Transformer decoder
│   │   ├── encoder.py                   ★ 12KB, BEV encoder
│   │   ├── transformer.py               ★ 15KB, overall Transformer
│   │   └── geometry_kernel_attention.py ★ 23KB, geometry kernel attention
│   ├── losses/
│   │   └── map_loss.py                  ★ 26KB, loss functions
│   └── assigners/
│       └── maptr_assigner.py            ★ 9KB,  Hungarian matching
├── mmdetection3d/                       # mmdet3d dependency
├── tools/                               # training/testing tools
│   └── maptr/
│       ├── test.py
│       └── vis_pred.py                  # visualization tool
└── configs/                             # config files
    └── maptr/
```
**Total**: 3,540 lines of Python (core parts)
---
## 🔍 Core Components in Detail
### 1. MapTRHead (★ most important)
**File**: `projects/mmdet3d_plugin/maptr/dense_heads/maptr_head.py` (35KB)
#### Key Parameters
```python
class MapTRHead(DETRHead):
    def __init__(
        self,
        num_vec=20,                 # number of predicted vectors
        num_pts_per_vec=20,         # points per predicted vector
        num_pts_per_gt_vec=2,       # fixed number of points per GT vector
        query_embed_type='all_pts', # query type
        bev_h=30,                   # BEV grid height
        bev_w=30,                   # BEV grid width
        loss_pts=dict(              # point loss (Chamfer distance)
            type='ChamferDistance',
            loss_src_weight=1.0,
            loss_dst_weight=1.0
        ),
        loss_dir=dict(              # direction loss
            type='PtsDirCosLoss',
            loss_weight=2.0
        ),
        ...
    )
```
#### Core Methods
**1.1 Forward**
```python
def forward(self, mlvl_feats, lidar_feat, img_metas, prev_bev=None):
    """
    Inputs:
        mlvl_feats: multi-level image features (B, N, C, H, W)
        lidar_feat: LiDAR features (optional)
        img_metas: metadata
        prev_bev: previous-frame BEV (temporal)
    Outputs:
        bev_embed: BEV feature embedding
        hs: hidden states (num_layers, num_query, bs, embed_dims)
        init_reference: initial reference points
        inter_references: intermediate reference points
    Flow:
        1. Build query embeddings
        2. Transformer encodes BEV features
        3. Transformer decodes vector predictions
        4. Multi-layer outputs (for deep supervision)
    """
    # Query embedding
    object_query_embeds = self.query_embedding.weight
    # num_query = num_vec × num_pts_per_vec
    # e.g. 20 vectors × 20 points = 400 queries

    # Transformer
    outputs = self.transformer(
        mlvl_feats,           # image features
        lidar_feat,           # LiDAR features
        bev_queries,          # BEV queries
        object_query_embeds,  # vector queries
        ...
    )
    # Unpack outputs
    bev_embed, hs, init_reference, inter_references = outputs

    # Classification and regression per decoder layer
    for lvl in range(hs.shape[0]):
        outputs_class = self.cls_branches[lvl](hs[lvl])  # classification
        outputs_coord = self.reg_branches[lvl](hs[lvl])  # point coordinates
    return all_cls_scores, all_pts_preds
```
**1.2 Loss**
```python
def loss(self, gt_bboxes_list, gt_labels_list, ...):
    """
    Core loss function, consisting of:
        1. Classification loss (FocalLoss)
        2. Point-coordinate loss (Chamfer distance)
        3. Direction loss (cosine loss)
        4. Hungarian matching
    Key steps:
        1. Hungarian-match predictions to GT
        2. Compute losses on matched pairs
        3. Treat unmatched predictions as background
    """
    # Hungarian matching
    cls_reg_targets = self.get_targets(
        gt_bboxes_list, gt_labels_list, ...)
    # Classification loss
    loss_cls = self.loss_cls(cls_scores, labels, ...)
    # Point loss (Chamfer distance)
    loss_pts = self.loss_pts(pts_preds, pts_targets, ...)
    # Direction loss
    loss_dir = self.loss_dir(pts_preds, pts_targets, ...)
    return loss_dict
```
---
### 2. Transformer Architecture
**Files**: `projects/mmdet3d_plugin/maptr/modules/`
#### 2.1 BEV Encoder
```python
# encoder.py
class BEVFormerEncoder(BaseModule):
    """
    Converts multi-view image features into BEV features.
    Supports several mechanisms:
        - GKT (Geometry Kernel Attention)
        - BEVFormer
        - BEVPool (as in BEVFusion)
    """
```
#### 2.2 Decoder
```python
# decoder.py
class MapTRDecoder(TransformerLayerSequence):
    """
    DETR-style decoder.
    Features:
        - Query-based detection
        - Iterative refinement
        - Multi-layer outputs (deep supervision)
    Core idea:
        - Input: query embeddings
        - Output: updated queries carrying vector information
    """
    def forward(self, query, reference_points, reg_branches=None):
        output = query
        # Stacked Transformer decoder layers
        for lid, layer in enumerate(self.layers):
            output = layer(output, reference_points=reference_points)
            # Iterative refinement: the predicted offset is added in
            # inverse-sigmoid (logit) space, then squashed back to [0, 1]
            if reg_branches is not None:
                tmp = reg_branches[lid](output)
                new_reference_points = tmp + inverse_sigmoid(reference_points)
                reference_points = new_reference_points.sigmoid()
        return output, reference_points
```
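The refinement update can be sketched numerically. Below is a minimal NumPy sketch (standalone helpers written for illustration, not MapTR's torch code): each layer's predicted offset is added to the inverse-sigmoid of the current reference points, and the result is squashed back into [0, 1].

```python
import numpy as np

def inverse_sigmoid(x, eps=1e-6):
    """Map [0, 1] coordinates back to logit space."""
    x = np.clip(x, eps, 1.0 - eps)
    return np.log(x / (1.0 - x))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def refine(reference_points, offset):
    """One refinement step: add the predicted offset in logit space."""
    return sigmoid(offset + inverse_sigmoid(reference_points))

ref = np.array([[0.5, 0.5]])                    # reference point at BEV center
refined = refine(ref, np.array([[1.0, -1.0]]))  # push x up, y down
assert refined[0, 0] > 0.5 and refined[0, 1] < 0.5
# A zero offset leaves the reference points unchanged
assert np.allclose(refine(ref, np.zeros((1, 2))), ref)
```

Working in logit space keeps every intermediate reference point inside the normalized BEV patch, no matter how large the raw offset is.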
---
### 3. Loss Functions
**File**: `projects/mmdet3d_plugin/maptr/losses/map_loss.py` (26KB)
#### 3.1 Chamfer Distance Loss
```python
@LOSSES.register_module()
class ChamferDistance(nn.Module):
    """
    Chamfer distance: bidirectional nearest-point distance between point sets.

        CD(P, Q) = Σ_{p∈P} min_{q∈Q} ||p-q|| + Σ_{q∈Q} min_{p∈P} ||q-p||

    Used to:
        - measure similarity between predicted and GT vectors
        - tolerate different point orderings
    """
    def forward(self, src, tgt):
        # Pairwise distance matrix
        dist = torch.cdist(src, tgt)  # (N, M)
        # One-directional Chamfer terms
        loss_src = dist.min(dim=1)[0].mean()  # prediction → GT
        loss_tgt = dist.min(dim=0)[0].mean()  # GT → prediction
        # Weighted sum
        loss = loss_src * self.loss_src_weight + loss_tgt * self.loss_dst_weight
        return loss
```
#### 3.2 Direction Loss
```python
@LOSSES.register_module()
class PtsDirCosLoss(nn.Module):
    """
    Point-direction cosine loss.
    Purpose:
        - enforce the correct direction of the predicted polyline
        - measured by cosine similarity of segment directions
    """
    def forward(self, pts_pred, pts_gt):
        # Direction vectors between consecutive points
        dir_pred = pts_pred[:, 1:] - pts_pred[:, :-1]
        dir_gt = pts_gt[:, 1:] - pts_gt[:, :-1]
        # Cosine similarity
        cos_sim = F.cosine_similarity(dir_pred, dir_gt, dim=-1)
        loss = 1 - cos_sim
        return loss.mean()
```
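The same idea can be checked on a single polyline with a NumPy sketch (an illustrative re-implementation, not the registered torch module): identical polylines give a loss of 0, and a reversed polyline, where every segment direction is flipped, gives the maximum loss of 2.

```python
import numpy as np

def pts_dir_cos_loss(pts_pred, pts_gt):
    """Cosine direction loss between consecutive-point direction vectors.

    pts_pred, pts_gt: (num_pts, 2) polylines.
    """
    dir_pred = pts_pred[1:] - pts_pred[:-1]  # (num_pts - 1, 2)
    dir_gt = pts_gt[1:] - pts_gt[:-1]
    eps = 1e-8
    cos_sim = (dir_pred * dir_gt).sum(-1) / (
        np.linalg.norm(dir_pred, axis=-1) * np.linalg.norm(dir_gt, axis=-1) + eps)
    return (1.0 - cos_sim).mean()

line = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
assert np.isclose(pts_dir_cos_loss(line, line), 0.0)           # same direction → 0
reversed_line = line[::-1].copy()
assert np.isclose(pts_dir_cos_loss(line, reversed_line), 2.0)  # opposite → 2
```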
---
### 4. Hungarian Matching
**File**: `projects/mmdet3d_plugin/maptr/assigners/maptr_assigner.py` (9KB)
```python
class MapTRAssigner:
    """
    Hungarian matching: finds the best-matching prediction for each GT.
    Cost matrix = classification cost + point-coordinate cost + direction cost
    """
    def assign(self, bbox_pred, cls_pred, gt_bboxes, gt_labels):
        # Individual cost terms
        cls_cost = self.cls_cost(cls_pred, gt_labels)
        pts_cost = self.pts_cost(bbox_pred, gt_bboxes)
        dir_cost = self.dir_cost(bbox_pred, gt_bboxes)
        # Total cost
        cost = cls_cost + pts_cost + dir_cost
        # Hungarian algorithm
        from scipy.optimize import linear_sum_assignment
        matched_row_inds, matched_col_inds = linear_sum_assignment(
            cost.detach().cpu().numpy())
        return matched_row_inds, matched_col_inds
```
---
## 🔧 Integration Plan for BEVFusion
### Key Findings
**MapTR's advantages**:
1. **BEVPool already supported**: BEVFusion's BEV features can be used directly
2. **Modular design**: the head can be used standalone
3. **Multiple BEV encoders**: GKT, BEVFormer, and BEVPool are all supported
**Integration strategy**:
```
# The full MapTR model is not needed; only the MapTRHead parts are extracted.
Extract from MapTR:
├── MapTRHead (dense_heads/maptr_head.py)
├── MapTRDecoder (modules/decoder.py)
├── ChamferDistance loss (losses/map_loss.py)
├── PtsDirCosLoss (losses/map_loss.py)
└── MapTRAssigner (assigners/maptr_assigner.py)
Reuse from BEVFusion:
├── Camera encoder
├── LiDAR encoder
├── ConvFuser
└── BEV decoder (its output feeds MapTRHead directly)
```
---
## 💡 Core Technical Points
### 1. Query Embedding Design
MapTR uses a novel query design:
```python
# Option 1: all_pts (every point is an independent query)
num_query = num_vec * num_pts_per_vec
# e.g. 20 vectors × 20 points = 400 queries

# Option 2: instance_pts (combined instance + point embeddings)
pts_embeds = self.pts_embedding.weight.unsqueeze(0)            # (1, num_pts, dim)
instance_embeds = self.instance_embedding.weight.unsqueeze(1)  # (num_vec, 1, dim)
query_embeds = (pts_embeds + instance_embeds).flatten(0, 1)    # broadcast add
```
**Advantages**:
- Flexibly represents a varying number of vectors
- Point-set modeling is permutation-equivariant
- Supports variable-length vectors
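The `instance_pts` broadcast can be verified with a small NumPy sketch (random stand-ins for the two `nn.Embedding` weight tables; `dim` shrunk for illustration): a shared point table plus a per-instance table expand into `num_vec × num_pts` hierarchical queries.

```python
import numpy as np

num_vec, num_pts, dim = 20, 20, 8  # dim=8 only for illustration

# Random stand-ins for the two embedding weight tables
pts_embeds = np.random.randn(num_pts, dim)       # shared across all instances
instance_embeds = np.random.randn(num_vec, dim)  # one per map element

# Broadcast add: (1, num_pts, dim) + (num_vec, 1, dim) → (num_vec, num_pts, dim)
query_embeds = pts_embeds[None, :, :] + instance_embeds[:, None, :]
query_embeds = query_embeds.reshape(num_vec * num_pts, dim)
assert query_embeds.shape == (400, dim)

# Point 0 of instance 0 and point 0 of instance 1 share the same point
# component, so their difference equals the instance-embedding difference
assert np.allclose(query_embeds[0] - query_embeds[num_pts],
                   instance_embeds[0] - instance_embeds[1])
```

The design choice: only `num_vec + num_pts` embedding rows are learned instead of `num_vec × num_pts`, and all instances share one point-order encoding.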
---
### 2. Point-Set Representation
**Coordinate normalization**:
```python
def normalize_2d_pts(pts, pc_range):
    """
    Normalize real-world coordinates into [0, 1].
    pc_range: [-50, -50, -5, 50, 50, 3]  # BEV range
    """
    patch_h = pc_range[4] - pc_range[1]  # 100 m
    patch_w = pc_range[3] - pc_range[0]  # 100 m
    normalized_pts = pts.clone()
    normalized_pts[..., 0] = (pts[..., 0] - pc_range[0]) / patch_w
    normalized_pts[..., 1] = (pts[..., 1] - pc_range[1]) / patch_h
    return normalized_pts  # in [0, 1]
```
**Denormalization**:
```python
def denormalize_2d_pts(pts, pc_range):
    """
    [0, 1] → real-world coordinates (meters)
    """
    new_pts = pts.clone()
    new_pts[..., 0] = pts[..., 0] * (pc_range[3] - pc_range[0]) + pc_range[0]
    new_pts[..., 1] = pts[..., 1] * (pc_range[4] - pc_range[1]) + pc_range[1]
    return new_pts
```
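A quick NumPy round-trip check of the two transforms (same formulas as above, with `.copy()` in place of torch's `.clone()`): the patch corners map to the corners of the unit square, and denormalizing recovers the original coordinates.

```python
import numpy as np

pc_range = [-50.0, -50.0, -5.0, 50.0, 50.0, 3.0]  # [x_min, y_min, z_min, x_max, y_max, z_max]

def normalize_2d_pts(pts, pc_range):
    out = pts.copy()
    out[..., 0] = (pts[..., 0] - pc_range[0]) / (pc_range[3] - pc_range[0])
    out[..., 1] = (pts[..., 1] - pc_range[1]) / (pc_range[4] - pc_range[1])
    return out

def denormalize_2d_pts(pts, pc_range):
    out = pts.copy()
    out[..., 0] = pts[..., 0] * (pc_range[3] - pc_range[0]) + pc_range[0]
    out[..., 1] = pts[..., 1] * (pc_range[4] - pc_range[1]) + pc_range[1]
    return out

pts = np.array([[-50.0, -50.0], [0.0, 0.0], [50.0, 50.0]])
norm = normalize_2d_pts(pts, pc_range)
assert np.allclose(norm, [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]])
assert np.allclose(denormalize_2d_pts(norm, pc_range), pts)  # round trip
```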
---
### 3. Data Formats
**GT vector map format**:
```python
gt_vecs_list = [
    {
        'vectors': [
            {
                'pts': torch.Tensor([[x1, y1], [x2, y2], ...]),  # N points
                'pts_num': N,
                'type': 0,  # 0: divider, 1: boundary, 2: ped_crossing
            },
            ...
        ]
    },
    ...  # one entry per sample in the batch
]
```
**Prediction output format**:
```python
predictions = {
    'all_cls_scores': (num_layers, bs, num_vec, num_classes),
    'all_pts_preds': (num_layers, bs, num_vec * num_pts, 2),  # (x, y) coordinates
}
```
---
## 🎯 Adapting to BEVFusion
### Modification 1: Input Interface
**Original MapTR**:
```python
class MapTR(MVXTwoStageDetector):
    # Inherits mmdet3d's two-stage detector;
    # carries its own backbone, neck, etc.
```
**Adapted for BEVFusion**:
```python
# Only MapTRHead is needed, not the full MapTR model:
# BEVFusion already provides the backbone and BEV features.
class BEVFusion:
    def __init__(self):
        # ... existing encoder, fuser, decoder
        # New: MapTRHead
        self.heads['vector_map'] = MapTRHead(
            in_channels=256,    # BEV decoder output channels
            num_vec=50,         # tuned for BEVFusion
            num_pts_per_vec=20,
            ...
        )
    def forward(self, ...):
        # Obtain BEV features
        bev_features = self.decoder(fused_features)
        # MapTRHead forward.
        # Note: MapTRHead expects mlvl_feats as input,
        # so the BEV features must be adapted accordingly.
        outputs = self.heads['vector_map'](
            bev_features,  # adapted input
            img_metas=metas
        )
```
---
### Modification 2: BEV Feature Adaptation
**MapTR expects**:
```python
# Multi-level features
mlvl_feats = [feat1, feat2, feat3, ...]
```
**BEVFusion provides**:
```python
# A single-level BEV feature map
bev_features  # (B, 256, 180, 180)
```
**Adaptation options**:
```python
# Option 1: wrap it in a list
mlvl_feats = [bev_features]

# Option 2: subclass MapTRHead to accept BEV features directly
class MapTRHeadForBEVFusion(MapTRHead):
    def forward(self, bev_features, img_metas):
        # Use BEV features directly, skipping the image-encoding stage
        mlvl_feats = [bev_features]
        return super().forward(mlvl_feats, None, img_metas)
```
---
### Modification 3: Data Pipeline
**New pipeline stage needed**:
```python
# mmdet3d/datasets/pipelines/loading.py
@PIPELINES.register_module()
class LoadVectorMapAnnotation:
    """
    Loads vectorized map annotations,
    extracting vector elements via the nuScenes map API.
    """
    def __call__(self, results):
        # Extract lane dividers, boundaries, etc.
        vectors = extract_vectors_from_nuscenes(
            results['sample_token'],
            x_range=[-50, 50],
            y_range=[-50, 50]
        )
        results['gt_vectors'] = vectors
        return results
```
---
## 📝 Integration Steps
### Step 1: Copy MapTR Core Code (1 day)
```bash
# Create directories
mkdir -p /workspace/bevfusion/mmdet3d/models/heads/vector_map
mkdir -p /workspace/bevfusion/mmdet3d/models/losses
# Copy files
cp /workspace/MapTR/projects/mmdet3d_plugin/maptr/dense_heads/maptr_head.py \
   /workspace/bevfusion/mmdet3d/models/heads/vector_map/
cp /workspace/MapTR/projects/mmdet3d_plugin/maptr/losses/map_loss.py \
   /workspace/bevfusion/mmdet3d/models/losses/
cp /workspace/MapTR/projects/mmdet3d_plugin/maptr/assigners/maptr_assigner.py \
   /workspace/bevfusion/mmdet3d/core/bbox/assigners/
cp /workspace/MapTR/projects/mmdet3d_plugin/maptr/modules/decoder.py \
   /workspace/bevfusion/mmdet3d/models/utils/
```
### Step 2: Adapt the Code for BEVFusion (2 days)
**2.1 Modify MapTRHead**
```python
# Simplified input interface
class MapTRHeadForBEVFusion(MapTRHead):
    def forward(self, bev_features, img_metas):
        """
        Simplified forward that accepts BEV features directly.
        Args:
            bev_features: (B, 256, 180, 180) BEV features
            img_metas: metadata
        """
        # Skip image encoding and use the BEV features directly
        # ... (multi-view processing omitted)
        # Build queries
        query_embeds = self.query_embedding.weight
        # Decoder (unchanged)
        hs, references = self.decoder(query_embeds, bev_features)
        # Classification and regression (unchanged)
        cls_scores = self.cls_head(hs)
        pts_preds = self.reg_head(hs)
        return cls_scores, pts_preds
```
**2.2 Register with BEVFusion**
```python
# mmdet3d/models/heads/__init__.py
from .vector_map import MapTRHeadForBEVFusion
__all__ = [
    ...,
    'MapTRHeadForBEVFusion',
]
```
---
### Step 3: Data Preparation (1 day)
```bash
# Extract the vectorized map
python tools/data_converter/extract_vector_map.py \
    --root data/nuscenes \
    --version v1.0-trainval \
    --output data/nuscenes/vector_maps.pkl
# Sanity-check the data
python tools/visualize_vector_map.py --samples 10
```
---
### Step 4: Config File (0.5 days)
```yaml
# configs/nuscenes/three_tasks/bevfusion_det_seg_vec.yaml
model:
  type: BEVFusion
  # ... existing encoder, fuser, decoder
  heads:
    object: ${object_head}  # existing
    map: ${map_head}        # existing
    # New: MapTR vector-map head
    vector_map:
      type: MapTRHeadForBEVFusion
      in_channels: 256
      num_vec: 50
      num_pts_per_vec: 20
      num_classes: 3  # divider, boundary, ped_crossing
      embed_dims: 256
      num_decoder_layers: 6
      loss_pts:
        type: ChamferDistance
        loss_src_weight: 1.0
        loss_dst_weight: 1.0
      loss_dir:
        type: PtsDirCosLoss
        loss_weight: 2.0
loss_scale:
  object: 1.0
  map: 1.0
  vector_map: 1.0
# Pipeline
train_pipeline:
  - type: LoadMultiViewImageFromFiles
  - type: LoadPointsFromFile
  - type: LoadAnnotations3D
  - type: LoadVectorMapAnnotation  # 🆕
  # ...
```
---
### Step 5: Training (5-7 days)
```bash
# Stage 1: freeze the first two tasks, train MapTRHead (3 epochs)
torchpack dist-run -np 8 python tools/train.py \
    configs/nuscenes/three_tasks/bevfusion_det_seg_vec.yaml \
    --load_from runs/enhanced_from_epoch19/epoch_23.pth \
    --freeze-heads object,map \
    --cfg-options max_epochs=3
# Stage 2: joint fine-tuning of all three tasks (5 epochs)
torchpack dist-run -np 8 python tools/train.py \
    configs/nuscenes/three_tasks/bevfusion_det_seg_vec.yaml \
    --load_from runs/three_tasks_stage1/epoch_3.pth \
    --cfg-options max_epochs=5
```
---
## 📊 Key Code Snippets
### MapTRHead Core Logic
```python
# Core flow distilled from maptr_head.py
class MapTRHead:
    def __init__(self, num_vec=20, num_pts_per_vec=20, ...):
        # Total queries = vectors × points per vector
        self.num_query = num_vec * num_pts_per_vec  # 400
        # Query embedding
        self.query_embedding = nn.Embedding(self.num_query, embed_dims)
        # Classification branches (predict per-vector class)
        self.cls_branches = nn.ModuleList([
            nn.Linear(embed_dims, num_classes)
            for _ in range(num_decoder_layers)
        ])
        # Regression branches (predict point coordinates)
        self.reg_branches = nn.ModuleList([
            nn.Linear(embed_dims, 2)  # (x, y)
            for _ in range(num_decoder_layers)
        ])

    def forward(self, bev_features, img_metas):
        # 1. Queries
        query = self.query_embedding.weight  # (400, 256)
        # 2. Decoder
        hs = self.decoder(query, bev_features)  # (6, 400, B, 256)
        # 3. Predictions per layer
        all_cls_scores = []
        all_pts_preds = []
        for layer_idx in range(6):
            # Classification: (B, 400, 256) → (B, 20, 3)
            # Pool the 400 point queries into 20 vector instances
            cls = self.cls_branches[layer_idx](
                hs[layer_idx].reshape(B, 20, 20, 256).mean(dim=2)
            )
            # Regression: (B, 400, 256) → (B, 400, 2)
            pts = self.reg_branches[layer_idx](hs[layer_idx])
            pts = pts.sigmoid()  # normalize to [0, 1]
            all_cls_scores.append(cls)
            all_pts_preds.append(pts)
        return all_cls_scores, all_pts_preds

    def loss(self, cls_scores, pts_preds, gt_vectors):
        # 1. Hungarian matching
        indices = self.assigner.assign(pts_preds, cls_scores, gt_vectors)
        # 2. Classification loss
        loss_cls = self.loss_cls(cls_scores, gt_labels, indices)
        # 3. Chamfer distance
        loss_pts = self.loss_pts(pts_preds, gt_pts, indices)
        # 4. Direction loss
        loss_dir = self.loss_dir(pts_preds, gt_pts, indices)
        return {
            'loss_cls': loss_cls,
            'loss_pts': loss_pts,
            'loss_dir': loss_dir,
        }
```
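The per-layer shape flow of the two branches can be traced with a NumPy sketch (random feature tensor and random weight matrices as stand-ins for the decoder output and the `nn.Linear` branches): the classification branch pools the 20 point queries of each instance before classifying, while the regression branch keeps per-point granularity.

```python
import numpy as np

B, num_vec, num_pts, dim, num_classes = 2, 20, 20, 256, 3
hs_layer = np.random.randn(B, num_vec * num_pts, dim)  # decoder output, (B, 400, 256)

# Pool the point queries of each instance:
# (B, 400, 256) → (B, 20, 20, 256) → (B, 20, 256)
instance_feats = hs_layer.reshape(B, num_vec, num_pts, dim).mean(axis=2)

# Stand-in linear classification branch: (B, 20, 256) → (B, 20, 3)
W_cls = np.random.randn(dim, num_classes)
cls_scores = instance_feats @ W_cls
assert cls_scores.shape == (B, num_vec, num_classes)

# The regression branch keeps point granularity: (B, 400, 256) → (B, 400, 2)
W_reg = np.random.randn(dim, 2)
pts_preds = 1.0 / (1.0 + np.exp(-(hs_layer @ W_reg)))  # sigmoid → [0, 1]
assert pts_preds.shape == (B, num_vec * num_pts, 2)
```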
---
## 🔍 Key Technical Details
### Chamfer Distance Computation
```python
def chamfer_distance(pred_pts, gt_pts):
    """
    pred_pts: (B, N, num_pts_pred, 2)  # predicted points
    gt_pts:   (B, N, num_pts_gt, 2)    # GT points
    Returns: sum of the bidirectional nearest-point distances.
    """
    # Distance matrix (B, N, num_pts_pred, num_pts_gt)
    dist = torch.cdist(pred_pts, gt_pts)
    # Nearest distance, prediction → GT
    dist_pred_to_gt = dist.min(dim=-1)[0]  # (B, N, num_pts_pred)
    loss_forward = dist_pred_to_gt.mean()
    # Nearest distance, GT → prediction
    dist_gt_to_pred = dist.min(dim=-2)[0]  # (B, N, num_pts_gt)
    loss_backward = dist_gt_to_pred.mean()
    # Bidirectional Chamfer
    cd_loss = loss_forward + loss_backward
    return cd_loss
```
**Advantages**:
- ✅ Tolerates different numbers of points
- ✅ Tolerates different point orderings
- ✅ Well suited to point-set modeling
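These properties can be demonstrated on tiny 2D point sets with a NumPy/SciPy sketch (same formula as the torch version above, without the batch dimensions): point counts may differ, and shuffling a point set leaves the loss unchanged.

```python
import numpy as np
from scipy.spatial.distance import cdist

def chamfer_distance(pred_pts, gt_pts):
    """Bidirectional mean nearest-neighbor distance between two 2D point sets."""
    dist = cdist(pred_pts, gt_pts)  # (num_pred, num_gt)
    return dist.min(axis=1).mean() + dist.min(axis=0).mean()

pred = np.array([[0.0, 0.0], [1.0, 0.0]])
gt = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
assert chamfer_distance(pred, pred) == 0.0  # identical sets → 0
# pred→gt: every pred point has an exact match (mean 0);
# gt→pred: distances [0, 0, 1] → mean 1/3
assert np.isclose(chamfer_distance(pred, gt), 1.0 / 3.0)
# Order-invariant: shuffling the points does not change the loss
assert np.isclose(chamfer_distance(pred[::-1].copy(), gt), 1.0 / 3.0)
```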
---
### Hungarian Matching
```python
from scipy.optimize import linear_sum_assignment

def hungarian_matching(cls_cost, pts_cost, dir_cost):
    """
    Each cost term: (num_pred, num_gt)
    Returns: (matched_pred_idx, matched_gt_idx)
    """
    # Weighted total cost:
    # Cost = α·cls_cost + β·pts_cost + γ·dir_cost
    cost = (
        2.0 * cls_cost +  # classification cost
        5.0 * pts_cost +  # point-coordinate cost
        1.0 * dir_cost    # direction cost
    )
    # Hungarian algorithm finds the optimal one-to-one assignment
    pred_idx, gt_idx = linear_sum_assignment(cost.cpu().numpy())
    return pred_idx, gt_idx
```
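A runnable toy example of the assignment step (made-up cost values, just to show the mechanics): with more predictions than GT elements, each GT is matched to exactly one prediction and the leftover predictions become background.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Toy cost matrix: 3 predictions × 2 GT elements.
# Rows = predictions, columns = GT; lower cost = better match.
cost = np.array([
    [0.1, 0.9],
    [0.8, 0.2],
    [0.5, 0.6],
])
pred_idx, gt_idx = linear_sum_assignment(cost)
# Optimal assignment: prediction 0 ↔ GT 0, prediction 1 ↔ GT 1
assert list(pred_idx) == [0, 1] and list(gt_idx) == [0, 1]
assert np.isclose(cost[pred_idx, gt_idx].sum(), 0.3)  # minimal total cost
# Prediction 2 is unmatched and is supervised as background
```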
---
## 💾 Data Extraction Tools
MapTR ships data-extraction tools we can use as a reference:
```bash
# MapTR's data preparation
cd /workspace/MapTR
ls tools/maptrv2/
# Key tools:
#   gen_ann.py           # generate annotations
#   generate_*_info.py   # generate info files
```
**What we need**:
```python
# /workspace/bevfusion/tools/data_converter/extract_vector_map.py
import numpy as np
from nuscenes.nuscenes import NuScenes
from nuscenes.map_expansion.map_api import NuScenesMap

def extract_vector_map(sample_token, nusc, nusc_map):
    """
    Extract the vectorized map for a single sample.
    Returns:
        vectors: list of vector elements
    """
    # Get the ego pose
    sample = nusc.get('sample', sample_token)
    sd_token = sample['data']['LIDAR_TOP']
    sd_rec = nusc.get('sample_data', sd_token)
    pose_rec = nusc.get('ego_pose', sd_rec['ego_pose_token'])
    # Extract vectors around the ego vehicle
    vectors = []
    # 1. Lane dividers
    records = nusc_map.get_records_in_patch(
        [pose_rec['translation'][0] - 50, pose_rec['translation'][1] - 50,
         pose_rec['translation'][0] + 50, pose_rec['translation'][1] + 50],
        layer_names=['lane_divider'],
        mode='intersect'
    )
    for lane_token in records['lane_divider']:
        line = nusc_map.extract_line(lane_token)
        # Transform into the ego frame (transform_to_ego helper not shown)
        pts_global = np.array(line.coords)
        pts_ego = transform_to_ego(pts_global, pose_rec)
        vectors.append({
            'pts': pts_ego,
            'type': 0,  # divider
        })
    # 2. Road boundaries ...
    # 3. Pedestrian crossings ...
    return vectors
```
---
## 🎯 Full Pipeline After Integration
```python
# Training flow after integrating MapTR into BEVFusion
class BEVFusionWithVectorMap(BEVFusion):
    def forward_train(self, img, points, gt_bboxes_3d, gt_labels_3d,
                      gt_masks_bev, gt_vectors, img_metas):
        """
        Adds the gt_vectors argument.
        """
        # 1. Feature extraction (unchanged)
        camera_feat = self.extract_camera_features(img, img_metas)
        lidar_feat = self.extract_lidar_features(points)
        # 2. Fusion (unchanged)
        fused_feat = self.fuser([camera_feat, lidar_feat])
        # 3. Decoding (unchanged)
        bev_feat = self.decoder(fused_feat)
        # 4. Multi-task heads
        losses = {}
        # Task 1: detection
        if 'object' in self.heads:
            det_pred = self.heads['object'](bev_feat, img_metas)
            det_loss = self.heads['object'].loss(det_pred, gt_bboxes_3d, gt_labels_3d)
            for k, v in det_loss.items():
                losses[f'loss/object/{k}'] = v
        # Task 2: segmentation
        if 'map' in self.heads:
            seg_loss = self.heads['map'](bev_feat, gt_masks_bev)
            for k, v in seg_loss.items():
                losses[f'loss/map/{k}'] = v
        # Task 3: vectorized map 🆕
        if 'vector_map' in self.heads:
            vec_cls, vec_pts = self.heads['vector_map'](bev_feat, img_metas)
            vec_loss = self.heads['vector_map'].loss(vec_cls, vec_pts, gt_vectors)
            for k, v in vec_loss.items():
                losses[f'loss/vector_map/{k}'] = v
        return losses
```
---
## 📚 Resources
### Papers
- MapTR (ICLR 2023): https://arxiv.org/abs/2208.14437
- MapTRv2 (IJCV 2024): https://arxiv.org/abs/2308.05736
### Code
- GitHub: https://github.com/hustvl/MapTR
- Local path: /workspace/MapTR
### Key Files
- maptr_head.py: 35KB, core head implementation
- map_loss.py: 26KB, loss functions
- decoder.py: 3KB, Transformer decoder
## 🎯 Summary
### MapTR's Core Value
1. **Query-based vector prediction**: suits a varying number of map elements
2. **Point-set modeling**: flexibly represents arbitrary shapes
3. **End-to-end**: straight from images to vectors
4. **Strong performance**: 50-73% mAP
### Advantages of Integrating into BEVFusion
1. **Reuses BEV features**: no backbone retraining needed
2. **Modular**: only the MapTRHead parts are required
3. **Shared data**: uses the same nuScenes data
4. **Fast training**: only ~15M new parameters to train
### Expected Results
- Joint three-task performance:
  - Detection mAP: 64-66%
  - Segmentation mIoU: 55-58%
  - Vector-map mAP: 50-55%
- Training time: 2-3 days
---
**Next step**: begin implementing the MapTR integration?