================================================================================
BEVFusion Phase 4A Stage 1 - Environment Configuration Summary
================================================================================
【Training Configuration】
Training stage: Phase 4A Stage 1
Start time: 2025-11-01 09:15 UTC
GPU configuration: 8×Tesla V100S-32GB (100% utilization)
Resolution: 600×600 GT (0.167 m/pixel)
Batch: 1/GPU × 8 = 8
Epochs: 10
Estimate: 9.5 days to finish (vs. 18 days on 4 GPUs, 1.7× speedup)
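The figures in this section can be cross-checked with a little arithmetic. The sketch below is illustrative only: it takes the values listed above and derives the global batch size, the ground-truth extent per side in metres, and the expected finish time. The variable names are hypothetical and are not taken from the training code.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the training configuration above.
START = datetime(2025, 11, 1, 9, 15, tzinfo=timezone.utc)
NUM_GPUS = 8
BATCH_PER_GPU = 1
GT_PIXELS = 600            # ground-truth grid is 600 x 600
METRES_PER_PIXEL = 0.167
ESTIMATED_DAYS = 9.5

# Derived quantities.
global_batch = NUM_GPUS * BATCH_PER_GPU
gt_extent_m = round(GT_PIXELS * METRES_PER_PIXEL, 1)
expected_finish = START + timedelta(days=ESTIMATED_DAYS)

print(f"Global batch size: {global_batch}")
print(f"GT extent per side: {gt_extent_m} m")
print(f"Expected finish: {expected_finish:%Y-%m-%d %H:%M} UTC")
```

Under the 9.5-day estimate the run would finish on 2025-11-10 at 21:15 UTC, and the 600-pixel grid at 0.167 m/pixel spans about 100.2 m per side.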
【Key Paths】
Config: configs/nuscenes/det/transfusion/secfpn/camera+lidar/swint_v0p075/multitask_BEV2X_phase4a_stage1.yaml
Script: START_FROM_EPOCH1.sh
Output: /data/runs/phase4a_stage1/
Logs: phase4a_stage1_new_*.log
Pretrained: /data/pretrained/swint-nuimages-pretrained.pth
Initial weights: /data/runs/phase4a_stage1/epoch_1.pth
【Environment】
Python: 3.8
PyTorch: 1.9.1+cu111
CUDA: 12.8 (driver) / 11.1 (runtime)
mmcv: 1.4.0
torchpack: distributed training framework
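Before launching a long run it can help to confirm that the installed packages match the versions listed above. The following is a minimal sketch, assuming that `torch` and `mmcv` each expose a `__version__` attribute; the helper names `installed_versions` and `find_mismatches` are hypothetical and not part of the project.

```python
import importlib
import sys

# Expected versions, copied from the environment list above.
EXPECTED = {
    "python": "3.8",
    "torch": "1.9.1+cu111",
    "mmcv": "1.4.0",
}


def installed_versions():
    """Collect the versions actually present; None means the import failed."""
    found = {"python": f"{sys.version_info.major}.{sys.version_info.minor}"}
    for name in ("torch", "mmcv"):
        try:
            module = importlib.import_module(name)
            found[name] = getattr(module, "__version__", None)
        except ImportError:
            found[name] = None
    return found


def find_mismatches(expected, found):
    """Return {name: (expected, found)} for every entry that differs."""
    return {
        name: (want, found.get(name))
        for name, want in expected.items()
        if found.get(name) != want
    }


if __name__ == "__main__":
    mismatches = find_mismatches(EXPECTED, installed_versions())
    for name, (want, got) in mismatches.items():
        print(f"{name}: expected {want}, found {got}")
```

The comparison is kept separate from the collection step so that it can be exercised without the GPU stack installed.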
【Document Index】
1. project/docs/Phase4A_Stage1_8GPU配置_20251101.md - full 8-GPU configuration
2. project/docs/8卡训练快速参考.md - quick reference
3. project/README.md - project overview
4. CURRENT_8GPU_CONFIG.md - current configuration snapshot
【Quick Commands】
Start: nohup bash START_FROM_EPOCH1.sh > /tmp/train_8gpu.log 2>&1 &
Monitor: nvidia-smi
Progress: tail -f $(ls -t phase4a_stage1_new_*.log | head -1) | grep Epoch
Stop: pkill -f "train.py"
Cleanup: bash cleanup_eval_hook.sh
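For scripted checks, a one-shot version of the progress command may be more convenient than a following `tail`. The sketch below picks the most recently modified log matching the pattern above and returns the lines containing "Epoch". It assumes nothing about the log line format beyond that substring, and the helper names `latest_log` and `epoch_lines` are hypothetical.

```python
import glob
import os


def latest_log(pattern="phase4a_stage1_new_*.log"):
    """Return the most recently modified file matching pattern, or None."""
    paths = glob.glob(pattern)
    return max(paths, key=os.path.getmtime) if paths else None


def epoch_lines(path, keyword="Epoch"):
    """Return the lines of path that contain keyword, similar to grep."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in handle if keyword in line]


if __name__ == "__main__":
    log = latest_log()
    if log is None:
        print("No matching log file found")
    else:
        # Show only the five most recent matching lines.
        for line in epoch_lines(log)[-5:]:
            print(line)
```

Unlike `tail -f`, this reads the file once and exits, so it suits cron jobs or status scripts rather than live monitoring.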
【Performance Targets】
Epoch 1 (11/2): Stop Line 0.30+, Divider 0.22+, mIoU 0.43+
Epoch 10 (11/10): Stop Line 0.35+, Divider 0.28+, mIoU 0.48+
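The targets above can be encoded as thresholds and compared against evaluation results. This is a minimal sketch, not the project's evaluation code: the dictionary keys (`stop_line`, `divider`, `miou`) and the function `unmet_targets` are hypothetical, and the real evaluation output may use different names.

```python
# Minimum targets copied from the list above. The keys are hypothetical and
# may not match the metric names in the actual evaluation output.
TARGETS = {
    1: {"stop_line": 0.30, "divider": 0.22, "miou": 0.43},
    10: {"stop_line": 0.35, "divider": 0.28, "miou": 0.48},
}


def unmet_targets(epoch, metrics, targets=TARGETS):
    """Return {metric: (target, actual)} for metrics missing or below target."""
    return {
        name: (goal, metrics.get(name))
        for name, goal in targets[epoch].items()
        if metrics.get(name) is None or metrics[name] < goal
    }
```

An empty result means every target for that epoch was met. A metric that is absent from the results is reported with an actual value of `None`.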
================================================================================
Generated: 2025-11-01 12:20 UTC
================================================================================