===========================================
Phase 4A Stage 1 - 8-GPU Training Configuration Update
===========================================

Updated: 2025-11-01 12:20 UTC
Change: upgraded from 4-GPU to 8-GPU training

【Key Improvements】
✅ Training speed: 1.7× speedup (9.5 days vs 18 days)
✅ GPU configuration: 8×Tesla V100S-32GB
✅ Disk optimization: evaluation.interval: 1→5
✅ Output path: work_dir points to the /data partition

【New Documents】
1. Phase4A_Stage1_8GPU配置_20251101.md (11KB)
   - Complete 8-GPU training environment configuration
   - Detailed hardware, software, and directory configuration
   - Launch script and monitoring commands
   - Common problems and solutions

2. 8卡训练快速参考.md (1.2KB)
   - Quick start guide
   - Common monitoring commands
   - Key configuration parameters

【Updated Documents】
1. project/README.md
   - Training progress updated to reflect the 8-GPU run
   - Added an index of the 8-GPU configuration documents
   - Updated the schedule and monitoring commands

【Configuration Files】
- START_FROM_EPOCH1.sh: torchpack -np 4→8
- multitask_BEV2X_phase4a_stage1.yaml:
  * work_dir: /data/runs/phase4a_stage1
  * evaluation.interval: 1→5
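
For orientation, the two YAML changes listed above might look roughly like the fragment below. This is a sketch only: nesting interval under evaluation is an assumption read off the dotted key name, and the other keys of the file are not reproduced here.

```yaml
# Hypothetical excerpt of multitask_BEV2X_phase4a_stage1.yaml (layout assumed).
work_dir: /data/runs/phase4a_stage1   # outputs now go to the /data partition

evaluation:
  interval: 5                         # previously 1
```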

【Estimated Completion】
- Epoch 1: 2025-11-02 20:00
- Epoch 10: 2025-11-10 20:00

【How to View】
cd /workspace/bevfusion/project
cat docs/Phase4A_Stage1_8GPU配置_20251101.md
cat docs/8卡训练快速参考.md
cat README.md

===========================================