#!/bin/bash
# BEVFusion Docker environment check

export PATH=/opt/conda/bin:$PATH

echo "=========================================="
echo "BEVFusion Docker environment check"
echo "=========================================="
echo ""

PASS=0
FAIL=0

# Record a passed check.
# PASS=$((PASS + 1)) is used instead of ((PASS++)): the latter returns a
# non-zero status when the counter is 0, which would make the
# `cmd && check_pass ... || check_fail ...` chains below also report a failure.
check_pass() {
    echo "  ✓ $1"
    PASS=$((PASS + 1))
}

# Record a failed check.
check_fail() {
    echo "  ✗ $1"
    FAIL=$((FAIL + 1))
}

# 1. Python and PyTorch
echo "[1/10] Python and PyTorch"
if PYTHON_VERSION=$(python --version 2>&1); then
    check_pass "$PYTHON_VERSION"
else
    check_fail "Python not found"
fi
if PYTORCH_INFO=$(python -c "import torch; print(f'PyTorch {torch.__version__}, CUDA: {torch.cuda.is_available()}, GPUs: {torch.cuda.device_count()}')" 2>/dev/null); then
    check_pass "$PYTORCH_INFO"
else
    check_fail "PyTorch not installed"
fi

# 2. torchpack
echo ""
echo "[2/10] torchpack"
if python -c "import torchpack" 2>/dev/null; then
    check_pass "torchpack installed"
else
    check_fail "torchpack not installed"
fi

# 3. mmcv
echo ""
echo "[3/10] mmcv"
if MMCV_VERSION=$(python -c "import mmcv; print(mmcv.__version__)" 2>/dev/null); then
    check_pass "mmcv $MMCV_VERSION"
else
    check_fail "mmcv not installed"
fi

# 4. mmdet
echo ""
echo "[4/10] mmdetection"
if MMDET_VERSION=$(python -c "import mmdet; print(mmdet.__version__)" 2>/dev/null); then
    check_pass "mmdetection $MMDET_VERSION"
else
    check_fail "mmdetection not installed"
fi

# 5. nuscenes-devkit
echo ""
echo "[5/10] nuscenes-devkit"
if python -c "from nuscenes import NuScenes" 2>/dev/null; then
    check_pass "nuscenes-devkit installed"
else
    check_fail "nuscenes-devkit not installed"
fi

# 6. Other dependencies
echo ""
echo "[6/10] Other Python dependencies"
python -c "import tqdm" 2>/dev/null && check_pass "tqdm" || check_fail "tqdm"
python -c "from PIL import Image" 2>/dev/null && check_pass "Pillow" || check_fail "Pillow"
python -c "import numpy" 2>/dev/null && check_pass "numpy" || check_fail "numpy"

# 7. Custom operators
echo ""
echo "[7/10] Custom CUDA operators"
python -c "from mmdet3d.ops import bev_pool_v2" 2>/dev/null && check_pass "BEV Pool" || check_fail "BEV Pool (requires: python setup.py develop)"
python -c "from mmdet3d.ops import Voxelization" 2>/dev/null && check_pass "Voxelization" || check_fail "Voxelization"
python -c "from mmdet3d.ops.spconv import SparseConv3d" 2>/dev/null && check_pass "Sparse Conv" || check_fail "Sparse Conv"

# 8. GPU info
echo ""
echo "[8/10] GPU info"
if command -v nvidia-smi >/dev/null 2>&1; then
    # Read via process substitution rather than a pipe so the loop runs in
    # the current shell and the PASS counter updates are not lost.
    while IFS= read -r line; do
        check_pass "GPU $line"
    done < <(nvidia-smi --query-gpu=index,name,memory.total,memory.free --format=csv,noheader)
else
    check_fail "nvidia-smi not found"
fi

# 9. Dataset
echo ""
echo "[9/10] nuScenes dataset"
if [ -d "data/nuscenes" ]; then
    check_pass "Dataset directory exists"
    [ -f "data/nuscenes/nuscenes_infos_train.pkl" ] && check_pass "Training info file" || check_fail "Training info file missing"
    [ -f "data/nuscenes/nuscenes_infos_val.pkl" ] && check_pass "Validation info file" || check_fail "Validation info file missing"
else
    check_fail "Dataset directory does not exist"
fi

# 10. Pretrained models
echo ""
echo "[10/10] Pretrained models"
if [ -d "pretrained" ]; then
    check_pass "Pretrained directory exists"
    [ -f "pretrained/swint-nuimages-pretrained.pth" ] && check_pass "SwinT pretrained model" || check_fail "SwinT pretrained model missing"
    [ -f "pretrained/lidar-only-det.pth" ] && check_pass "LiDAR detection model" || check_fail "LiDAR detection model missing"
else
    check_fail "Pretrained directory does not exist"
fi

# Summary
echo ""
echo "=========================================="
echo "Check complete: $PASS passed, $FAIL failed"
echo "=========================================="

if [ "$FAIL" -eq 0 ]; then
    echo ""
    echo "✅ Environment is complete; ready to start training!"
    echo ""
    echo "Recommended training commands:"
    echo ""
    echo "1. 3D detection training (8 GPUs):"
    echo "   torchpack dist-run -np 8 python tools/train.py \\"
    echo "     configs/nuscenes/det/transfusion/secfpn/camera+lidar/swint_v0p075/convfuser.yaml \\"
    echo "     --model.encoders.camera.backbone.init_cfg.checkpoint pretrained/swint-nuimages-pretrained.pth \\"
    echo "     --load_from pretrained/lidar-only-det.pth"
    echo ""
    echo "2. BEV segmentation training (8 GPUs):"
    echo "   torchpack dist-run -np 8 python tools/train.py \\"
    echo "     configs/nuscenes/seg/fusion-bev256d2-lss.yaml \\"
    echo "     --model.encoders.camera.backbone.init_cfg.checkpoint pretrained/swint-nuimages-pretrained.pth"
    echo ""
    echo "3. Multi-task training (detection + segmentation, 8 GPUs):"
    echo "   torchpack dist-run -np 8 python tools/train.py \\"
    echo "     configs/nuscenes/multitask/fusion-det-seg-swint.yaml \\"
    echo "     --model.encoders.camera.backbone.init_cfg.checkpoint pretrained/swint-nuimages-pretrained.pth \\"
    echo "     --load_from pretrained/lidar-only-det.pth"
    exit 0
else
    echo ""
    echo "⚠️  Found $FAIL issue(s) that need to be resolved"
    exit 1
fi