Official implementation of
“AirV2X: Unified Air–Ground/Vehicle‑to‑Everything Collaboration for Perception”
Download AirV2X‑Perception from Hugging Face and extract it to any location:
```bash
mkdir dataset
cd dataset  # use another directory to avoid a naming conflict
conda install -c conda-forge git-lfs
git lfs install --skip-smudge
git clone https://huggingface.co/datasets/xiangbog/AirV2X-Perception
cd AirV2X-Perception
git lfs pull
# git lfs pull --include "path/to/folder"  # to download only part of the dataset
```
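Because the clone uses `--skip-smudge`, any file not fetched by `git lfs pull` remains a small pointer stub rather than real data. Below is a minimal sketch (not part of the repository) that scans the cloned dataset for leftover LFS pointer stubs; the root path follows the commands above and may need adjusting.

```python
# check_lfs_pointers.py -- flag files that are still Git LFS pointer stubs.
# Minimal sketch, not part of the repository; adjust the root path as needed.
from pathlib import Path

LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"
root = Path("dataset/AirV2X-Perception")

stubs = []
for f in root.rglob("*"):
    if ".git" in f.parts:
        continue
    # LFS pointer files are tiny text files beginning with the LFS spec header.
    if f.is_file() and f.stat().st_size < 200:
        with open(f, "rb") as fh:
            if fh.read(len(LFS_HEADER)) == LFS_HEADER:
                stubs.append(f)

if stubs:
    print(f"{len(stubs)} file(s) are still LFS pointers; re-run `git lfs pull`.")
else:
    print("No LFS pointer stubs found; the download looks complete.")
```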
We also provide a mini batch of the dataset for quick testing and debugging.
Detailed instructions and environment specifications are in `doc/INSTALL.md`.
Single-GPU training:

```bash
python opencood/tools/train.py \
    -y /path/to/config_file.yaml
```
Example: train Where2Comm (LiDAR‑only)
```bash
python opencood/tools/train.py \
    -y opencood/hypes_yaml/airv2x/lidar/det/airv2x_intermediate_where2com.yaml
```
> [!TIP]
> Some models, such as V2X-ViT and CoBEVT, consume a large amount of VRAM. Enable mixed precision with `--amp` if you encounter OOM, but watch out for NaN/Inf instability.
```bash
python opencood/tools/train.py \
    -y opencood/hypes_yaml/airv2x/lidar/det/airv2x_intermediate_v2xvit.yaml \
    --amp
```
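The `--amp` flag turns on automatic mixed precision, presumably backed by PyTorch's `torch.cuda.amp`. The sketch below shows the standard `autocast`/`GradScaler` training step (a generic illustration, not the repository's actual training loop); the scaler skips optimizer steps whose gradients contain Inf/NaN, which is where the instability mentioned in the tip can surface.

```python
# Generic PyTorch AMP training step (illustration only, not the repo's actual loop).
import torch

scaler = torch.cuda.amp.GradScaler()

def amp_train_step(model, batch, target, criterion, optimizer):
    optimizer.zero_grad(set_to_none=True)
    with torch.cuda.amp.autocast():      # run the forward pass in mixed precision
        loss = criterion(model(batch), target)
    scaler.scale(loss).backward()        # scale the loss so fp16 grads don't underflow
    scaler.step(optimizer)               # the step is skipped if grads contain Inf/NaN
    scaler.update()                      # shrink/grow the loss scale accordingly
    return loss.detach()
```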
Multi-GPU training with `torchrun`:

```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun \
    --standalone --nproc_per_node=4 \
    opencood/tools/train.py \
    -y /path/to/config_file.yaml
```
Example: LiDAR‑only Where2Comm with 8 GPUs
```bash
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 torchrun \
    --standalone \
    --nproc_per_node=8 \
    opencood/tools/train.py \
    -y opencood/hypes_yaml/airv2x/lidar/det/airv2x_intermediate_where2com.yaml
```
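Under `torchrun`, each spawned process receives its rank and world size through environment variables. The snippet below is the usual PyTorch DistributedDataParallel setup pattern (a generic sketch under that assumption, not the repository's actual code): each process binds to one GPU, wraps the model in DDP, and shards the data with a `DistributedSampler`.

```python
# Generic torchrun + DistributedDataParallel setup (sketch, not the repo's code).
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler

def setup_distributed(model, dataset, batch_size):
    dist.init_process_group(backend="nccl")           # torchrun provides rank/world size
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)                  # one GPU per process
    model = DDP(model.cuda(local_rank), device_ids=[local_rank])
    sampler = DistributedSampler(dataset)              # disjoint data shard per rank
    loader = DataLoader(dataset, batch_size=batch_size, sampler=sampler)
    return model, loader, sampler                      # call sampler.set_epoch(e) each epoch
```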
These models were trained on 2 nodes × 1 GPU (batch size 1).
If you change the number of GPUs or batch size, adjust the learning rate accordingly.
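A common heuristic for that adjustment is the linear scaling rule: multiply the base learning rate by the ratio of the new effective batch size (GPUs × per-GPU batch size) to the original one. A small illustration (the base learning rate here is made up; read the actual value from your YAML config):

```python
# Linear learning-rate scaling heuristic (illustrative numbers only).
def scaled_lr(base_lr, base_gpus=2, base_batch_per_gpu=1, gpus=8, batch_per_gpu=1):
    base_effective = base_gpus * base_batch_per_gpu   # effective batch the LR was tuned for
    new_effective = gpus * batch_per_gpu              # effective batch you will train with
    return base_lr * new_effective / base_effective

# e.g. going from 2 GPUs x batch 1 to 8 GPUs x batch 1:
print(scaled_lr(2e-3))  # 2e-3 * (8*1)/(2*1) = 8e-3
```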
Run inference on a trained model:

```bash
python opencood/tools/inference_multi_scenario.py \
    --model_dir opencood/logs/airv2x_intermediate_where2comm/default__2025_07_10_09_17_28 \
    --eval_best_epoch \
    --save_vis
```
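If you have trained several runs under `opencood/logs`, a small hypothetical wrapper like the one below (the two-level directory layout is assumed from the example path above) can invoke the same inference script for each run:

```python
# Hypothetical helper: run the inference CLI shown above for every run directory
# under opencood/logs. The directory layout is an assumption; adapt as needed.
import subprocess
from pathlib import Path

for model_dir in sorted(Path("opencood/logs").glob("*/*")):
    if not model_dir.is_dir():
        continue
    subprocess.run(
        [
            "python", "opencood/tools/inference_multi_scenario.py",
            "--model_dir", str(model_dir),
            "--eval_best_epoch",
            "--save_vis",
        ],
        check=True,
    )
```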
| Modality | Model | AP@0.3 | AP@0.5 | AP@0.7 | Config & Checkpoint |
|---|---|---|---|---|---|
| LiDAR | When2Com | 0.1824 | 0.0787 | 0.0025 | Coming soon |
| | CoBEVT | 0.4922 | 0.4585 | 0.2582 | HF |
| | Where2Comm | 0.4366 | 0.4015 | 0.1538 | HF |
| | V2X-ViT | 0.4401 | 0.3821 | 0.1638 | HF |
> [!NOTE]
> The above checkpoints were trained for 50 epochs with `batch_size=2`, so their numbers may differ slightly from the paper.
Additional Camera‑only and Multi‑modal checkpoints are on the way.
Monitor training with TensorBoard:

```bash
tensorboard --logdir opencood/logs --port 10000 --bind_all
```
If you find this work useful, please cite:

```bibtex
@article{gao2025airv2x,
  title   = {AirV2X: Unified Air--Ground/Vehicle-to-Everything Collaboration for Perception},
  author  = {Gao, Xiangbo and Tu, Zhengzhong and others},
  journal = {arXiv preprint arXiv:2506.19283},
  year    = {2025}
}
```
We will continuously update this repository with code, checkpoints, and documentation.
Feel free to open issues or pull requests — contributions are welcome! 🚀