Comprehensive NAIP-derived dataset for aircraft detection use cases using machine learning.
Important
This dataset is currently under construction. Contributors are welcome.
NAIP (National Agriculture Imagery Program) is a USA-wide, high-resolution aerial imagery dataset with excellent image quality and detail. NAIP imagery included in this dataset has a ground sample distance (GSD) of 0.6 m or finer and is free from clouds or other effects. Many images in this dataset have a GSD of 0.3 m. All NAIP imagery in this dataset has red, green, blue, and NIR bands.
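To give a feel for what these resolutions mean in practice, here is a minimal sketch that converts a real-world length into an approximate pixel count at a given GSD. The function name `meters_to_pixels` and the 11 m example length are illustrative assumptions, not part of the dataset or its tooling.

```python
def meters_to_pixels(length_m: float, gsd_m: float) -> float:
    """Approximate number of pixels spanned by an object of length_m at a given GSD.

    gsd_m is the ground sample distance in meters per pixel.
    """
    if gsd_m <= 0:
        raise ValueError("GSD must be positive")
    return length_m / gsd_m


if __name__ == "__main__":
    # Hypothetical 11 m wingspan, evaluated at the two GSDs mentioned above.
    for gsd in (0.6, 0.3):
        pixels = meters_to_pixels(11.0, gsd)
        print(f"GSD {gsd} m: an 11 m wingspan spans about {pixels:.1f} px")
```

Halving the GSD doubles the number of pixels across an aircraft, which is why the 0.3 m images are especially useful for telling aircraft types apart.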
Current public datasets for aircraft detection are coarse and lack detailed labels, which makes it difficult to detect the type of aircraft.
Because NAIP covers the United States so extensively, it is a very rich source of imagery across many different environments.
At the end of this project I aim to have the following:
- First-pass labeling via a trained object detection model
- Automated releases and PR linting/input data validation
- The dataset up on Hugging Face with an example model available
- Notebooks available on how to download the imagery and use it (model training/inference)
- GitHub Pages for the project
Stretch Goals:
- Add Albumentations augmentations and other modifications to make the imagery more like satellite imagery
- Build more world-wide datasets like this
Each NAIP image bounding box contains the following features:
- category (int): Label class, see Label Classes
- obstructed (bool): True if part of the aircraft is obstructed by buildings/hangars/tile edges
- notes (str): Any notes about the aircraft; usually empty.
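As a rough illustration of the fields listed above, a label record could be modeled and type-checked like this. This is only a sketch: the `AircraftLabel` class and `parse_label` helper are hypothetical names, and the assumption that labels arrive as plain dictionaries is mine, not something the dataset guarantees.

```python
from dataclasses import dataclass


@dataclass
class AircraftLabel:
    """Hypothetical in-memory form of one bounding box's attributes."""

    category: int      # label class id
    obstructed: bool   # True if part of the aircraft is hidden
    notes: str = ""    # free-form notes, usually empty

    def __post_init__(self) -> None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(self.category, bool) or not isinstance(self.category, int):
            raise TypeError("category must be an int")
        if not isinstance(self.obstructed, bool):
            raise TypeError("obstructed must be a bool")
        if not isinstance(self.notes, str):
            raise TypeError("notes must be a str")


def parse_label(record: dict) -> AircraftLabel:
    """Build an AircraftLabel from a dict, treating missing notes as empty."""
    return AircraftLabel(
        category=record["category"],
        obstructed=record["obstructed"],
        notes=record.get("notes", ""),
    )


if __name__ == "__main__":
    print(parse_label({"category": 2, "obstructed": False}))
```

A check along these lines could also serve as a starting point for the input data validation mentioned in the project goals.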
These label classes are a rough first pass and definitely need continued massaging, improvement, and definition.
Please help me! I think I've already given myself carpal tunnel from hand labeling what we have so far. Contributing guide coming soon.
This work is licensed under a Creative Commons Attribution 4.0 International License.