GeoAI Arctic Mapping Challenge Dataset
The GeoAI Arctic Mapping Challenge dataset builds on Yang et al. (2023) and targets the detection and mapping of retrogressive thaw slumps (RTS), which are landscape disturbances caused by permafrost thaw. For this competition, we extended and reformatted the dataset into an instance segmentation benchmark, so participants can train models that delineate individual RTS features across diverse Arctic regions.
Why it matters: RTS are sensitive indicators of permafrost thaw, which releases greenhouse gases and alters Arctic landscapes. By leveraging AI, we aim to accelerate RTS detection and improve understanding of climate-driven change.
Geographic Coverage & Study Sites
The dataset spans 7 Arctic subregions, including:
- Canada: Herschel Island, Horton Delta, Tuktoyaktuk Peninsula, Banks Island
- Russia: Yamal and Gydan peninsulas, Lena River, Kolguev Island
Figure 1. Spatial coverage of the GeoAI Arctic RTS dataset. The dataset includes 7 Arctic subregions across Canada and Russia, representing diverse geomorphic and climatic conditions (Li et al., 2025).
Data Sources & Multimodal Inputs
This dataset integrates multi-source satellite and geospatial data:
Data Type | Source | Resolution | Band Names | Purpose in Task |
---|---|---|---|---|
RGB Imagery | Maxar | 4 m | maxarR, maxarG, maxarB | High-resolution base imagery for visual recognition |
Multi-spectral | Sentinel-2 | 10 m | NDVI, NDWI, NIR | Vegetation & water indices for spectral feature learning |
Elevation | ArcticDEM | 2 m | relative elevation, shaded relief | Topographic context to improve RTS boundary detection |
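The source does not state how the NDVI and NDWI bands were computed. As a reading aid for interpreting their values, the sketch below uses the common textbook definitions: NDVI = (NIR - Red) / (NIR + Red) and the McFeeters form of NDWI = (Green - NIR) / (Green + NIR). The function names and the zero-denominator convention are assumptions made for this example, not part of the dataset tooling.

```python
def normalized_difference(a: float, b: float) -> float:
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    denom = a + b
    return 0.0 if denom == 0 else (a - b) / denom


def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index; higher values suggest denser vegetation."""
    return normalized_difference(nir, red)


def ndwi(green: float, nir: float) -> float:
    """McFeeters NDWI (assumed variant); higher values suggest open water."""
    return normalized_difference(green, nir)


# Example reflectances for one pixel (made-up values for illustration).
vegetated_pixel = ndvi(nir=0.5, red=0.1)
empty_pixel = ndwi(green=0.0, nir=0.0)
```

Both indices fall in the range [-1, 1], which is worth keeping in mind when normalizing them alongside the RGB and elevation bands.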
Annotation Strategy & Task Setup
Yang et al. (2023) originally provided semantic segmentation masks: binary labels indicating RTS vs. non-RTS regions.
For this challenge, we converted the dataset to instance segmentation format, so each RTS feature is labeled individually.
Satellite Image (RGB) | Semantic Mask (Original) | Instance Mask (This Challenge) |
---|---|---|
![]() | ![]() | ![]() |
![]() | ![]() | ![]() |
Figure 2. Conversion from semantic to instance segmentation masks. Original semantic masks from Yang et al. (2023) labeled RTS vs. non-RTS, while the challenge dataset uses instance-level masks, enabling finer-grained evaluation and model learning.
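The source does not describe how the semantic masks were converted to instance masks. One common approach, shown here only as an illustrative sketch, is connected-component labeling: every group of touching RTS pixels receives its own integer ID. The function `label_instances` and the toy mask are hypothetical, and the choice of 8-connectivity is an assumption.

```python
from collections import deque


def label_instances(mask):
    """Give each 8-connected foreground blob a distinct positive integer.

    mask: list of rows containing 0 (non-RTS) or 1 (RTS).
    Returns (labels, count), where labels has the same shape as mask.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    labels = [[0] * width for _ in range(height)]
    count = 0
    for r in range(height):
        for c in range(width):
            if mask[r][c] and labels[r][c] == 0:
                # Found an unlabeled RTS pixel: start a new instance and flood-fill it.
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < height and 0 <= nx < width
                                    and mask[ny][nx] and labels[ny][nx] == 0):
                                labels[ny][nx] = count
                                queue.append((ny, nx))
    return labels, count


# Toy semantic mask with three separate blobs.
semantic = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 1, 0, 0, 0],
]
instance, n_instances = label_instances(semantic)
```

A limitation of this approach is that slumps whose masks touch are merged into a single instance, so a manually curated conversion could differ from what this sketch produces.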
Key Dataset Statistics
Property | Description |
---|---|
Total Regions | 7 Arctic subregions |
Total Images | 756 train + 138 test |
Total RTS Instances | 2,110 |
Imagery Resolution | Maxar 4 m, Sentinel-2 10 m, ArcticDEM 2 m |
Spectral Bands | RGB + NDVI, NDWI, NIR + DEM |
Task | Instance segmentation |
Labels | Per-instance RTS masks |
File Formats | .npz images + .json COCO-style annotations |
Original Source | Yang et al. (2023), Remote Sensing of Environment |
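Because the annotations are described as COCO-style JSON, a first inspection step is often to group instances by image. The sketch below assumes the standard COCO keys (`images`, `annotations`, `image_id`, `area`, `file_name`); the challenge files may differ in detail, and the tile names and areas here are invented for the example.

```python
import json
from collections import defaultdict

# Tiny in-memory stand-in for an annotation file (hypothetical content).
coco_text = json.dumps({
    "images": [
        {"id": 1, "file_name": "tile_0001.npz", "width": 256, "height": 256},
        {"id": 2, "file_name": "tile_0002.npz", "width": 256, "height": 256},
    ],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "area": 120.0, "iscrowd": 0},
        {"id": 11, "image_id": 1, "category_id": 1, "area": 45.5, "iscrowd": 0},
    ],
    "categories": [{"id": 1, "name": "RTS"}],
})


def instances_per_image(coco: dict) -> dict:
    """Map each image file name to the list of its instance areas (in pixels)."""
    by_image = defaultdict(list)
    for ann in coco["annotations"]:
        by_image[ann["image_id"]].append(ann["area"])
    # Images without annotations are kept with an empty list.
    return {img["file_name"]: by_image.get(img["id"], []) for img in coco["images"]}


coco = json.loads(coco_text)
per_image = instances_per_image(coco)
total_instances = sum(len(areas) for areas in per_image.values())
```

With a real annotation file, `json.load` on the opened file would replace the in-memory string, and the per-image lists could feed simple size and count summaries.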
Additional figures (not reproduced here): RTS Size Distribution, RTS Coverage Distribution, RTS Count Distribution, RTS Shape Analysis, and Band Statistics.
Sample Visualizations
Explore additional examples to understand data variability across regions.
Description | Satellite Image (RGB) | Instance Mask (RTS Features) |
---|---|---|
Single large RTS | ![]() | ![]() |
Single small RTS | ![]() | ![]() |
Multiple RTS | ![]() | ![]() |
RTS near snow | ![]() | ![]() |
Figure 3. Examples of RGB imagery with RTS instance annotations. The examples illustrate the dataset's variability across scales and landscapes.