Introduction
mapcv turns a bounding box and a set of polygon labels into a ready-to-train satellite imagery dataset for semantic segmentation. You give it a geographic region, a tile source, and optionally a KML or GeoJSON annotation file. It fetches the tiles, rasterizes your labels onto the image grid, extracts fixed-size patches, and writes everything to disk.
To ensure a lightweight footprint and easy installation, the tool is GDAL-free. Tile stitching, rasterization, and patch sampling are powered by Rust, while the Python layer handles orchestration, configuration, and access via the CLI or API.
Key concepts
- Bounding box: The target area in WGS-84 (longitude/latitude). mapcv expands these coordinates outward to align with the nearest tile grid so no downloaded tiles are partially cut off.
- Zoom level: Controls resolution. Higher zoom means more tiles and more detail. Each zoom step roughly quadruples the tile count, so zoom 19 has about 16x as many tiles as zoom 17 for the same area.
- Strips: The tile grid is divided into horizontal bands of strip_rows tile rows. Strips are processed and written one at a time to cap memory use.
- Patches: Fixed-size square crops extracted from the stitched image. Each patch has a matching mask file of the same name.
- Class IDs: Mask pixels are 0 (background) or 1..255 (class). When label_field is null, all polygons get class_id = 1. When a field name is given, IDs are assigned by encounter order starting at 1.
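The bounding-box and zoom-level concepts above can be illustrated with a short Python sketch. It assumes the tile source follows the common XYZ ("slippy map") Web Mercator scheme, which this page does not state explicitly. The function names (lonlat_to_tile, tile_to_lonlat, snap_bbox, tile_count) are hypothetical helpers written for this illustration and are not part of the mapcv API.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Fractional XYZ tile coordinates of a WGS-84 point (assumes Web Mercator)."""
    n = 2 ** zoom
    x = (lon + 180.0) / 360.0 * n
    y = (1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n
    return x, y


def tile_to_lonlat(x, y, zoom):
    """Longitude/latitude of the north-west corner of tile (x, y)."""
    n = 2 ** zoom
    lon = x / n * 360.0 - 180.0
    lat = math.degrees(math.atan(math.sinh(math.pi * (1.0 - 2.0 * y / n))))
    return lon, lat


def snap_bbox(west, south, east, north, zoom):
    """Expand a bbox outward to whole tiles.

    Returns the tile range (x0, y0, x1, y1), with x1 and y1 exclusive,
    and the snapped bbox as (west, south, east, north).
    """
    x0f, y0f = lonlat_to_tile(west, north, zoom)  # top-left corner
    x1f, y1f = lonlat_to_tile(east, south, zoom)  # bottom-right corner
    x0, y0 = math.floor(x0f), math.floor(y0f)
    x1, y1 = math.ceil(x1f), math.ceil(y1f)
    snapped_west, snapped_north = tile_to_lonlat(x0, y0, zoom)
    snapped_east, snapped_south = tile_to_lonlat(x1, y1, zoom)
    return (x0, y0, x1, y1), (snapped_west, snapped_south, snapped_east, snapped_north)


def tile_count(tile_range):
    """Number of tiles covered by a (x0, y0, x1, y1) range."""
    x0, y0, x1, y1 = tile_range
    return (x1 - x0) * (y1 - y0)
```

Under this scheme each zoom step doubles the grid in both directions, so the tile count for a fixed area grows by roughly a factor of 4 per level. Calling tile_count(snap_bbox(...)[0]) for the same box at two zoom levels gives a quick estimate of download size before running a full job.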
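The class ID rule can be sketched in the same way. The snippet below is an illustration of encounter-order assignment, not mapcv's actual implementation: assign_class_ids is a hypothetical helper, features are assumed to be GeoJSON-style dicts with a "properties" mapping, and raising an error past 255 classes is an assumption about how the 1..255 range might be enforced.

```python
def assign_class_ids(features, label_field=None):
    """Return (ids, mapping) for a list of GeoJSON-style feature dicts.

    ids[i] is the class ID of features[i]; 0 is reserved for background.
    mapping goes from label value to class ID and is empty when
    label_field is None.
    """
    if label_field is None:
        # No label field: every polygon belongs to class 1.
        return [1] * len(features), {}

    mapping = {}
    ids = []
    for feature in features:
        value = feature["properties"][label_field]
        if value not in mapping:
            if len(mapping) >= 255:
                # Assumption: class IDs must fit in 1..255.
                raise ValueError("more than 255 distinct class labels")
            # First time this label is seen: give it the next free ID.
            mapping[value] = len(mapping) + 1
        ids.append(mapping[value])
    return ids, mapping
```

For features labeled "road", "building", "road" in that order, this yields IDs 1, 2, 1. Because IDs depend on encounter order, reordering the features in the annotation file would change which label gets which ID.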
When to use mapcv
- You have polygon annotations (KML or GeoJSON) and want pixel-level segmentation masks.
- You want image-only (unlabeled) patches: just omit the labels section.
- You need a reproducible pipeline that goes from a config file to a ready-to-train dataset in one command.
Quick start guide
Get up and running with mapcv and generate your first dataset in three simple steps.
1. Install mapcv
Install the latest version of mapcv using pip.
    pip install mapcv
2. Configure your dataset
Generate a template configuration file and customize it to fit your needs:
    mapcv init my_dataset.yaml
Note: For details on all available settings, check out the Configuration Reference page.
3. Generate the dataset
Once your configuration file is ready, run the following command to start the generation process:
    mapcv generate my_dataset.yaml
That’s it! Your generated dataset will be saved to the output directory you specified in your configuration file.
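Since each patch has a matching mask file of the same name, a training script can pair them by file name. The sketch below is only an illustration: pair_patches is a hypothetical helper, and the two directories are passed in as arguments because the output layout is not described on this page. It matches on the file stem so that it works whether or not images and masks share an extension.

```python
from pathlib import Path


def pair_patches(image_dir, mask_dir):
    """Return sorted (image_path, mask_path) pairs that share a file stem.

    Images without a matching mask (for example, unlabeled patches)
    are skipped.
    """
    masks = {p.stem: p for p in Path(mask_dir).iterdir() if p.is_file()}
    pairs = []
    for image in sorted(Path(image_dir).iterdir()):
        if image.is_file() and image.stem in masks:
            pairs.append((image, masks[image.stem]))
    return pairs
```

The resulting list can be fed to whatever dataset class your training framework expects.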
Citation
If you use mapcv in your research, please consider citing it.
(Citation information will be added after publication.)