Configuration

mapcv is driven by a single YAML config file. Use mapcv init to scaffold one with annotations, then edit it according to your needs before running mapcv generate.


Defines the geographic area of interest in WGS-84 coordinates.

| Field | Type | Description |
| --- | --- | --- |
| west | float | Western boundary longitude |
| south | float | Southern boundary latitude |
| east | float | Eastern boundary longitude |
| north | float | Northern boundary latitude |
| zoom | int | XYZ tile zoom level |
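
To get a feel for how the bounding box and zoom level determine download size, the sketch below counts the XYZ tiles that cover a box. It uses the standard Web Mercator tile formula, not mapcv's own code; the function names `lonlat_to_tile` and `tile_count` are hypothetical, and mapcv may handle boundaries differently.

```python
import math


def lonlat_to_tile(lon: float, lat: float, zoom: int) -> tuple:
    """Map a WGS-84 lon/lat to XYZ tile indices (standard Web Mercator scheme)."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom
    x = math.floor((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = math.floor((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so lon = 180 or extreme latitudes stay inside the tile grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_count(west: float, south: float, east: float, north: float, zoom: int) -> int:
    """Count tiles covering the box (assumes the box does not cross the antimeridian)."""
    x_min, y_min = lonlat_to_tile(west, north, zoom)  # top-left tile
    x_max, y_max = lonlat_to_tile(east, south, zoom)  # bottom-right tile
    return (x_max - x_min + 1) * (y_max - y_min + 1)


if __name__ == "__main__":
    # A near-global box at zoom 1 is covered by a 2 x 2 tile grid.
    print(tile_count(-180.0, -85.0, 180.0, 85.0, 1))
```

Each extra zoom level roughly quadruples the tile count for the same box, so it is worth estimating the count before choosing zoom.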

Controls how tiles are fetched.

Note: Exactly one of source or url_template is required.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| source | str | - | Built-in source name. Use instead of url_template. |
| url_template | str | - | Custom XYZ URL with {z}, {x}, {y} placeholders. Use instead of source. |
| strip_rows | int | 4 | Tile rows per horizontal strip. Lower values reduce peak memory. |
| max_connections | int | 16 | Number of parallel tile fetches. High values may trigger server rate limits. |
| policy | str | lenient | strict: abort on any failure. lenient: skip failed tiles unless max_failed_ratio is exceeded. ignore: skip all failures silently. |
| max_failed_ratio | float | 0.05 | Fraction of failed tiles above which the run aborts. Only applies when policy: lenient. |
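
The interaction between url_template, policy, and max_failed_ratio can be sketched as follows. This is an illustration of the documented behaviour, not mapcv's implementation: the helper names `expand_url` and `should_abort` and the example URL are hypothetical, and the sketch assumes "exceeded" means strictly greater than the threshold.

```python
def expand_url(template: str, z: int, x: int, y: int) -> str:
    """Fill the {z}, {x}, {y} placeholders of an XYZ URL template."""
    return template.format(z=z, x=x, y=y)


def should_abort(policy: str, failed: int, total: int,
                 max_failed_ratio: float = 0.05) -> bool:
    """Decide whether a run should abort, given the failure count so far."""
    if policy == "strict":
        return failed > 0  # any failure aborts
    if policy == "lenient":
        # Assumption: abort only when the ratio is strictly above the threshold.
        return total > 0 and failed / total > max_failed_ratio
    if policy == "ignore":
        return False  # failures are skipped silently
    raise ValueError(f"unknown policy: {policy!r}")


if __name__ == "__main__":
    # Hypothetical tile server, used only to show placeholder expansion.
    print(expand_url("https://tiles.example.com/{z}/{x}/{y}.png", 12, 654, 1583))
    print(should_abort("lenient", failed=6, total=100))
```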

Built-in sources

| Name | Description |
| --- | --- |
| google_satellite | Google Satellite |
| esri_satellite | Esri World Imagery |
| esri_topo | Esri World Topographic Map |
| esri_street | Esri World Street Map |
| osm | OpenStreetMap |
| cartodb_positron | CartoDB Positron (light) |
| cartodb_dark_matter | CartoDB Dark Matter |

Omit this section entirely for image-only (unlabeled) datasets.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| path | str | - | Path to a .kml or .geojson annotation file. |
| label_field | str | null | Property name used as the class label. Class IDs are assigned in encounter order, starting at 1. null assigns class 1 to all polygons. For multiclass datasets, set this to the GeoJSON/KML property that holds the class name, for example label_field: "class" when features carry "class": "residential", "class": "water", etc. |
| all_touched | bool | false | Include every pixel the polygon boundary touches, not only pixels whose centres fall inside the polygon. Useful for narrow polygons. |
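
The encounter-order rule for label_field can be illustrated with a short sketch. The function `assign_class_ids` is hypothetical and assumes GeoJSON-style features with a `properties` mapping; it only mirrors the behaviour described in the table above.

```python
from typing import Optional


def assign_class_ids(features: list, label_field: Optional[str]) -> list:
    """Return one integer class ID per feature, numbered from 1 in encounter order."""
    if label_field is None:
        # No label field configured: every polygon gets class 1.
        return [1] * len(features)
    ids = {}  # class name -> assigned ID
    result = []
    for feature in features:
        name = feature["properties"][label_field]
        if name not in ids:
            ids[name] = len(ids) + 1  # first unseen name gets the next ID
        result.append(ids[name])
    return result


if __name__ == "__main__":
    features = [
        {"properties": {"class": "residential"}},
        {"properties": {"class": "water"}},
        {"properties": {"class": "residential"}},
    ]
    print(assign_class_ids(features, "class"))
    print(assign_class_ids(features, None))
```

Because IDs follow encounter order, reordering features in the annotation file changes which ID each class receives.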

Controls how fixed-size patches are extracted from the stitched image.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| patch_size | int | - | Side length of each square patch in pixels. |
| stride | int | 0 | Step between patch centres in pixels. 0 = same as patch_size. Nonzero values below patch_size produce overlapping patches. |
| mode | str | grid | grid: scan row by row. random: draw positions randomly. |
| edge_strategy | str | pad | pad: fill partial edge patches to full size. drop: discard them. shift: slide the last patch inward to fit. |
| pad_mode | str | zero | zero = black fill, reflect = mirror border. Only applies when edge_strategy: pad. |
| max_empty_ratio | float | 1.0 | Drop patches whose fraction of black pixels exceeds this value. 1.0 keeps all. |
| min_label_ratio | float | 0.0 | Drop patches whose fraction of labeled pixels is below this value. 0.0 keeps all. |
| random_count | int | 100 | Number of patches to draw. Only used when mode: random. |
| random_seed | int | 42 | Random seed for reproducibility. Only used when mode: random. |
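
The sketch below shows one way stride and edge_strategy could determine patch positions along a single image axis in grid mode. It is an assumption about the behaviour described above, not mapcv's code; `patch_starts` is a hypothetical helper that returns the starting pixel offset of each patch.

```python
def patch_starts(length: int, patch_size: int, stride: int = 0,
                 edge_strategy: str = "pad") -> list:
    """Return patch start offsets along one axis of `length` pixels."""
    step = stride if stride > 0 else patch_size  # stride 0 means no overlap
    # Patches that fit entirely inside the image.
    starts = [s for s in range(0, length, step) if s + patch_size <= length]
    covered = starts[-1] + patch_size if starts else 0
    if covered < length:  # some pixels at the edge are not covered yet
        if edge_strategy == "pad":
            # The next patch overhangs the edge and would be padded to full size.
            starts.append(starts[-1] + step if starts else 0)
        elif edge_strategy == "shift":
            # Slide the last patch inward so it ends exactly at the edge.
            starts.append(max(length - patch_size, 0))
        elif edge_strategy != "drop":
            raise ValueError(f"unknown edge_strategy: {edge_strategy!r}")
    return starts


if __name__ == "__main__":
    for strategy in ("pad", "drop", "shift"):
        print(strategy, patch_starts(10, 4, edge_strategy=strategy))
```

For a 10-pixel axis and patch_size 4, pad yields a third patch that overhangs by two pixels, drop keeps only the two full patches, and shift adds a patch that overlaps its neighbour instead of padding.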

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| staging_dir | str | - | Directory where images, masks, and the manifest will be written. |
| image_format | str | png | png is lossless. jpg is smaller but lossy. |
| jpg_quality | int | 95 | JPEG quality from 1 (smallest) to 100 (best). Only used when image_format: jpg. |

Omit to skip splitting. When present, mapcv writes train.txt, val.txt, and test.txt into staging_dir/splits/.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| test_ratio | float | 0.20 | Fraction of all patches reserved for the test set. |
| val_ratio | float | 0.10 | Fraction of the patches remaining after the test set is removed that is used for validation. |
| labeled_ratios | list[float] | [0.10, 0.20, 0.30] | For each ratio, creates <pct>/labeled.txt and <pct>/unlabeled.txt inside staging_dir, where <pct> is the ratio as an integer percentage (e.g. 10, 20, 30). For semi-supervised workflows. |
| seed | int | 42 | Random seed for reproducible splits. |
| strategy | str | stratified | stratified: preserve class distribution across splits. random: split without balancing. |
| sample_limit | int | - | Cap on total patches used before splitting. Useful for large datasets. |
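
Because val_ratio applies to what is left after the test set is taken, the effective validation share is smaller than the number suggests. The arithmetic can be sketched as below; `split_sizes` is a hypothetical helper, and rounding to the nearest integer is an assumption, since the rounding rule mapcv uses is not stated here.

```python
def split_sizes(n_patches: int, test_ratio: float = 0.20,
                val_ratio: float = 0.10) -> dict:
    """Compute train/val/test sizes, with val_ratio applied after the test split."""
    n_test = round(n_patches * test_ratio)   # fraction of all patches
    remaining = n_patches - n_test
    n_val = round(remaining * val_ratio)     # fraction of the remainder
    n_train = remaining - n_val
    return {"train": n_train, "val": n_val, "test": n_test}


if __name__ == "__main__":
    # With the defaults, 1000 patches give 200 test, 80 val, and 720 train.
    print(split_sizes(1000))
```

With the default ratios, validation ends up as 8% of all patches (10% of the remaining 80%), not 10%.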