Improve WDPA data download / access #8

@sjpfenninger

@irm-codebase in PR #1:

@sjpfenninger while digging around in the integrated module workflow, I found a way to avoid forcing users to download the WDPA dataset on their own.

The following curl command should always download the latest version of that dataset:

# Full world polygons to GeoJSON.
# Comments sit up here because a comment after a trailing "\" breaks the line continuation.
#   where=1=1    ask for all records
#   outFields=*  ask for all columns
#   outSR=4326   WGS84 (can be another EPSG code)
#   f=geojson    GeoJSON, convert to GeoParquet later
curl -G "https://data-gis.unep-wcmc.org/server/rest/services/ProtectedSites/The_World_Database_of_Protected_Areas/FeatureServer/1/query" \
     --data-urlencode "where=1=1" \
     --data-urlencode "outFields=*" \
     --data-urlencode "outSR=4326" \
     --data-urlencode "f=geojson" \
     -o wdpa_poly_latest.geojson

This results in a ~180 MB download that completes rather quickly.

Another alternative is the ogr2ogr command in GDAL (not tested), which can output GeoParquet directly:

# Polygons (layer 1) -> GeoParquet
ogr2ogr -f Parquet wdpa_poly_latest.parquet \
  "https://data-gis.unep-wcmc.org/server/rest/services/ProtectedSites/The_World_Database_of_Protected_Areas/FeatureServer/1" \
  -t_srs EPSG:4326 \
  -lco COMPRESSION=SNAPPY -lco GEOMETRY_ENCODING=WKB

This may warrant further investigation, but this API seems to return at most 2000 records per request. See https://data-gis.unep-wcmc.org/server/rest/services/ProtectedSites/The_World_Database_of_Protected_Areas/FeatureServer, which states "MaxRecordCount: 2000". That would explain why the download here is ~10x smaller than the manual download of the full dataset.
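
If the record limit is indeed the cause, one possible workaround is to page through the layer. The sketch below is untested against the live service. It assumes the layer honours the standard ArcGIS REST `resultOffset` and `resultRecordCount` query parameters, which depends on the service advertising pagination support. The function names `fetch_page` and `fetch_all` are made up for illustration; the URL and the other query parameters are the ones from the curl command above.

```python
import json
import urllib.parse
import urllib.request

# Query endpoint taken from the curl command above.
QUERY_URL = (
    "https://data-gis.unep-wcmc.org/server/rest/services/ProtectedSites/"
    "The_World_Database_of_Protected_Areas/FeatureServer/1/query"
)


def fetch_page(offset, page_size, url=QUERY_URL):
    """Request one page of features as GeoJSON (performs a network call)."""
    params = {
        "where": "1=1",      # all records
        "outFields": "*",    # all columns
        "outSR": "4326",     # WGS84
        "f": "geojson",
        # Assumption: the layer supports pagination through these parameters.
        "resultOffset": offset,
        "resultRecordCount": page_size,
    }
    full_url = f"{url}?{urllib.parse.urlencode(params)}"
    with urllib.request.urlopen(full_url) as response:
        return json.load(response)


def fetch_all(fetch=fetch_page, page_size=2000):
    """Collect features page by page until the server returns an empty page."""
    features = []
    offset = 0
    while True:
        batch = fetch(offset, page_size).get("features", [])
        if not batch:
            break
        features.extend(batch)
        # Advance by what was actually received, in case the server
        # returns fewer records than requested.
        offset += len(batch)
    return {"type": "FeatureCollection", "features": features}
```

Calling `fetch_all()` and writing the result with `json.dump` would produce a single GeoJSON file. The `fetch` argument is injectable so the loop can be checked without network access. Stopping on an empty page costs one extra request, but it avoids relying on how the server signals that more records remain.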
