.. note:: This tutorial was generated from an IPython notebook that can be downloaded `here <../../notebooks/method.ipynb>`_. .. _method: Edge behavior and interiors =========================== This notebook illustrates the edge behavior and how Polygon interiors are treated. .. note:: From version 0.5 ``regionmask`` treats points on the region borders differently and also considers poygon interiors (holes), e.g. the Caspian Sea in ``natural_earth.land_110`` region. Preparation ----------- Import regionmask and check the version: .. code:: python import regionmask regionmask.__version__ .. parsed-literal:: '0.5.0+dev' Other imports .. code:: python import xarray as xr import numpy as np import cartopy.crs as ccrs import matplotlib.pyplot as plt from matplotlib import colors as mplc from shapely.geometry import Polygon Define some colors: .. code:: python cmap1 = mplc.ListedColormap(["#9ecae1"]) cmap2 = mplc.ListedColormap(["#fc9272"]) cmap3 = mplc.ListedColormap(["#cab2d6"]) Methods ------- Regionmask offers three methods to rasterize regions 1. ``rasterize``: fastest but only for equally-spaced grid, uses ``rasterio.features.rasterize`` internally. 2. ``shapely``: for irregular grids, uses ``shapely.vectorized.contains`` internally. 3. ``legacy``: old method (deprecated), slowest and with inconsistent edge behaviour All methods use the ``lon`` and ``lat`` coordinates to determine if a grid cell is in a region. ``lon`` and ``lat`` are assumed to indicate the *center* of the grid cell. Methods (1) and (2) have the same edge behavior and consider ‘holes’ in the regions. Method (3) is deprecated and will be removed in a future version. ``regionmask`` automatically determines which ``method`` to use. (2) subtracts a tiny offset from ``lon`` and ``lat`` to achieve a edge behaviour consistent with (1). Due to `mapbox/rasterio/#1844 `__ this is unfortunately also necessary for (1). Edge behavior ------------- As of version 0.5 ``regionmask`` has a new edge behavior - points that fall of the outline of a region are now consistently treated. This was not the case in earlier versions (xref `matplotlib/matplotlib#9704 `__). It’s easiest to see the edge behaviour in an Example ~~~~~~~ Define a region and a lon/ lat grid, such that some gridpoints lie exactly on the border: .. code:: python outline = np.array([[-80.0, 50.0], [-80.0, 28.0], [-100.0, 28.0], [-100.0, 50.0]]) region = regionmask.Regions([outline]) .. code:: python ds_US = regionmask.core.utils.create_lon_lat_dataarray_from_bounds( *(-161, -29, 2), *(75, 13, -2) ) print(ds_US) .. parsed-literal:: Dimensions: (lat: 30, lat_bnds: 31, lon: 65, lon_bnds: 66) Coordinates: * lon (lon) float64 -160.0 -158.0 -156.0 -154.0 ... -36.0 -34.0 -32.0 * lat (lat) float64 74.0 72.0 70.0 68.0 66.0 ... 22.0 20.0 18.0 16.0 * lon_bnds (lon_bnds) int64 -161 -159 -157 -155 -153 ... -39 -37 -35 -33 -31 * lat_bnds (lat_bnds) int64 75 73 71 69 67 65 63 61 ... 27 25 23 21 19 17 15 LON (lat, lon) float64 -160.0 -158.0 -156.0 ... -36.0 -34.0 -32.0 LAT (lat, lon) float64 74.0 74.0 74.0 74.0 ... 16.0 16.0 16.0 16.0 Data variables: *empty* Let’s create a mask with each of these methods: .. code:: python mask_rasterize = region.mask(ds_US, method="rasterize") mask_shapely = region.mask(ds_US, method="shapely") mask_legacy = region.mask(ds_US, method="legacy") .. note:: ``regionmask`` automatically detects which method to use, so there is no need to specify the ``method`` keyword. Plot the masked regions: .. code:: python f, axes = plt.subplots(1, 3, subplot_kw=dict(projection=ccrs.PlateCarree())) opt = dict(add_colorbar=False, ec="0.5", lw=0.5) mask_rasterize.plot(ax=axes[0], cmap=cmap1, **opt) mask_shapely.plot(ax=axes[1], cmap=cmap2, **opt) mask_legacy.plot(ax=axes[2], cmap=cmap3, **opt) for ax in axes: ax = region.plot_regions(ax=ax, add_label=False) ax.set_extent([-105, -75, 25, 55], ccrs.PlateCarree()) ax.coastlines(lw=0.5) ax.plot( ds_US.LON, ds_US.lat, "*", color="0.5", ms=0.5, transform=ccrs.PlateCarree() ) axes[0].set_title("rasterize") axes[1].set_title("shapely") axes[2].set_title("legacy"); .. image:: method_files/method_17_0.png Points indicate the grid cell centers (``lon`` and ``lat``), lines the grid cell borders, colored grid cells are selected to be part of the region. The top and right grid cells now belong to the region while the left and bottom grid cells do not. This choice is arbitrary but mimicks what ``rasterio.features.rasterize`` does. This can avoid spurios columns of unassigned grid poins as the following example shows. SREX regions ~~~~~~~~~~~~ Create a global dataset: .. code:: python ds_GLOB = regionmask.core.utils.create_lon_lat_dataarray_from_bounds( *(-180, 181, 2), *(90, -91, -2) ) .. code:: python srex = regionmask.defined_regions.srex srex_new = srex.mask(ds_GLOB) srex_old = srex.mask(ds_GLOB, method="legacy") .. code:: python f, axes = plt.subplots(1, 2, subplot_kw=dict(projection=ccrs.PlateCarree())) opt = dict(add_colorbar=False, cmap="viridis_r") srex_new.plot(ax=axes[0], **opt) srex_old.plot(ax=axes[1], **opt) for ax in axes: srex.plot_regions(ax=ax, add_label=False, line_kws=dict(lw=0.5)) ax.set_extent([-150, -50, 15, 75], ccrs.PlateCarree()) ax.coastlines(resolution="50m", lw=0.25) axes[0].set_title("new (rasterize + shapely)") axes[1].set_title("legacy"); .. image:: method_files/method_21_0.png Polygon interiors ----------------- ``Polygons`` can have interior boundaries (‘holes’). Previously these were not considered and e.g. the Caspian Sea was not ‘unmasked’. Example ~~~~~~~ Let’s test this on an example and define a ``region_with_hole``: .. code:: python interior = np.array( [[-86.0, 44.0], [-86.0, 34.0], [-94.0, 34.0], [-94.0, 44.0], [-86.0, 44.0],] ) poly = Polygon(outline, [interior]) region_with_hole = regionmask.Regions([poly]) .. code:: python mask_hole_rasterize = region_with_hole.mask(ds_US, method="rasterize") mask_hole_shapely = region_with_hole.mask(ds_US, method="shapely") mask_hole_legacy = region_with_hole.mask(ds_US, method="legacy") .. code:: python f, axes = plt.subplots(1, 3, subplot_kw=dict(projection=ccrs.PlateCarree())) opt = dict(add_colorbar=False, ec="0.5", lw=0.5) mask_hole_rasterize.plot(ax=axes[0], cmap=cmap1, **opt) mask_hole_shapely.plot(ax=axes[1], cmap=cmap2, **opt) mask_hole_legacy.plot(ax=axes[2], cmap=cmap3, **opt) for ax in axes: region.plot_regions(ax=ax, add_label=False) # interiors are not (yet) ploted by default ax.plot(*interior.T, color="k") ax.set_extent([-105, -75, 25, 55], ccrs.PlateCarree()) ax.coastlines(lw=0.5) ax.plot( ds_US.LON, ds_US.lat, ".", color="0.5", ms=0.5, transform=ccrs.PlateCarree() ) axes[0].set_title("rasterize") axes[1].set_title("shapely") axes[2].set_title("legacy"); .. image:: method_files/method_25_0.png Caspian Sea ~~~~~~~~~~~ .. code:: python land110 = regionmask.defined_regions.natural_earth.land_110 land_new = land110.mask(ds_GLOB) land_old = land110.mask(ds_GLOB, method="legacy") .. code:: python f, axes = plt.subplots(1, 2, subplot_kw=dict(projection=ccrs.PlateCarree())) opt = dict(add_colorbar=False) land_new.plot(ax=axes[0], cmap=cmap2, **opt) land_old.plot(ax=axes[1], cmap=cmap3, **opt) for ax in axes: ax.set_extent([15, 75, 15, 55], ccrs.PlateCarree()) ax.coastlines(resolution="50m", lw=0.5) ax.plot( ds_GLOB.LON, ds_GLOB.lat, ".", color="0.5", ms=0.5, transform=ccrs.PlateCarree() ) axes[0].set_title("new (rasterize + shapely)") axes[1].set_title("legacy"); .. image:: method_files/method_28_0.png Speedup ------- The new methods are faster than the old one: .. code:: python print("Method: rasterize") %timeit -n 10 region.mask(ds_US, method="rasterize") print("Method: shapely") %timeit -n 10 region.mask(ds_US, method="shapely") print("Method: legacy") %timeit -n 10 region.mask(ds_US, method="legacy") .. parsed-literal:: Method: rasterize 1.31 ms ± 73.9 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) Method: shapely 979 µs ± 39.5 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) Method: legacy 1.82 ms ± 118 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) While there is not a big difference for this simple example, the difference gets larger for more complex geometries and more gridpoints: .. code:: python ds_GLOB = regionmask.core.utils.create_lon_lat_dataarray_from_bounds( *(-180, 181, 2), *(90, -91, -2) ) countries_110 = regionmask.defined_regions.natural_earth.countries_110 .. code:: python print("Method: rasterize") %timeit -n 1 countries_110.mask(ds_GLOB, method="rasterize") .. parsed-literal:: Method: rasterize 14.4 ms ± 2.19 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) .. code:: python print("Method: shapely") %timeit -n 1 countries_110.mask(ds_GLOB, method="shapely") .. parsed-literal:: Method: shapely 254 ms ± 9.75 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) .. code:: python print("Method: legacy") %timeit -n 1 countries_110.mask(ds_GLOB, method="legacy") .. parsed-literal:: Method: legacy 2.13 s ± 142 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)