Create a raster mask using a shapefile

What does this notebook do?

This notebook demonstrates how to use a shapefile to mask data extracted from DEA. This notebook imports some Landsat 8 data, generates a shapefile mask, and applies it to the extracted Landsat data.


You need to run the following commands from the command line prior to launching jupyter notebooks from the same terminal so that the required libraries and paths are set:

module use /g/data/v10/public/modules/modulefiles

module load dea

If you find an error or bug in this notebook, please either create an ‘Issue’ in the Github repository, or fix it yourself and create a ‘Pull’ request to contribute the updated notebook back into the repository (See the repository README for instructions on creating a Pull request).

  • A shape file containing the polygon you would like to use for the analysis. Here we use the ACT reserves shapefile from Before you use this notebook, put your polygon somewhere accessible and update the cell with the shape_file = path.

This example uses two external functions called three_band_image and load_nbarx. These functions are available in the 10_Scripts folder of the dea-notebooks Github repository. Note that these functions have been developed by DEA users, not the DEA development team, and so are provided without warranty.

Date: August 2018

Author: Claire Krause

Tags: plot, three_band_image, Landsat8, DEAPlotting, shapefile, rasterio, Scripts, DEADataHandling, load_nbarx, geopandas, masking

%pylab notebook

from datacube import Datacube

import xarray as xr
import rasterio
import rasterio.features
import geopandas as gp

# Import the custom scripts. These can be found in the dea-notebooks repository.
import sys
import os.path
import DEAPlotting
import DEADataHandling

dc = Datacube(app='raster mask')
Populating the interactive namespace from numpy and matplotlib

Set up the extraction query

query = {'lat': (-35.25, -35.35),
         'lon': (149.05, 149.17),
         'time': ('2016-01-01', '2016-03-01')
data, crs, affine = DEADataHandling.load_nbarx(dc, 'ls8', query,
                                               bands_of_interest=['swir1', 'nir', 'green'])
Loading ls8_nbart_albers
Loaded ls8_nbart_albers
Generating mask ls8_pq_albers

Read in the shapefile you would like to use to mask the Landsat data

For this example, we use the polygon dataset of ACT Reserves.

shape_file = os.path.expanduser('~/dea-notebooks/03_Integrating_external_data/Files/ACT_Reserves.shp')
shp = gp.read_file(shape_file)

# Reproject coordinate system of shape file
shp = shp.to_crs({'init': 'epsg:3577'})

Iterate over each of the polygons, and put a ‘1’ in the mask raster that corresponds to the footprint of each polygon

mask = rasterio.features.rasterize(((feature['geometry'], 1) for feature in shp.iterfeatures()),
# Convert the numpy array to a DataArray
mask = xr.DataArray(mask, coords=(data.y, data.x))
<matplotlib.collections.QuadMesh at 0x7f1c9a5842e8>

Apply the mask to the Landsat data and plot

# Data * mask = data or nan
MaskedData = data.where(mask)
DEAPlotting.three_band_image(MaskedData, bands = ['swir1', 'nir', 'green'], time = 5, title = 'ACT Reserves');