Extracting contour lines

  • Compatability: Notebook currently compatible with both the NCI and DEA Sandbox environments

  • Products used: s2b_ard_granule

Background

Contour extraction is a fundamental image processing method used to detect and extract the boundaries of spatial features. While contour extraction has traditionally been used to precisely map lines of given elevation from digital elevation models (DEMs), contours can also be extracted from any other array-based data source. This can be used to support remote sensing applications where the position of a precise boundary needs to be mapped consistently over time, such as extracting dynamic waterline boundaries from satellite-derived Normalized Difference Water Index (NDWI) data.

Description

This notebook demonstrates how to use the subpixel_contours function based on tools from skimage.measure.find_contours to:

  1. Extract one or multiple contour lines from a single two-dimensional digital elevation model (DEM) and export these as a shapefile

  2. Optionally include custom attributes in the extracted contour features

  3. Load in a multi-dimensional satellite dataset from Digital Earth Australia, and extract a single contour value consistently through time along a specified dimension

  4. Filter the resulting contours to remove small noisy features


Getting started

To run this analysis, run all the cells in the notebook, starting with the “Load packages” cell.

Load packages

[1]:
%matplotlib inline

import sys
import datacube
import numpy as np
import pandas as pd
import xarray as xr
import matplotlib as mpl
import matplotlib.pyplot as plt
from affine import Affine
from datacube.utils.geometry import CRS

sys.path.append('../Scripts')
from dea_datahandling import load_ard
from dea_datahandling import download_unzip
from dea_spatialtools import subpixel_contours
from dea_bandindices import calculate_indices

Connect to the datacube

[2]:
dc = datacube.Datacube(app='Contour_extraction')

Load elevation data

To demonstrate contour extraction, we first need to obtain an elevation dataset. Here we download a tile of 90 m (3 arc seconds) resolution Shuttle Radar Topography Mission (SRTM) Digital Elevation Model (DEM) data for a region that includes Canberra using the download_unzip function from dea_datahandling. This data comes from the CGIAR - Consortium for Spatial Information (CGIAR-CSI).

After we download the data, we load it in xarray.open_rasterio and modify the projection system and geotransform attributes so it is consistent was data loaded from Digital Earth Australia.

[3]:
# Download and unzip the elevation data
download_unzip(url='http://srtm.csi.cgiar.org/wp-content/uploads/files/srtm_5x5/TIFF/srtm_66_20.zip')

# Read in the elevation data from file
elevation_array = xr.open_rasterio('srtm_66_20.tif').squeeze('band')

# Modify CRS and transform attributes into a format that is
# consistent with data loaded from `dc.load`
elevation_array.attrs['crs'] = CRS(elevation_array.crs[-9:])
elevation_array.attrs['transform'] = Affine(*elevation_array.transform)

# Print the data
print(elevation_array)
Downloading srtm_66_20.zip
Unzipping output files to: /home/jovyan/dev/dea-notebooks/Frequently_used_code
<xarray.DataArray (y: 6000, x: 6000)>
[36000000 values with dtype=int16]
Coordinates:
    band     int64 1
  * y        (y) float64 -35.0 -35.0 -35.0 -35.0 ... -40.0 -40.0 -40.0 -40.0
  * x        (x) float64 145.0 145.0 145.0 145.0 ... 150.0 150.0 150.0 150.0
Attributes:
    transform:   | 0.00, 0.00, 145.00|\n| 0.00,-0.00,-35.00|\n| 0.00, 0.00, 1...
    crs:         epsg:4326
    res:         (0.0008333333333333334, 0.0008333333333333334)
    is_tiled:    0
    nodatavals:  (-32768.0,)

We can plot a subset of the elevation data for a region around Canberra using a custom terrain-coloured colour map:

[4]:
# Create a custom colourmap for the DEM
colors_terrain = plt.cm.gist_earth(np.linspace(0.0, 1.5, 100))
cmap_terrain = mpl.colors.LinearSegmentedColormap.from_list('terrain',
                                                            colors_terrain)

# Plot a subset of the elevation data
elevation_array.sel(x=slice(149.00, 149.20),
                    y=slice(-35.20, -35.40)).plot(size=8, cmap=cmap_terrain)

[4]:
<matplotlib.collections.QuadMesh at 0x7f10b6ee1d30>
../../_images/notebooks_Frequently_used_code_Contour_extraction_11_1.png

Contour extraction in ‘single array, multiple z-values’ mode

The dea_spatialtools.subpixel_contours function uses skimage.measure.find_contours to extract contour lines from an array. This can be an elevation dataset like the data imported above, or any other two-dimensional or multi-dimensional array. We can extract contours from the elevation array imported above by providing a single z-value (e.g. elevation) or a list of z-values.

Extracting a single contour

Here, we extract a single 600 m elevation contour:

[5]:
# Extract contours
contours_gdf = subpixel_contours(da=elevation_array,
                                 z_values=600)

# Print output
contours_gdf
Operating in multiple z-value, single array mode
[5]:
z_value geometry
0 600 MULTILINESTRING ((148.03014 -35.00042, 148.029...

This returns a geopandas.GeoDataFrame containing a single contour line feature with the z-value (i.e. elevation) given in a shapefile field named z_value. We can plot contour this for the Canberra subset:

[6]:
elevation_array.sel(x=slice(149.00, 149.20),
                    y=slice(-35.20, -35.40)).plot(size=8, cmap=cmap_terrain)
contours_gdf.plot(ax=plt.gca(), linewidth=1.0, color='black')

[6]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f1111fecdd8>
../../_images/notebooks_Frequently_used_code_Contour_extraction_15_1.png

Extracting multiple contours

We can easily import multiple contours from a single array by supplying a list of z-values to extract. The function will then extract a contour for each value in z_values, skipping any contour elevation that is not found in the array.

[7]:
# List of elevations to extract
z_values = [500, 550, 600, 650, 700]

# Extract contours
contours_gdf = subpixel_contours(da=elevation_array,
                                 z_values=z_values)

# Plot extracted contours over the DEM
elevation_array.sel(x=slice(149.00, 149.20),
                    y=slice(-35.20, -35.40)).plot(size=8, cmap=cmap_terrain)
contours_gdf.plot(ax=plt.gca(), linewidth=1.0, color='black')
Operating in multiple z-value, single array mode
[7]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f1111ee31d0>
../../_images/notebooks_Frequently_used_code_Contour_extraction_17_2.png

Custom shapefile attributes

By default, the shapefile includes a single z_value attribute field with one feature per input value in z_values. We can instead pass custom attributes to the output shapefile using the attribute_df parameter. For example, we might want a custom column called elev_cm with heights in cm instead of m, and a location column giving the location (Australian Capital Territory or ACT).

We can achieve this by first creating a pandas.DataFrame with column names giving the name of the attribute we want included with our contour features, and one row of values for each number in z_values:

[8]:
# Elevation values to extract
z_values = [550, 600, 650]

# Set up attribute dataframe (one row per elevation value above)
attribute_df = pd.DataFrame({'elev_cm': [55000, 60000, 65000],
                             'location': ['ACT', 'ACT', 'ACT']})

# Print output
attribute_df.head()
[8]:
elev_cm location
0 55000 ACT
1 60000 ACT
2 65000 ACT

We can now extract contours, and the resulting contour features will include the attributes we created above:

[9]:
# Extract contours with custom attribute fields:
contours_gdf = subpixel_contours(da=elevation_array,
                                 z_values=z_values,
                                 attribute_df=attribute_df)

# Print output
contours_gdf.head()
Operating in multiple z-value, single array mode
[9]:
z_value elev_cm location geometry
0 550 55000 ACT MULTILINESTRING ((148.04083 -35.00042, 148.040...
1 600 60000 ACT MULTILINESTRING ((148.03014 -35.00042, 148.029...
2 650 65000 ACT MULTILINESTRING ((148.03792 -35.00479, 148.037...

Exporting contours to file

To export the resulting contours to file, use the output_path parameter. The function supports two output file formats:

  1. GeoJSON (.geojson)

  2. Shapefile (.shp)

[10]:
subpixel_contours(da=elevation_array,
                  z_values=z_values,
                  output_path='output_contours.geojson')
Operating in multiple z-value, single array mode
Writing contours to output_contours.geojson
[10]:
z_value geometry
0 550 MULTILINESTRING ((148.04083 -35.00042, 148.040...
1 600 MULTILINESTRING ((148.03014 -35.00042, 148.029...
2 650 MULTILINESTRING ((148.03792 -35.00479, 148.037...

Contours from non-elevation datasets in in ‘single z-value, multiple arrays’ mode

As well as extracting multiple contours from a single two-dimensional array, subpixel_contours also allows you to extract a single z-value from every array along a specified dimension in a multi-dimensional array. This can be useful for comparing the changes in the landscape across time. The input multi-dimensional array does not need to be elevation data: contours can be extracted from any type of data.

For example, we can use the function to extract the boundary between land and water (for a more in-depth analysis using contour extraction to monitor coastal erosion, see this notebook). First, we will load in a time series of Sentinel-2 imagery and calculate a simple Normalized Difference Water Index (NDWI) on two images. This index will have high values where a pixel is likely to be open water (e.g. NDWI > 0, coloured in blue below):

[11]:
# Set up a datacube query to load data for
query = {'lat': (-35.27, -35.33),
         'lon': (149.09, 149.15),
         'time': ('2018-01-01', '2018-02-25'),
         'measurements': ['nbart_green', 'nbart_nir_1'],
         'output_crs': 'EPSG:3577',
         'resolution': (-20, 20)}

# Load Sentinel 2 data
s2_ds = dc.load(product='s2b_ard_granule',
                group_by='solar_day',
                **query)

# Calculate NDWI on the first and last image in the dataset
s2_ds = calculate_indices(s2_ds.isel(time=[0, -1]),
                          index='NDWI',
                          collection='ga_s2_1')

# Plot the two images side by size
s2_ds.NDWI.plot(col='time', cmap='RdYlBu', size=5)

[11]:
<xarray.plot.facetgrid.FacetGrid at 0x7f110fb6ebe0>
../../_images/notebooks_Frequently_used_code_Contour_extraction_25_1.png

We can now identify the land-water boundary by extracting the 0 NDWI contour for each array in the dataset along the time dimension. By plotting the resulting contour lines for a zoomed in area, we can then start to compare phenomenon like lake levels across time.

Note: Note that unlike in previous examples, we need to manually specify crs amd affine parameters when we run the function. This is because the s2_ds.NDWI array we are analysing does not have a crs attribute (unlike s2_ds which does). Another way to get around this issue would be to assign a crs attribute to s2_ds.NDWI before we attempt to extract contours (e.g. s2_ds.NDWI.attrs['crs'] = s2_ds.crs)

[12]:
# Extract the 0 NDWI contour from both timesteps in the NDWI data
contours_gdf = subpixel_contours(da=s2_ds.NDWI,
                                 z_values=0,
                                 crs=s2_ds.crs,
                                 affine=s2_ds.geobox.transform,
                                 dim='time')

# Plot contours over the top of array
s2_ds.NDWI.sel(time='2018-01-05',
               x=slice(1549100, 1549600),
               y=slice(-3959100, -3959600)).plot(size=8, cmap='RdYlBu')
contours_gdf.plot(ax=plt.gca(), linewidth=1.5, color='black')

Operating in single z-value, multiple arrays mode
[12]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f109e9863c8>
../../_images/notebooks_Frequently_used_code_Contour_extraction_27_2.png

Dropping small contours

Contours produced by subpixel_contours can include many small features. We can optionally choose to extract only contours larger than a certain number of vertices using the min_vertices parameter. This can be useful for focusing on large contours, and remove possible noise in a dataset. Here we set min_vertices=10 to keep only contours with at least 10 vertices. Observe the small waterbody in the bottom-left of the image disappear:

[13]:
# Extract the 0 NDWI contour from both timesteps in the NDWI data
contours_gdf = subpixel_contours(da=s2_ds.NDWI,
                                 z_values=0,
                                 crs=s2_ds.crs,
                                 affine=s2_ds.geobox.transform,
                                 min_vertices=10)

# Plot contours over the top of array
s2_ds.NDWI.sel(time='2018-01-05',
               x=slice(1549100, 1549600),
               y=slice(-3959100, -3959600)).plot(size=8, cmap='RdYlBu')
contours_gdf.plot(ax=plt.gca(), linewidth=1.5, color='black')

Operating in single z-value, multiple arrays mode
[13]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f109e891710>
../../_images/notebooks_Frequently_used_code_Contour_extraction_29_2.png

Additional information

License: The code in this notebook is licensed under the Apache License, Version 2.0. Digital Earth Australia data is licensed under the Creative Commons by Attribution 4.0 license.

Contact: If you need assistance, please post a question on the Open Data Cube Slack channel or on the GIS Stack Exchange using the open-data-cube tag (you can view previously asked questions here). If you would like to report an issue with this notebook, you can file one on Github.

Last modified: December 2019

Compatible datacube version:

[14]:
print(datacube.__version__)
1.7+128.gebdc898a.dirty

Tags

Browse all available tags on the DEA User Guide’s Tags Index

Tags: NCI compatible, sandbox compatible, sentinel 2, dea_datahandling, dea_spatialtools, dea_bandindices, load_ard, calculate_indices, subpixel_contours, download_unzip, contour extraction, image processing, NDWI, downloading data, digital elevation model