Introduction to Digital Earth Australia

  • Sign up to the DEA Sandbox to run this notebook interactively from a browser

  • Compatibility: Notebook currently compatible with both the NCI and DEA Sandbox environments

  • Prerequisites: Users of this notebook should have a basic understanding of:

Background

Digital Earth Australia (DEA) is a digital platform that catalogues large amounts of Earth observation data covering continental Australia. It is underpinned by the Open Data Cube (ODC), an open-source software package that has an ever-growing number of users, contributors and implementations.

The ODC and DEA platforms are designed to:

  • Catalogue large amounts of Earth observation data

  • Provide a Python-based API for high-performance querying and data access

  • Make it easy for users to perform exploratory data analysis

  • Allow scalable continent-scale processing of the stored data

  • Track the provenance of data to allow for quality control and updates

The DEA program catalogues data from a range of satellite sensors and has adopted processes and terminology that users should be aware of to enable efficient querying and use of the datasets stored within. This notebook introduces these important concepts and forms the basis of understanding for the remainder of the notebooks in this beginner’s guide. Resources to further explore these concepts are recommended at the end of the notebook.

Description

This introduction to DEA will briefly introduce the ODC and review the types of data catalogued in the DEA platform. It will also cover commonly-used terminology for measurements within product datasets. Topics covered include:

  • A brief introduction to the ODC

  • A review of the satellite sensors that provide data to DEA

  • An introduction to analysis ready data and the processes to make it

  • DEA’s data naming conventions

  • Coordinate reference scheme

  • Derived products


Open Data Cube

The Open Data Cube (ODC) is an open-source software package for organising and analysing large quantities of Earth observation data. At its core, the Open Data Cube consists of a database where data is stored, along with commands to load, view and analyse that data. This functionality is delivered by the datacube-core open-source Python library. The library is designed to enable and support:

  • Large-scale workflows on high performance computing infrastructures

  • Exploratory data analysis

  • Cloud-based services

  • Standalone applications

There are a number of existing implementations of the ODC, including DEA and Digital Earth Africa. More information can be found in the Open Data Cube Manual.

Satellite datasets in DEA

Digital Earth Australia catalogues data from a range of satellite sensors. The earliest datasets of optical satellite imagery in DEA date from 1986. DEA includes data from:

Landsat missions are jointly operated by the United States Geological Survey (USGS) and National Aeronautics and Space Administration (NASA). Sentinel missions are operated by the European Space Agency (ESA). One major difference between the two programs is the spatial resolution: each Landsat pixel represents 30 x 30 m on the ground while each Sentinel-2 pixel represents 10 x 10 m to 60 x 60 m depending on the spectral band.
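
To make the resolution difference concrete, the short calculation below counts how many pixels of one sensor cover the ground footprint of a pixel from the other, using only the pixel sizes quoted above. The function name is illustrative and is not part of any DEA or ODC library.

```python
def pixels_per_footprint(coarse_size_m: float, fine_size_m: float) -> float:
    """Number of `fine` pixels needed to cover the ground area of one `coarse` pixel."""
    return (coarse_size_m / fine_size_m) ** 2


# One 30 m Landsat pixel covers the same ground area as nine 10 m Sentinel-2 pixels.
landsat_vs_s2_10m = pixels_per_footprint(30, 10)

# For Sentinel-2's 60 m bands the relationship flips:
# one Landsat pixel covers only a quarter of a 60 m pixel.
landsat_vs_s2_60m = pixels_per_footprint(30, 60)

print(landsat_vs_s2_10m, landsat_vs_s2_60m)
```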

Spectral bands

All of the datasets listed above are captured by multispectral satellites. This means that the satellites measure primarily light that is reflected from the Earth’s surface in discrete sections of the electromagnetic spectrum, known as spectral bands. Figure 1 shows the spectral bands for recent Landsat and Sentinel-2 sensors, allowing a direct comparison of how each sensor samples the overall electromagnetic spectrum. Landsat 5 TM is not displayed in this image; for reference, it measured light in seven bands that covered the same regions as bands 1 to 7 on Landsat 7 ETM+.

Image

Figure 1: The bands that are detected by each of the satellites are shown in the numbered boxes and the width of each box represents the spectral range that band detects. The bands are overlaid on the percentage transmission of each wavelength returned to the atmosphere from the Earth relative to the amount of incoming solar radiation. The y-axis has no bearing on the comparison of the satellite sensors [source].

Figure 1 highlights that the numbering of the bands relative to the detected wavelengths is inconsistent between sensors. As an example, in the green region of the electromagnetic spectrum (around 560 nm), Landsat 5 TM and Landsat 7 ETM+ detect a wide green region called band 2, whereas Landsat 8 OLI detects a slightly narrower region and calls it band 3. Finally, Sentinel-2 MSI (A and B) detects a narrow green region but also calls it band 3. Consequently, when working with different sensors, it is important to understand the differences in their bands, and any impact this could have on an analysis. To promote awareness of these differences, DEA band naming is based on both the spectral band name and sample region. The naming convention will be covered in more detail in the DEA band naming conventions section.
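
The inconsistency can be illustrated with a small lookup. The band numbers below are copied from Table 1 in this notebook; the dictionary and function names are hypothetical and exist only for this sketch.

```python
from typing import Optional

# Sensor band number for each visible spectral region, as listed in Table 1.
SENSOR_BAND_NUMBERS = {
    "Landsat 5 TM": {"blue": 1, "green": 2, "red": 3},
    "Landsat 7 ETM+": {"blue": 1, "green": 2, "red": 3},
    "Landsat 8 OLI": {"blue": 2, "green": 3, "red": 4},
    "Sentinel-2 MSI": {"blue": 2, "green": 3, "red": 4},
}


def region_for_band(sensor: str, band_number: int) -> Optional[str]:
    """Return the spectral region a band number refers to on a given sensor."""
    for region, number in SENSOR_BAND_NUMBERS[sensor].items():
        if number == band_number:
            return region
    return None  # band number not covered by this small lookup


# The same band number means a different colour depending on the sensor.
for sensor in SENSOR_BAND_NUMBERS:
    print(f"{sensor}: band 3 is {region_for_band(sensor, 3)}")
```

Referring to bands by spectral region rather than by number avoids this ambiguity, which is the motivation for the DEA naming convention.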

Analysis Ready Data

Digital Earth Australia produces Analysis Ready Data (ARD) for each of the sensors listed above. The ARD standard for satellite data requires that data have undergone a number of processing steps, along with the creation of additional attributes for the data. DEA’s ARD datasets include the following characteristics:

  • Geometric correction: This includes establishing ground position, accounting for terrain (orthorectification) and ground control points, and assessing absolute position accuracy. Geometric calibration means that imagery is positioned accurately on the Earth’s surface and stacked consistently so that sequential observations can be used to track meaningful change over time. Adjustments for ground variability typically use a Digital Elevation Model (DEM).

  • Surface reflectance correction: This includes adjustments for sensor/instrument gains, biases and offsets, as well as adjustments for terrain illumination and sensor viewing angle with respect to the pixel position on the surface. Once satellite data is processed to surface reflectance, pixel values from the same sensor can be compared consistently both spatially and over time.

  • Observation attributes: Per-pixel metadata such as quality flags and content attribution that enable users to make informed decisions about the suitability of the products for their use. For example, clouds, cloud shadows, missing data, saturation and water are common pixel level attributes.

  • Metadata: Dataset metadata including the satellite, instrument, acquisition date and time, spatial boundaries, pixel locations, mode, processing details, spectral or frequency response and grid projection.

Surface reflectance

Optical sensors, such as those on the Landsat and Sentinel-2 satellites, measure light that has come from the sun and been reflected by the Earth’s surface. The sensor measures the intensity of light in each of its spectral bands (known as “radiance”). The intensity of this light is affected by many factors including the angle of the sun relative to the ground, the angle of the sensor relative to the ground, and how the light interacts with the Earth’s atmosphere on its way to the sensor. Because radiance can be affected by so many factors, it is typically more valuable to determine how much light was originally reflected at the ground level. This is known as bottom-of-atmosphere surface reflectance. Surface reflectance can be calculated by using robust physical models to correct the observed radiance values based on atmospheric conditions, the angle of the sun, sensor geometry and local topography or terrain.

There are many approaches to satellite surface reflectance correction and DEA opts to use two: NBAR and NBART. Users will choose which of these measurements to load when querying the DEA datacube and so it is important to understand their major similarities and differences.

NBAR

NBAR stands for Nadir-corrected BRDF Adjusted Reflectance, where BRDF stands for Bidirectional Reflectance Distribution Function. The approach involves atmospheric correction to compute bottom-of-atmosphere radiance, and bidirectional reflectance modelling to remove the effects of topography and angular variation in reflectance. NBAR can be useful for analyses in extremely flat areas not affected by terrain shadow, and for producing attractive data visualisations that are not affected by NBART’s nodata gaps (see below).

NBART

NBART has the same features as NBAR but includes an additional terrain illumination reflectance correction, and as such is considered to be actual surface reflectance because it takes the surface topography into account. Terrain affects optical satellite images in a number of ways; for example, slopes facing the sun receive more sunlight and appear brighter compared to those facing away from the sun. To obtain comparable surface reflectance from satellite images covering hilly areas, it is therefore necessary to process the images to reduce or remove the topographic effect. This correction is performed with a Digital Surface Model (DSM) that has been resampled to the same resolution as the satellite data being corrected. NBART is typically the default choice for most analyses as removing terrain illumination and shadows allows changes in the landscape to be compared more consistently across time. However, it can be prone to distortions in extremely flat areas if noisy elevation values exist in the DSM.
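
The effect of slope orientation on brightness can be sketched with the standard formula for the cosine of the local solar incidence angle on a tilted surface. This is a generic geometric illustration only; it is not DEA's NBART algorithm, and the function name and example angles are assumptions chosen for the demonstration.

```python
import math


def illumination(solar_zenith_deg, solar_azimuth_deg, slope_deg, aspect_deg):
    """Cosine of the angle between the sun direction and the surface normal.

    Values near 1 mean the surface faces the sun directly; values near 0
    mean the sun only grazes the surface.
    """
    zenith = math.radians(solar_zenith_deg)
    slope = math.radians(slope_deg)
    relative_azimuth = math.radians(solar_azimuth_deg - aspect_deg)
    return (math.cos(zenith) * math.cos(slope)
            + math.sin(zenith) * math.sin(slope) * math.cos(relative_azimuth))


# Hypothetical scene: sun 45 degrees from vertical, shining from the north-east.
sun = dict(solar_zenith_deg=45, solar_azimuth_deg=45)

flat = illumination(slope_deg=0, aspect_deg=0, **sun)
sun_facing = illumination(slope_deg=30, aspect_deg=45, **sun)   # slope faces the sun
shaded = illumination(slope_deg=30, aspect_deg=225, **sun)      # slope faces away

print(round(flat, 3), round(sun_facing, 3), round(shaded, 3))
```

With identical ground cover, the sun-facing slope receives considerably more light than the slope facing away, which is the brightness difference a terrain illumination correction aims to reduce.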

Comparison between NBAR and NBART

Figure 2: The animation above demonstrates how the NBART correction results in a significantly more two-dimensional looking image that is less affected by terrain illumination and shadow. Black pixels in the NBART image represent areas of deep terrain shadow that can’t be corrected as they’re determined not to be viewable by either the sun or the satellite. These are represented by -999 nodata values in the data.
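
A minimal sketch of why these -999 values need to be excluded before computing statistics. It uses plain Python lists rather than the xarray objects returned by the datacube, and the pixel values are made up for illustration.

```python
NODATA = -999  # nodata value quoted in the Figure 2 caption


def mean_ignoring_nodata(values, nodata=NODATA):
    """Mean of the valid values, or None if every value is nodata."""
    valid = [v for v in values if v != nodata]
    return sum(valid) / len(valid) if valid else None


# Hypothetical NBART red-band values for five pixels; two lie in deep terrain shadow.
nbart_red = [420, 455, -999, 390, -999]

naive_mean = sum(nbart_red) / len(nbart_red)   # wrongly includes the -999 values
masked_mean = mean_ignoring_nodata(nbart_red)  # uses the three valid pixels only

print(naive_mean, masked_mean)
```

The naive mean is negative, which is physically meaningless for reflectance, while the masked mean reflects only the pixels that were actually observed.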

Observation Attributes

The Observation Attributes (OA) are a suite of measurements included in DEA’s analysis ready datasets. They are an assessment of each image pixel to determine if it is an unobscured, unsaturated observation of the Earth’s surface, along with whether the pixel is represented in each spectral band. The OA product allows users to exclude pixels that do not meet the quality criteria for their analysis. The capacity to automatically exclude such pixels is essential for analysing any change over time, since poor-quality pixels can significantly alter the perceived change over time. The most common use of OA is for cloud masking, where users can choose to remove images that have too much cloud, or ignore the clouds within each satellite image. A demonstration of how to use cloud masking can be found in the masking data notebook.

The OA suite of measurements includes the following pixel-based observation attributes:

  • Null pixels

  • Clear pixels

  • Cloud pixels

  • Cloud shadow pixels

  • Snow pixels

  • Water pixels

  • Terrain shaded pixels

  • Spectrally contiguous pixels (i.e. whether a pixel contains data in every spectral band)

Also included is a range of pixel-based attributes related to the satellite, solar and sensing geometries:

  • Solar zenith

  • Solar azimuth

  • Satellite view

  • Incident angle

  • Exiting angle

  • Azimuthal incident

  • Azimuthal exiting

  • Relative azimuth

  • Timedelta
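
The two uses of the pixel categories described above (rejecting whole images with too few usable pixels, or masking individual pixels within an image) can be sketched as follows. The integer codes, the choice of usable classes and all values are assumptions made for illustration; the actual encoding should be read from the product's measurement metadata.

```python
# Assumed integer codes for a per-pixel classification layer (illustrative only).
PIXEL_CLASSES = {
    0: "null",
    1: "clear",
    2: "cloud",
    3: "cloud shadow",
    4: "snow",
    5: "water",
}

# Classes this hypothetical analysis treats as usable observations.
USABLE = {"clear", "water"}


def usable_fraction(class_codes):
    """Fraction of pixels in an image whose class is usable."""
    usable = sum(PIXEL_CLASSES[code] in USABLE for code in class_codes)
    return usable / len(class_codes)


def mask_unusable(values, class_codes):
    """Replace the values of unusable pixels with None, keeping the rest."""
    return [
        value if PIXEL_CLASSES[code] in USABLE else None
        for value, code in zip(values, class_codes)
    ]


# Made-up reflectance values and class codes for five pixels of one image.
red = [310, 2950, 280, 150, 305]
codes = [1, 2, 3, 5, 1]

fraction = usable_fraction(codes)        # option 1: drop the image if this is too low
masked_red = mask_unusable(red, codes)   # option 2: keep the image, drop bad pixels

print(fraction, masked_red)
```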

Data format

DEA band naming conventions

To account for the various available satellite datasets, DEA uses a band naming convention to help distinguish datasets that come from the different sensors. The band names are composed of the applied surface reflectance correction (NBAR or NBART) and the spectral region detected by the satellite. This removes all reference to the sensor band numbering scheme (e.g. Figure 1) and assumes that users understand that the spectral region described by the DEA band name is only approximately the same between sensors, not identical.

Table 1 summarises the DEA band naming terminology for the spectral regions common to both Landsat and Sentinel, coupled with the corresponding NBAR and NBART band names for the available sensors:

| Spectral region | DEA measurement name (NBAR) | DEA measurement name (NBART) | Landsat 5 TM | Landsat 7 ETM+ | Landsat 8 OLI | Sentinel-2A,B MSI |
|---|---|---|---|---|---|---|
| Coastal aerosol | nbar_coastal_aerosol | nbart_coastal_aerosol | - | - | 1 | 1 |
| Blue | nbar_blue | nbart_blue | 1 | 1 | 2 | 2 |
| Green | nbar_green | nbart_green | 2 | 2 | 3 | 3 |
| Red | nbar_red | nbart_red | 3 | 3 | 4 | 4 |
| NIR (Near infra-red) | nbar_nir (Landsat), nbar_nir_1 (Sentinel-2) | nbart_nir (Landsat), nbart_nir_1 (Sentinel-2) | 4 | 4 | 5 | 8 |
| SWIR 1 (Short wave infra-red 1) | nbar_swir_1 (Landsat), nbar_swir_2 (Sentinel-2) | nbart_swir_1 (Landsat), nbart_swir_2 (Sentinel-2) | 5 | 5 | 6 | 11 |
| SWIR 2 (Short wave infra-red 2) | nbar_swir_2 (Landsat), nbar_swir_3 (Sentinel-2) | nbart_swir_2 (Landsat), nbart_swir_3 (Sentinel-2) | 7 | 7 | 7 | 12 |

Note: Be aware that NIR and SWIR band names differ between Landsat and Sentinel-2 due to the different number of these bands available in Sentinel-2. The nbar_nir Landsat band corresponds to the spectral region covered by Sentinel-2’s nbar_nir_1 band, the nbar_swir_1 Landsat band corresponds to Sentinel-2’s nbar_swir_2 band, and the nbar_swir_2 Landsat band corresponds to Sentinel-2’s nbar_swir_3 band.
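
The naming rule, including the NIR and SWIR differences described in the note, can be expressed as a small helper. The function and dictionary names are hypothetical and are not part of any DEA library; the mapping itself is taken from Table 1.

```python
# Regions whose DEA name differs on Sentinel-2, keyed by the Landsat-style name.
SENTINEL2_REGION_NAMES = {"nir": "nir_1", "swir_1": "swir_2", "swir_2": "swir_3"}


def dea_measurement_name(correction: str, region: str, sensor_family: str) -> str:
    """Build a DEA measurement name such as 'nbart_red'.

    `region` uses the Landsat-style name (e.g. 'nir', 'swir_1'); it is
    translated to the equivalent Sentinel-2 name where the two differ.
    """
    correction = correction.lower()
    if correction not in ("nbar", "nbart"):
        raise ValueError(f"Unknown correction: {correction!r}")
    if sensor_family not in ("landsat", "sentinel-2"):
        raise ValueError(f"Unknown sensor family: {sensor_family!r}")
    if sensor_family == "sentinel-2":
        region = SENTINEL2_REGION_NAMES.get(region, region)
    return f"{correction}_{region}"


print(dea_measurement_name("nbart", "green", "sentinel-2"))
print(dea_measurement_name("nbart", "nir", "landsat"))
print(dea_measurement_name("nbart", "nir", "sentinel-2"))
print(dea_measurement_name("nbar", "swir_1", "sentinel-2"))
```

A list of names built this way is the kind of value typically supplied as the measurements argument when loading data from the datacube.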

DEA satellite data projection and holdings

In keeping with the practices of the Landsat and Sentinel satellite programs, all DEA satellite datasets are projected using the Universal Transverse Mercator (UTM) coordinate reference system. The World Geodetic System 84 (WGS84) ellipsoid is used to model the UTM projection. All data queries default to the WGS84 datum’s coordinate reference system unless specified otherwise.
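
UTM divides the globe into 60 zones, each 6 degrees of longitude wide, so the zone covering a location can be computed directly from its longitude. The sketch below also derives the EPSG code for the matching WGS84 UTM projection (326xx for the northern hemisphere, 327xx for the southern). The function names and example coordinates are illustrative, the special-case zones near Norway and Svalbard are ignored, and the CRS of an actual DEA dataset should always be read from its metadata.

```python
import math


def utm_zone(longitude_deg: float) -> int:
    """UTM zone number (1 to 60) containing the given longitude."""
    zone = int(math.floor((longitude_deg + 180) / 6)) + 1
    return min(zone, 60)  # longitude of exactly 180 belongs to zone 60


def wgs84_utm_epsg(longitude_deg: float, latitude_deg: float) -> int:
    """EPSG code of the WGS84 UTM projection covering a location."""
    base = 32600 if latitude_deg >= 0 else 32700
    return base + utm_zone(longitude_deg)


# Approximate coordinates of two Australian cities.
canberra = wgs84_utm_epsg(149.13, -35.28)
perth = wgs84_utm_epsg(115.86, -31.95)

print(f"Canberra: EPSG:{canberra}, Perth: EPSG:{perth}")
```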

By default, the spatial extent of the DEA data holdings is approximately the Australian coastal shelf. The actual extent varies based on the sensor and product. The current extents of each DEA product can be viewed using the interactive DEA Datacube Explorer.

Derived products

DEA products

In addition to ARD satellite data, DEA generates a range of products that are derived from Landsat or Sentinel-2 surface reflectance data. These products have been developed to characterise and monitor different aspects of Australia’s natural and built environment, such as mapping the distribution of water and vegetation across the landscape through time. Derived DEA products include:

  • Water Observations from Space (WOfS): WOfS is the world’s first continent-scale map of surface water and provides images and data showing where water has been seen in Australia from 1987 to the present. This map can be used to better understand where water usually occurs across the continent and to plan water management strategies.

  • Fractional Cover (FC): Fractional Cover (FC) is a measurement that splits the landscape into three parts, or fractions: green (leaves, grass, and growing crops), brown (branches, dry grass or hay, and dead leaf litter), and bare ground (soil or rock). DEA uses Fractional Cover to characterise every 25 m square of Australia for any point in time from 1987 to today. This measurement can inform a broad range of natural resource management issues.

  • High and Low Tide Composites (HLTC): The High and Low Tide Composites (HLTC) are imagery mosaics developed to visualise Australia’s coasts, estuaries and reefs at low and high tide, whilst removing the influence of noise features such as clouds, breaking water and sun-glint. These products are highly interpretable, and provide a valuable snapshot of the coastline at different biophysical states.

  • Intertidal Extents Model (ITEM): The Intertidal Extents Model (ITEM) product utilises 30 years of Earth observation data from the Landsat archive to map the extents and topography of Australia’s intertidal mudflats, beaches and reefs, that is, the area exposed between high and low tide.

  • National Intertidal Digital Elevation Model (NIDEM): The National Intertidal Digital Elevation Model (NIDEM) is a national dataset that maps the three-dimensional structure of Australia’s intertidal zone. NIDEM provides a first-of-its-kind source of intertidal elevation data for Australia’s entire coastline.

Each of the products above has dataset-specific naming conventions, measurements, resolutions, data types and coordinate reference systems. For more information about DEA’s derived products, refer to the DEA website, the Content Management Interface (CMI) containing detailed product metadata, or the “DEA_products” notebooks in this repository.


Additional information

License: The code in this notebook is licensed under the Apache License, Version 2.0. Digital Earth Australia data is licensed under the Creative Commons by Attribution 4.0 license.

Contact: If you need assistance, please post a question on the Open Data Cube Slack channel or on the GIS Stack Exchange using the open-data-cube tag (you can view previously asked questions here). If you would like to report an issue with this notebook, you can file one on GitHub.

Last modified: December 2023

Tags

Tags: sandbox compatible, NCI compatible, sensors, band names, NBAR, NBART, observation attributes, naming conventions