# Introduction to Sentinel 2¶

What does this notebook do? This notebook steps you through how to load in and plot up data from Sentinel 2. It explores the data that are available from the Sentinel 2 satellite, and briefly describes the Sentinel satellite bands. It then loads in the s2a_ard_granule product and plots it up in true and false colour. It uses the provided pixel quality data to filters the example scene based on cloud and cloud shadow-free pixels.

Before you run this notebook: You need to run the following commands from the command line prior to launching jupyter notebook from the same terminal so that the required libraries and paths are set.

module use /g/data/v10/public/modules/modulefiles

module load dea

Date: November 2018

Author: Claire Krause, Robbi Bishop-Taylor

Tags: Sentinel2, products, DEAPlotting, rgb, dc.list_products, dc.load, query, beginner, plot, image, pixelquality

[1]:

%matplotlib inline
from datacube import Datacube
from datetime import datetime
from skimage import exposure
import matplotlib.pyplot as plt
import numpy as np
import sys
import os

# Point this to where you have the algorithms from the dea-notebooks/algorithms saved
sys.path.append(os.path.abspath('../10_Scripts'))
import DEAPlotting

# Connect to datacube
dc = Datacube(app='Intro to Sentinel2')


## Sentinel 2 products currently available¶

[2]:

products = dc.list_products()
display_columns = ['name', 'description', 'instrument',
'platform', 'product_type', 'crs', 'resolution']
sentinel_products = products[products['instrument'] == 'MSI'][display_columns]
sentinel_products

[2]:

name description instrument platform product_type crs resolution
id
90 s2a_ard_granule Sentinel-2A MSI ARD - NBAR NBART and Pixel Qua... MSI SENTINEL_2A S2MSIARD NaN NaN
92 s2a_level1c_granule Sentinel-2A Level1C - Ortho Rectified Top of A... MSI Sentinel-2A S2MSI1C NaN NaN
91 s2b_ard_granule Sentinel-2B MSI ARD - NBAR NBART and Pixel Qua... MSI SENTINEL_2B S2MSIARD NaN NaN
93 s2b_level1c_granule Sentinel-2B Level1C - Ortho Rectified Top of A... MSI Sentinel-2B S2MSI1C NaN NaN

There are two spectral Sentinel satellites currently in DEA; Sentinel-2A and Sentinel-2B. Sentinel-2A was launched on 23 June 2015 and Sentinel-2B followed on 7 March 2017.

Both of the Sentinel 2 satellites carries an innovative wide swath high-resolution multispectral imager with 13 spectral bands. The mission is based on a constellation of two identical satellites in the same orbit, 180° apart for optimal coverage and data delivery. Together they cover all Earth’s land surfaces, large islands, inland and coastal waters every five days at the equator.

For more information on the Sentinel 2 platforms and applications, check out the European Space Agency website.

We will focus on data from the Sentinel-2a platform for this demonstration, as there is more data available than 2b. To explore the spectral datasets from Sentinel-2a, we will use the s2a_ard_granule product. Sometimes multiple scenes are acquired by the satellite on the same day. We want to group these together, and will use group_by='solar_day' to do this.

To load in the Sentinel data, we have a number of options we can use in building our data extraction query:

• lat/lon - specify a bounding box for the extraction. Note that a polygon can be used instead. See here for more details

• output_crs - the output coordinate reference system to project the data into. The CRS you request will impact on the format for the resolution query. Two handy CRSs are ‘EPSG:3577’, which is the Australian Albers projected coordinate system, and ‘EPSG:4325’, which is WGS84 a global geographic coordinate system.

• resolution - the requested output resolution for the data. If you have selected a geographic coordinate system for the output_crs, this will be in degrees. If you have selected a projected coordinate system, this will be in metres.

• time - the time range for the query.

[3]:

query = {
'lat': (-35.27, -35.33),
'lon': (149.07, 149.15),
'output_crs': 'EPSG:3577',
'resolution': (-20, 20),
'time': ('2017-01-01', '2017-02-01')
}



### See what came back from the extraction¶

[4]:

Canberra

[4]:

<xarray.Dataset>
Dimensions:                (time: 2, x: 405, y: 380)
Coordinates:
* time                   (time) datetime64[ns] 2017-01-05T00:02:12.026000 2017-01-15T00:02:11.026000
* y                      (y) float64 -3.956e+06 -3.956e+06 ... -3.963e+06
* x                      (x) float64 1.544e+06 1.544e+06 ... 1.553e+06
Data variables:
azimuthal_exiting      (time, y, x) float32 24.900135 ... 127.58696
azimuthal_incident     (time, y, x) float32 23.90676 39.06566 ... 149.86623
exiting                (time, y, x) float32 10.388568 13.48787 ... 2.201671
incident               (time, y, x) float32 36.450188 ... 30.279655
relative_azimuth       (time, y, x) float32 26.755356 ... 28.386116
relative_slope         (time, y, x) float32 -0.9933758 ... 22.279266
satellite_azimuth      (time, y, x) float32 100.571175 ... 100.57671
satellite_view         (time, y, x) float32 4.7305017 4.72948 ... 4.0362
solar_azimuth          (time, y, x) float32 73.81582 73.81592 ... 72.1906
solar_zenith           (time, y, x) float32 30.208754 ... 31.745928
terrain_shadow         (time, y, x) uint8 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1
fmask                  (time, y, x) uint8 2 2 2 2 2 2 2 2 ... 2 2 2 2 2 2 2
nbar_contiguity        (time, y, x) uint8 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1
nbar_coastal_aerosol   (time, y, x) int16 822 807 784 784 ... 442 442 442
nbar_blue              (time, y, x) int16 903 863 858 824 ... 480 300 279
nbar_green             (time, y, x) int16 1051 1029 1013 987 ... 591 467 412
nbar_red               (time, y, x) int16 1301 1269 1249 ... 618 343 367
nbar_red_edge_1        (time, y, x) int16 1619 1539 1494 ... 648 805 685
nbar_red_edge_2        (time, y, x) int16 1927 1862 1838 ... 1503 2163 1953
nbar_red_edge_3        (time, y, x) int16 2308 2199 2170 ... 1916 2850 2938
nbar_nir_1             (time, y, x) int16 2456 2350 2423 ... 1915 3377 2475
nbar_nir_2             (time, y, x) int16 2658 2536 2533 ... 2198 3281 3346
nbar_swir_2            (time, y, x) int16 2862 2780 2707 ... 1387 1525 1585
nbar_swir_3            (time, y, x) int16 1817 1732 1675 ... 846 826 761
nbart_contiguity       (time, y, x) uint8 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1
nbart_coastal_aerosol  (time, y, x) int16 867 862 861 861 ... 432 432 432
nbart_blue             (time, y, x) int16 954 936 946 917 ... 458 293 278
nbart_green            (time, y, x) int16 1113 1118 1119 ... 564 454 409
nbart_red              (time, y, x) int16 1384 1389 1392 ... 587 333 365
nbart_red_edge_1       (time, y, x) int16 1712 1679 1659 ... 616 780 676
nbart_red_edge_2       (time, y, x) int16 2037 2033 2040 ... 1429 2096 1928
nbart_red_edge_3       (time, y, x) int16 2402 2340 2336 ... 1841 2781 2909
nbart_nir_1            (time, y, x) int16 2566 2504 2614 ... 1839 3304 2464
nbart_nir_2            (time, y, x) int16 2767 2697 2725 ... 2113 3202 3314
nbart_swir_2           (time, y, x) int16 3031 3041 3013 ... 1317 1476 1564
nbart_swir_3           (time, y, x) int16 1931 1906 1879 ... 801 798 751
Attributes:
crs:      EPSG:3577


### Sentinel 2 bands¶

Sentinel 2 satellites have 13 spectral channels:

Sentinel 2 bands

DEA band name

Band number

Central wavelength (nm)

Resolution (m)

Bandwidth (nm)

Coastal aerosol

nbar(t)_coastal_aerosol

1

443

60

20

Blue

nbar(t)_blue

2

490

10

65

Green

nbar(t)_green

3

560

10

35

Red

nbar(t)_red

4

665

10

30

Vegetation red edge

nbar(t)_red_edge_1

5

705

20

15

Vegetation red edge

nbar(t)_red_edge_2

6

740

20

15

Vegetation red edge

nbar(t)_red_edge_3

7

783

20

20

NIR

nbar(t)_nir_1

8

842

10

115

Narrow NIR

nbar(t)_nir_2

8A

865

20

20

Water vapour

N/A

9

945

60

20

SWIR - Cirrus

N/A

10

1375

60

20

SWIR

nbar(t)_swir_2

11

1610

20

90

SWIR

nbar(t)_swir_3

12

2190

20

180

These bands cover the visible, near-infrared and short-wave infrared wave lengths.

Note There are a number of additional datasets that are alse returned from our query. These are a combination of datasets used in the correction process, and layers relating to pixel quality. We will explore the pixel quality layers below.

### Data corrections¶

There are two corrections applied to the Sentinel data:

1. NBAR (e.g. nbar_green)
NBAR stands for Nadir-corrected BRDF Adjusted Reflectance, where BRDF stands for Bidirectional reflectance distribution function The approach involves atmospheric correction to compute surface-leaving radiance, and bi-directional reflectance modelling to remove the effects of topography and angular variation in reflectance.
2. T_NBAR (e.g. nbart_green)
Surface reflectance T_NBAR includes the terrain illumination reflectance correction and has the same features of NBAR, along with some additional features.

For more information on these corrections, see the explanation on the DEA surface reflectance page.

### Dealing with missing data¶

Note that the nbart data insert a missing value (-9999) into the dataset to denote a terrain shadow. This should be converted to a NaN value before using this data to avoid treating it as a valid value:

[5]:

Canberra = masking.mask_invalid_data(Canberra)


### Plot as a true colour image¶

True colour images approximate what the human eye sees when looking at a landscape. Note that the function used for this true colour plot enhances the contrast between the bands, resulting in a colour-enhanced image.

[6]:

DEAPlotting.rgb(Canberra, bands=['nbart_red', 'nbart_green', 'nbart_blue'], index=1)


### Plot as a false colour image¶

This plot uses the SWIR, NIR and green bands to accentuate the presence of water in the landscape.

[7]:

DEAPlotting.rgb(Canberra, bands=['nbart_swir_2', 'nbart_nir_1', 'nbart_green'], index=1)


### Compare observations side-by-side¶

[17]:

# Select multiple images to plot by giving a list of values to index.
# This example plots the first and the second image in the dataset:
DEAPlotting.rgb(Canberra, bands=['nbart_swir_2', 'nbart_nir_1', 'nbart_green'], index=[0, 1])


## Filter scenes using pixel quality¶

A pixel quality dataset (fmask) is provided alongside the spectral data, which can be used to filter out noisy pixels. The key things we want to filter are clouds and shadows.

The fmask field contains values from 0 to 5.

Value

Description

0

Null

1

Valid

2

Cloud

3

4

Snow

5

Water

Here is the cloudy image we would like to mask:

[9]:

DEAPlotting.rgb(Canberra, bands=['nbar_swir_2', 'nbar_nir_1', 'nbar_green'], index=0)


### Plot pixel quality information¶

Here we plot pixel quality information that corresponds to the observation above. This data is stored in the fmask variable:

[10]:

Canberra.isel(time=0).fmask.plot(figsize=[10, 10])

[10]:

<matplotlib.collections.QuadMesh at 0x7fe6ccc2ed30>


### Use pixel quality information to create a mask, and apply it to the spectral data¶

We want to filter out no data, cloud and cloud shadow, so we will use the fmask values 0, 2 and 3 for our filter.

[11]:

# Numpy's isin function tests whether a pixel's value is in test_elements. Because we want to
# discard pixels where this is the case (because these pixels represent nodata, cloud or cloud shadow),
# we also use invert=True to keep only good quality pixels
clear_pixels = np.isin(Canberra.fmask, test_elements=[0, 2, 3], invert=True)

# Use clear pixels as a mask to keep only clear pixels
Clear_Canberra = Canberra.where(clear_pixels)


[12]:

DEAPlotting.rgb(Clear_Canberra, bands=['nbart_swir_2', 'nbart_nir_1', 'nbart_green'], index=0)


## Use load_clearsentinel2 to automatically load cloud free observations¶

To make loading Sentinel 2 data more convenient, the load_clearsentinel2 function was written to automate common tasks like returning cloud-free time series:

[13]:

query = {
'lat': (-35.27, -35.33),
'lon': (149.07, 149.15),
'output_crs': 'EPSG:3577',
'resolution': (-10, 10),
'time': ('2017-01-01', '2017-02-01')
}

# The load_clearsentinel2 can be given a query containing spatial and temporal bounds, and a list of bands to return.
# By setting mask_pixel_quality=True, the function will automatically remove nodata, cloud or cloud shadow pixels:
query=query,
bands_of_interest=('nbart_red', 'nbart_green', 'nbart_blue',
'nbart_nir_1', 'nbart_swir_2', 'nbart_swir_3'),

Loading s2a pixel quality
Skipping s2b; no valid data for query
Combining and sorting s2a data
Replacing invalid -999 values with NaN (data will be coerced to float64)


We can plot the same observation as earlier to verify we get the same result:

[14]:

DEAPlotting.rgb(Clear_Canberra, bands=['nbart_swir_2', 'nbart_nir_1', 'nbart_green'], index=0)


### Returning only cloud-free images¶

The load_clearsentinel2 function also allows you to return only images that contain greater than a given percentage of good quality pixels. For example, this can be used to return only cloud-free images. To do this, we specify a minimum proportion of good pixels using the masked_prop argument (e.g. specifying 0.3 below will return only images with more than 30% good quality, cloud-free pixels).

Please note: the Sentinel 2 cloud masking algorithm currently performs poorly over urban areas, where it commonly confuses buildings with clouds. Because of this, you make need to use a small value for masked_prop to ensure the algorithm does not discard an excessive number of images.

[15]:

Clear_Canberra = DEADataHandling.load_clearsentinel2(dc=dc,
query=query,
bands_of_interest=('nbart_red', 'nbart_green', 'nbart_blue',
'nbart_nir_1', 'nbart_swir_2', 'nbart_swir_3'),

Loading s2a pixel quality

[16]:

DEAPlotting.rgb(Clear_Canberra, bands=['nbart_swir_2', 'nbart_nir_1', 'nbart_green'], index=0)

[ ]: