Produce multiband GeoTIFFs for each Sentinel2 observation, given a shapefile

Authors:

Bex Dunn

Created:

Feb 22, 2019

Last edited:

Oct 16, 2019

Requirements:

You need to run the following commands from the command line prior to launching jupyter notebooks from the same terminal so that the required libraries and paths are set:

module use /g/data/v10/public/modules/modulefiles

module load dea

This notebook assumes you have cloned the dea-notebooks repository by following the instructions on DEA notebooks using command line git or DEA notebooks using Github. If you didn’t, you will need to download any scripts that fail in the first cell into your relative directory so that the notebook will have access to those scripts. "../" means the directory above the directory you are working in, and "./" means the current directory.

If you find an error or bug in this notebook, please either create an ‘Issue’ in the Github repository, or fix it yourself and create a ‘Pull’ request to contribute the updated notebook back into the repository (See the repository README for instructions on creating a Pull request).

Background:

Data from the [Copernicus Sentinel-2] satellite missions (https://sentinel.esa.int/web/sentinel/missions/sentinel-2) are accessible through Digital Earth Australia (DEA). More information on the Sentinel2 datasets can be found in the [Introduction to Sentinel 2 notebook in dea-notebooks](https://github.com/GeoscienceAustralia/dea-notebooks/blob/master/02_DEA_datasets/Introduction_to_Sentinel2.ipynb

What does this notebook do?:

This notebook uses theload_clearsentinel2 function to import a time series of cloud-free observations from both Sentinel 2 satellites (i.e. S2A and S2B) as a single combined xarray dataset. It creates an output textfile listing the bands in the GeoTIFF and writes the data for each date to a separate GeoTIFF.

**Tags**: :index:`Sentinel2`, :index:`DEAPlotting`, :index:`rgb`, :index:`dc.load`, :index:`query`, :index:`plot`, :index:`polygondrill`, :index:`shapefile`, :index:`geopolygon`, :index:`datacube.utils.geometry`, :index:`fiona`, :index:`rasterio`, :index:`query`, :index:`Scripts`,:index:`tasseled_cap`, :index:`DEADataHandling`, :index:`DEAPlotting`, :index:`masking`,:index:`make mask`,:index:`write_geotiff`, :index:`multi_timesteps_geotiffs`, :index:`load_clearsentinel2`, :index:`Scripts`, :index:`Outputting data`

import modules and scripts

[1]:
%matplotlib inline

from datacube import Datacube
import datetime
import fiona
import geopandas as gpd
import numpy as np
import pandas as pd
import rasterio
from shapely import geometry
import seaborn as sns
import sys
import xarray as xr

import matplotlib.dates as mdates
import matplotlib.gridspec as gridspec
import matplotlib.pyplot as plt

from datacube.storage import masking
from datacube.utils import geometry
from datacube.helpers import ga_pq_fuser, write_geotiff

# Point this to where you have the algorithms from the dea-notebooks/algorithms saved
sys.path.append('../10_Scripts')

import DEADataHandling, DEAPlotting
dc = Datacube(app='Sentinel2')

set path to polygon here

[2]:
#change the path here if you want a different polygon
poly_path = '/g/data/r78/rjd547/groundwater_activities/Burdekin/Burdekin_shapefiles/reeves_lake_for_demo.shp'

plot polygon to check it looks ok

[3]:
#plot polygon to check it looks ok
plt.clf()
shape_plot = gpd.read_file(poly_path)
shape_plot.plot()
plt.show()
<Figure size 432x288 with 0 Axes>
../../_images/notebooks_08_Outputting_data_Sentinel2ScenesToGeoTIFF_8_1.png
[4]:
GEOM, SHAPE_NAME = DEADataHandling.open_polygon_from_shapefile(poly_path)

query = {
    'time': ('2016-05-01', '2016-05-30'),
    'geopolygon': GEOM,
    'output_crs': 'EPSG:3577',
    'resolution': (-10, 10)
}

Load cloud free Sentinel data for all sensors (S2A, S2B) for the above query. Setting satellite_metadata=True will return the data with a variable that gives the abbreviation of the satellite that made the observation. Masked_prop allows removing timesteps with missing data greater than masked proportion, useful when animating/communicating results.

[5]:
s2 = DEADataHandling.load_clearsentinel2(dc=dc, query=query, sensors=['s2a', 's2b'],
                                                  bands_of_interest=['fmask',
                                                                     'nbart_blue',
                                                                     'nbart_green',
                                                                     'nbart_red',
                                                                     'nbart_red_edge_1',
                                                                     'nbart_red_edge_2',
                                                                     'nbart_red_edge_3',
                                                                     'nbart_nir_1',
                                                                     'nbart_nir_2',
                                                                     'nbart_swir_2',
                                                                     'nbart_swir_3'],
                                                  masked_prop=0,
                                                  satellite_metadata=True)
Loading s2a pixel quality
    Loading 3 filtered s2a timesteps
Loading s2b pixel quality
    Skipping s2b; no valid data for query
Combining and sorting s2a data
    Replacing invalid -999 values with NaN (data will be coerced to float64)

Set up save file parameters

[7]:
#get polygon name from the polygon path
polyname = poly_path.split('/')[-1].split('.')[0]
[8]:
savefilepath = '/g/data/r78/rjd547/WaterCompHackFeb2019/Sentinel2Data/'+polyname
[9]:
filename=savefilepath
[10]:
filename
[10]:
'/g/data/r78/rjd547/WaterCompHackFeb2019/Sentinel2Data/reeves_lake_for_demo'

The new Sentinel data has multiple data types. GeoTIFFs need only one data type. So we coerce the data to one dtype to write to GeoTIFF.

[11]:
s2['fmask']=s2['fmask'].astype(np.float64)

we also drop two information layers that didn’t coerce well

[12]:
s2 = s2.drop(['satellite','data_perc'])

Export multiple geotiffs, one for each time-slice of an xarray

If we want to export all of the time steps in a dataset as geotiffs, we can wrap our write_geotiff function in a for-loop.

[13]:
for i in range(len(s2.time)):

    # We'll use the date of the satellite image to name the geotiff
    date = str(s2.isel(time=i).time.data)[:-19]
    print(f'Writing {date}')

    # Write geotiff
    write_geotiff(f'{filename}{date}.tif', s2.isel(time=i))
Writing 2016-05-09
Writing 2016-05-19
Writing 2016-05-29