ARD overpass predictor¶
Sign up to the DEA Sandbox to run this notebook interactively from a browser
Compatibility: Notebook currently compatible with the
Special requirements: This notebook loads data from an external .csv file (overpass_input.csv) from the Supplementary_data folder of this repository
Knowing the time of a satellite overpass (OP) at a field site is necessary for planning field work. Predicting overpass times for a single field site that receives only one overpass from a given satellite is easy, but it gets more complicated when you need overpass information across multiple field sites. Most sites lie within a single descending (daytime) satellite acquisition. For Landsat 8 this recurs every 16 days, and for Sentinel 2A and 2B it recurs every 10 days each.
This notebook outputs an ordered list of overpasses for given field sites, based on an initial overpass timestamp that you provide for each site. Output for multiple sites is ordered by date so that dual overpasses can be identified.
See “overpass_input.csv” as an example input file - this contains a number of field sites, with some that receive multiple acquisitions.
You must provide a start date + time for the overpass of the site you are interested in - go to https://nationalmap.gov.au/ and https://evdc.esa.int/orbit/ to get an overpass for your location (lat/long) and add this to the input file.
Take care with sites that receive multiple acquisitions, i.e. those that lie in the overlap of 2 acquisitions.
Specify an output file name
Provide an input file - this can be the example included or your own field site information
Specify which field sites receive extra overpasses in an orbital period, i.e. those that lie in the satellite imaging overlap
Specify output field sites - you can select a subset of sites of interest (if adding your site to the original example file)
Combine field site overpasses
If you wish to add or alter sites to predict for, edit the input file located at:
Then modify the input file path to match. Leave the 1st site in the input file (Mullion) in place, as the notebook requires it to order multiple overpasses.
To run the overpass predictor with the given input file, run all cells in the notebook starting with the “Load packages” cell.
import pandas as pd
import numpy as np
from datetime import datetime, timedelta
Specify output file name¶
This is the name of the .csv the notebook will output. This .csv will contain your combined predictions.
# You can rename the .csv to any desired output filename here. Default is "output_overpass_pred.csv"
output_path = '../Supplementary_data/ARD_overpass_predictor/output_overpass_pred.csv'
Specify input file containing initial overpass data for a given site¶
Read in base overpass file
Input date times must be in UTC
Make sure the necessary dates for your site are imported correctly - should be YYYY-MM-DD HH:MM:SS in 24hr format
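One way to check the date format before running the notebook is to parse each date column strictly and list any cell that fails. This is a minimal sketch rather than part of the original notebook: the helper name `find_bad_timestamps` and the inline sample data are hypothetical, and blank cells are assumed to be intentionally empty.

```python
import io
import pandas as pd

DATE_FORMAT = "%Y-%m-%d %H:%M:%S"  # YYYY-MM-DD HH:MM:SS, 24-hour clock


def find_bad_timestamps(df, columns, fmt=DATE_FORMAT):
    """Return {column: [site, ...]} for non-empty cells that do not match fmt."""
    bad = {}
    for col in columns:
        raw = df[col]
        parsed = pd.to_datetime(raw, format=fmt, errors="coerce")
        # A cell is bad if it held text but could not be parsed with fmt.
        failed = raw.notna() & parsed.isna()
        if failed.any():
            bad[col] = list(df.index[failed])
    return bad


# Hypothetical sample data standing in for overpass_input.csv
sample_csv = io.StringIO(
    "Site,landsat_8,sentinel_2a\n"
    "Mullion,2019-11-29 23:43:00,2019-12-20 23:58:00\n"
    "Lake_George,29/11/2019 11:43 PM,\n"
)
# Read everything as text so the format check sees the raw cell contents.
sample = pd.read_csv(sample_csv, index_col="Site", dtype=str)
bad_cells = find_bad_timestamps(sample, ["landsat_8", "sentinel_2a"])
print(bad_cells)
```

Any site listed in the result has a timestamp that should be corrected in the input file before the prediction cells are run.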
# Read in base file 'overpass_input.csv'
# Edit this file with new field sites, or load your own file with field sites as needed.
# You don't need to input Path / Row or Lat / Long. The notebook only requires start times for the relevant satellites.
base_file_path = '../Supplementary_data/ARD_overpass_predictor/overpass_input.csv'
overpass = pd.read_csv(base_file_path, index_col='Site',
                       parse_dates=['landsat_8', 'sentinel_2a', 'sentinel_2b',
                                    'landsat_8_2', 'sentinel_2a_2', 'sentinel_2b_2'])
overpass # "NaT" values are expected - indicates a date was not entered in input file.
| Site | Lat | Long | Path | Row | landsat_8 | landsat_8_2 | sentinel_2a | sentinel_2a_2 | sentinel_2b | sentinel_2b_2 |
| Mullion | -35.123 | 148.862 | 91 | 84 | 2019-11-29 23:43:00 | 2019-11-20 23:49:00 | 2019-12-20 23:58:00 | 2019-12-14 00:07:00 | 2019-12-15 23:58:00 | 2019-12-19 00:07:00 |
| Lake_George | -35.094 | 149.463 | 90 | 84 | 2019-11-29 23:43:00 | NaT | 2019-12-20 23:58:00 | NaT | 2019-12-15 23:58:00 | NaT |
| Narrabundah | -35.334 | 149.145 | 90 | 85 | 2019-11-29 23:43:00 | 2019-11-20 23:49:00 | 2019-12-20 23:58:00 | NaT | 2019-12-15 23:58:00 | NaT |
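The "NaT" entries show which sites have no secondary acquisition for a satellite. As a small sketch (not part of the original notebook), the sites that do receive extra overpasses can be listed by checking which secondary columns are filled in. The frame `overpass_demo` is a hypothetical stand-in holding a subset of the values shown above, and the "_2" suffix convention is taken from the column names used in the notebook.

```python
import pandas as pd

# Hypothetical stand-in for the loaded input table (subset of columns)
overpass_demo = pd.DataFrame(
    {
        "landsat_8": pd.to_datetime(["2019-11-29 23:43:00"] * 3),
        "landsat_8_2": pd.to_datetime(
            ["2019-11-20 23:49:00", None, "2019-11-20 23:49:00"]
        ),
        "sentinel_2a": pd.to_datetime(["2019-12-20 23:58:00"] * 3),
        "sentinel_2a_2": pd.to_datetime(["2019-12-14 00:07:00", None, None]),
    },
    index=pd.Index(["Mullion", "Lake_George", "Narrabundah"], name="Site"),
)

# Secondary-overpass columns end in "_2"; a non-NaT value means the site
# lies in the overlap of two acquisitions for that satellite.
secondary_cols = [c for c in overpass_demo.columns if c.endswith("_2")]
has_secondary = {
    col: list(overpass_demo.index[overpass_demo[col].notna()])
    for col in secondary_cols
}
print(has_secondary)
```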
# Set a list for field sites included in prediction. If adding to the field sites, please leave the 1st entry, "Mullion",
# as this is used to order secondary overpasses.
sites = list(overpass.index)
sites[0] # outputs 1st site. Should be 'Mullion'
# Set satellite timesteps for overpasses
# Depending on your application, you may want to add more accurate timesteps - but note that times used as inputs are
# for the beginning of the acquisition, not the actual overpass time at a specific site.
l8_timestep = timedelta(days=16)
s2a_timestep = timedelta(days=10)
s2b_timestep = timedelta(days=10)
# Specify start date
l8_startdate = overpass['landsat_8']
l8_startdate_2 = overpass['landsat_8_2']
sentinel2a_startdate = overpass['sentinel_2a']
sentinel2a_2_startdate = overpass['sentinel_2a_2']
sentinel2b_startdate = overpass['sentinel_2b']
sentinel2b_2_startdate = overpass['sentinel_2b_2']
sentinel2a_startdate
Site
Mullion       2019-12-20 23:58:00
Lake_George   2019-12-20 23:58:00
Narrabundah   2019-12-20 23:58:00
Name: sentinel_2a, dtype: datetime64[ns]
The following cell calculates times for Landsat and Sentinel overpasses for field sites. Each predicted time is the start time plus the iteration number multiplied by the overpass frequency (revisit period).
Overpass frequency is 16 days for Landsat 8 and 10 days for Sentinel 2A and 2B. Increasing the number X in “for i in range(X):” increases the number of iterations predicted per satellite. X is initially set to 20 for L8 and 32 for Sentinel so that both cover the same nominal period (20 × 16 = 32 × 10 = 320 days).
Predictions are also calculated for secondary overpasses for each site.
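The arithmetic behind each prediction can be sketched for a single site with plain `datetime` objects. This is an illustrative sketch only: the function name `predict_overpasses` is hypothetical and does not appear in the notebook, and the start times are the Mullion values from the example input.

```python
from datetime import datetime, timedelta


def predict_overpasses(start, revisit_days, n):
    """Return n overpass times: start + i * revisit period, for i = 0..n-1."""
    step = timedelta(days=revisit_days)
    return [start + step * i for i in range(n)]


# Landsat 8: 16-day revisit, 20 iterations
l8 = predict_overpasses(datetime(2019, 11, 29, 23, 43), revisit_days=16, n=20)
# Sentinel 2A: 10-day revisit, 32 iterations
s2a = predict_overpasses(datetime(2019, 12, 20, 23, 58), revisit_days=10, n=32)

print(l8[:3])   # first three Landsat 8 predictions
print(s2a[:3])  # first three Sentinel 2A predictions
```

Because the first iteration (i = 0) is the start time itself, the last predicted overpass falls one revisit period short of the nominal 320 days.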
# Landsat 8 overpass prediction for 20*(overpass frequency) - ie 20*16 = 320 days. You can change this as desired,
# to get overpass predictions for n days. n = x*(OP freq). X is arbitrarily set to 20 to give 320 days as a base example
landsat = []
for i in range(20):
    landsat.append(l8_startdate + l8_timestep * i)
landsat = pd.DataFrame(landsat)

# Sentinel 2a overpass prediction for 32 * the overpass frequency, this is to give a similar total time to the L8 prediction.
# OP frequency for Sentinel = 10 days, n days = 32*10 = 320
Sentinel_2A = []
for i in range(32):
    Sentinel_2A.append(sentinel2a_startdate + s2a_timestep * i)
Sentinel_2A = pd.DataFrame(Sentinel_2A)

# Sentinel 2b
Sentinel_2B = []
for i in range(32):
    Sentinel_2B.append(sentinel2b_startdate + s2b_timestep * i)
Sentinel_2B = pd.DataFrame(Sentinel_2B)

# Landsat_2
# Prediction for L8 overpasses at sites which are covered by more than 1 overpass in a 16-day period.
# - Mullion and Narrabundah in the example rows shown above
landsat_2 = []
for i in range(20):
    landsat_2.append(l8_startdate_2 + l8_timestep * i)
landsat_2 = pd.DataFrame(landsat_2)

# Sentinel_2A_2
# Sentinel 2a secondary (Mullion)
Sentinel_2A_2 = []
for i in range(32):
    Sentinel_2A_2.append(sentinel2a_2_startdate + s2a_timestep * i)
Sentinel_2A_2 = pd.DataFrame(Sentinel_2A_2)

# Sentinel_2B_2
# Sentinel 2b secondary (Mullion in the example rows shown above)
Sentinel_2B_2 = []
for i in range(32):
    Sentinel_2B_2.append(sentinel2b_2_startdate + s2b_timestep * i)
Sentinel_2B_2 = pd.DataFrame(Sentinel_2B_2)
Append dataframes for sites that receive more than 1 overpass in an orbital period¶
If adding sites to the input file, please leave in “Mullion” (the 1st entry), as secondary overpasses are ordered by this site.
# pd.concat is used throughout this cell because DataFrame.append was removed in pandas 2.0
# Combine Landsat 8 data (base plus extra overpasses)
L8_combined2 = pd.concat([landsat, landsat_2])
drop_label_L8 = L8_combined2.reset_index(drop=True)
L8_combined2 = drop_label_L8.sort_values(by='Mullion')
L8_combined2.index.names = ['Landsat_8']

# Combine Sentinel 2A data (base plus extra overpasses)
S2A_combined = pd.concat([Sentinel_2A, Sentinel_2A_2])
drop_label_S2A = S2A_combined.reset_index(drop=True)
S2A_combined = drop_label_S2A.sort_values(by='Mullion')
S2A_combined.index.names = ['Sentinel_2A']

# Combine Sentinel 2B data (base plus extra overpasses)
S2B_combined = pd.concat([Sentinel_2B, Sentinel_2B_2])
drop_label_S2B = S2B_combined.reset_index(drop=True)
S2B_combined = drop_label_S2B.sort_values(by='Mullion')
S2B_combined.index.names = ['Sentinel_2B']

# Add satellite label to each entry in "Sat" column
S2A_combined['Sat'] = 'S2A'
S2B_combined['Sat'] = 'S2B'
L8_combined2['Sat'] = 'L8'

combined1 = pd.concat([S2B_combined, S2A_combined])
combined = pd.concat([combined1, L8_combined2])
df = pd.DataFrame(combined)
# Use a "time dummy" to force a re-order of data by date, to standardise across dataframes.
today = datetime.today()
timedummy = []
t0 = datetime(today.year, today.month, 1)
dummystep = timedelta(days=2)
for i in range(300):
    timedummy.append(t0 + dummystep * i)
timedummy = pd.DataFrame(timedummy)
df['DateStep'] = timedummy
Output field sites¶
For new field sites, comment out (“#”) the old sites and add the new ones in the same format.
# Create a new df for each field site of interest. To add a new field site, add it in the same format, ie
# "Field_Site = df[['Field_Site', 'Sat', 'DateStep']].copy()" - if you are appending to the original file simply add "site4 = sites[3]"
# and "Site4 = df[[(site4), 'Sat', 'DateStep']].copy()" in the appropriate section below.
site1 = sites[0]
site2 = sites[1]
site3 = sites[2]
#site4 = sites[3]

Site1 = df[[(site1), 'Sat', 'DateStep']].copy()
Site2 = df[[(site2), 'Sat', 'DateStep']].copy()
Site3 = df[[(site3), 'Sat', 'DateStep']].copy()
#Site4 = df[[(site4), 'Sat', 'DateStep']].copy()
# Reorder by date for each site, include satellite tags for each date.
# Add new lines for new sites as needed.
Site1 = (Site1.sort_values(by=[(site1), 'DateStep'])).reset_index(drop=True)
Site2 = (Site2.sort_values(by=[(site2), 'DateStep'])).reset_index(drop=True)
Site3 = (Site3.sort_values(by=[(site3), 'DateStep'])).reset_index(drop=True)
#Site4 = (Site4.sort_values(by=[(site4), 'DateStep'])).reset_index(drop=True)
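As an alternative sketch (this is not how the notebook itself works), the per-site ordering can also be expressed by reshaping the combined table into a long format with one row per site and overpass, then sorting within each site. The frame `wide` below is a hypothetical miniature of the combined dataframe, and the column name "Overpass" is chosen here for illustration.

```python
import pandas as pd

# Hypothetical miniature of the combined table: one column per site plus "Sat"
wide = pd.DataFrame(
    {
        "Mullion": pd.to_datetime(
            ["2019-11-29 23:43:00", "2019-11-20 23:49:00", "2019-12-15 23:58:00"]
        ),
        "Lake_George": pd.to_datetime(
            ["2019-11-29 23:43:00", None, "2019-12-15 23:58:00"]
        ),
        "Sat": ["L8", "L8", "S2B"],
    }
)

# Reshape to one row per (site, overpass), drop missing secondary overpasses,
# then order by time within each site.
tidy = (
    wide.melt(id_vars="Sat", var_name="Site", value_name="Overpass")
    .dropna(subset=["Overpass"])
    .sort_values(["Site", "Overpass"])
    .reset_index(drop=True)
)
print(tidy)
```

In this layout each site is sorted on its own times, so adding a site does not require a new block of per-site lines.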
Optional - output individual field site.¶
If you wish to output multiple field sites, skip this step.
# Use this cell to output an individual field site
# Write an individual site to .csv: remove the "#" symbol below, or add a new line for a new site in this format:
#Site1.to_csv('Site1.csv')
#Site2.to_csv('Site2.csv')
The overpass predictions are combined and output to the specified “output_path”. Times are in the same timezone as the input file. The example file used is in UTC, but you can also run the notebook on local times.
To convert UTC (or whatever your input time zone was) to local time, add the relevant offset to the times: for example, 10 hours for AEST or 11 hours for AEDT. The cell below contains this step, commented out. Remove the “#” from the line for your desired column and run the cell to add the offset.
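A fixed offset does not follow daylight saving changes over a prediction period of several months. As a hedged alternative, pandas can convert using a named time zone so that AEST and AEDT are applied automatically. The sketch below assumes the input times are UTC and that the site follows the "Australia/Sydney" zone, which is an assumption here; substitute the zone for your own site.

```python
import pandas as pd

# Two UTC times, one in summer and one in winter
utc_times = pd.Series(
    pd.to_datetime(["2019-12-20 23:58:00", "2020-06-17 23:58:00"])
)

# Mark the naive times as UTC, then convert to the local zone.
local = utc_times.dt.tz_localize("UTC").dt.tz_convert("Australia/Sydney")

# Optionally drop the zone information again, leaving local wall-clock times.
local_naive = local.dt.tz_localize(None)
print(local_naive)
```

The summer time shifts by 11 hours (AEDT) and the winter time by 10 hours (AEST).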
Yay, you have a list of overpasses!
combined_sites = [Site1, Site2, Site3]
# If you have added 1 or more sites, add them here. An example is provided below for a total of 5 sites.
#combined_sites = [Site1, Site2, Site3, Site4, Site5]

merged = pd.concat(combined_sites, axis=1)
merged = merged.rename_axis("Overpass", axis="columns")
output = merged.drop(['DateStep'], axis=1)

# Add a time offset: remove the "#" as applicable - must include the column/site name.
# timedelta is imported directly in the "Load packages" cell, so it is called without the "datetime." prefix.
#output['Mullion'] = output['Mullion'] + timedelta(hours=10)
#output['Lake_George'] = output['Lake_George'] + timedelta(hours=10)
#output['Narrabundah'] = output['Narrabundah'] + timedelta(hours=10)

output.to_csv(output_path)
output.head(20)
| Overpass | Mullion | Sat | Lake_George | Sat | Narrabundah | Sat |
| 0 | 2019-11-20 23:49:00 | L8 | 2019-11-29 23:43:00 | L8 | 2019-11-20 23:49:00 | L8 |
| 1 | 2019-11-29 23:43:00 | L8 | 2019-12-15 23:43:00 | L8 | 2019-11-29 23:43:00 | L8 |
| 2 | 2019-12-06 23:49:00 | L8 | 2019-12-15 23:58:00 | S2B | 2019-12-06 23:49:00 | L8 |
| 3 | 2019-12-14 00:07:00 | S2A | 2019-12-20 23:58:00 | S2A | 2019-12-15 23:43:00 | L8 |
| 4 | 2019-12-15 23:43:00 | L8 | 2019-12-25 23:58:00 | S2B | 2019-12-15 23:58:00 | S2B |
| 5 | 2019-12-15 23:58:00 | S2B | 2019-12-30 23:58:00 | S2A | 2019-12-20 23:58:00 | S2A |
| 6 | 2019-12-19 00:07:00 | S2B | 2019-12-31 23:43:00 | L8 | 2019-12-22 23:49:00 | L8 |
| 7 | 2019-12-20 23:58:00 | S2A | 2020-01-04 23:58:00 | S2B | 2019-12-25 23:58:00 | S2B |
| 8 | 2019-12-22 23:49:00 | L8 | 2020-01-09 23:58:00 | S2A | 2019-12-30 23:58:00 | S2A |
| 9 | 2019-12-24 00:07:00 | S2A | 2020-01-14 23:58:00 | S2B | 2019-12-31 23:43:00 | L8 |
| 10 | 2019-12-25 23:58:00 | S2B | 2020-01-16 23:43:00 | L8 | 2020-01-04 23:58:00 | S2B |
| 11 | 2019-12-29 00:07:00 | S2B | 2020-01-19 23:58:00 | S2A | 2020-01-07 23:49:00 | L8 |
| 12 | 2019-12-30 23:58:00 | S2A | 2020-01-24 23:58:00 | S2B | 2020-01-09 23:58:00 | S2A |
| 13 | 2019-12-31 23:43:00 | L8 | 2020-01-29 23:58:00 | S2A | 2020-01-14 23:58:00 | S2B |
| 14 | 2020-01-03 00:07:00 | S2A | 2020-02-01 23:43:00 | L8 | 2020-01-16 23:43:00 | L8 |
| 15 | 2020-01-04 23:58:00 | S2B | 2020-02-03 23:58:00 | S2B | 2020-01-19 23:58:00 | S2A |
| 16 | 2020-01-07 23:49:00 | L8 | 2020-02-08 23:58:00 | S2A | 2020-01-23 23:49:00 | L8 |
| 17 | 2020-01-08 00:07:00 | S2B | 2020-02-13 23:58:00 | S2B | 2020-01-24 23:58:00 | S2B |
| 18 | 2020-01-09 23:58:00 | S2A | 2020-02-17 23:43:00 | L8 | 2020-01-29 23:58:00 | S2A |
| 19 | 2020-01-13 00:07:00 | S2A | 2020-02-18 23:58:00 | S2A | 2020-02-01 23:43:00 | L8 |
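Because each site's overpasses are ordered by date, near-coincident overpasses from different satellites sit in adjacent rows. A minimal sketch of picking them out is given below, assuming that a "dual overpass" means two different satellites passing within a short window; the 1-hour window and the variable names are illustrative choices, not values from the notebook. The data are the first eight Mullion rows from the output above.

```python
import pandas as pd

# First eight Mullion rows from the output table
mullion = pd.DataFrame(
    {
        "Overpass": pd.to_datetime(
            [
                "2019-11-20 23:49:00", "2019-11-29 23:43:00",
                "2019-12-06 23:49:00", "2019-12-14 00:07:00",
                "2019-12-15 23:43:00", "2019-12-15 23:58:00",
                "2019-12-19 00:07:00", "2019-12-20 23:58:00",
            ]
        ),
        "Sat": ["L8", "L8", "L8", "S2A", "L8", "S2B", "S2B", "S2A"],
    }
)

window = pd.Timedelta(hours=1)  # assumed tolerance for a "dual" overpass

# Time since the previous overpass, and which satellite made it
gap = mullion["Overpass"].diff()
prev_sat = mullion["Sat"].shift()

# Keep rows that closely follow an overpass by a different satellite
is_dual = (gap <= window) & (mullion["Sat"] != prev_sat)
dual = mullion[is_dual].assign(Previous_sat=prev_sat[is_dual], Gap=gap[is_dual])
print(dual)
```

For these rows the only match is the Sentinel 2B overpass on 2019-12-15, which follows a Landsat 8 overpass by 15 minutes.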
Contact: If you need assistance, please post a question on the Open Data Cube Slack channel or on the GIS Stack Exchange using the open-data-cube tag (you can view previously asked questions here). If you would like to report an issue with this notebook, you can file one on
Last modified: September 2021
Tags: sandbox compatible, sentinel 2, landsat 8