{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Introduction to loading data \n",
"\n",
"* **[Sign up to the DEA Sandbox](https://app.sandbox.dea.ga.gov.au/)** to run this notebook interactively from a browser\n",
"* **Compatibility:** Notebook currently compatible with both the `NCI` and `DEA Sandbox` environments\n",
"* **Products used:** \n",
"[ga_ls7e_gm_cyear_3](https://explorer.dea.ga.gov.au/ga_ls7e_gm_cyear_3),\n",
"[ga_ls8cls9c_gm_cyear_3](https://explorer.dea.ga.gov.au/ga_ls8cls9c_gm_cyear_3)\n",
"* **Prerequisites:** Users of this notebook should have a basic understanding of:\n",
" * How to run a [Jupyter notebook](01_Jupyter_notebooks.ipynb)\n",
" * The basic structure of the DEA [satellite datasets](02_DEA.ipynb)\n",
" * Inspecting available [DEA products and measurements](03_Products_and_measurements.ipynb)\n",
" "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Background\n",
"Loading data from the [Digital Earth Australia (DEA)](https://www.ga.gov.au/dea) instance of the [Open Data Cube](https://www.opendatacube.org/) requires the construction of a query that specifies the what, where, and when of the data request.\n",
"Each query returns a [multi-dimensional xarray object](http://xarray.pydata.org/en/stable/) containing the contents of your query.\n",
"It is essential to understand the `xarray` data structures as they are fundamental to the structure of data loaded from the datacube.\n",
"Manipulations, transformations and visualisation of `xarray` objects provide datacube users with the ability to explore and analyse DEA datasets, as well as pose and answer scientific questions."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Description\n",
"This notebook will introduce how to load data from the DEA datacube through the construction of a query and use of the `dc.load()` function.\n",
"Topics covered include:\n",
"\n",
"1. Loading data using `dc.load()`\n",
"2. Interpreting the resulting `xarray.Dataset` object\n",
" * Inspecting an individual `xarray.DataArray`\n",
"3. Customising parameters passed to the `dc.load()` function\n",
" * Loading specific measurements\n",
" * Loading data for coordinates in a custom coordinate reference system (CRS)\n",
" * Projecting data to a new CRS and spatial resolution \n",
" * Specifying a specific spatial resampling method\n",
"4. Loading data using a reusable dictionary query\n",
"5. Loading matching data from multiple products using `like`\n",
"6. Adding a progress bar to the data load\n",
"\n",
"***"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Getting started\n",
"To run this introduction to loading data from DEA, run all the cells in the notebook starting with the \"Load packages\" cell. For help with running notebook cells, refer back to the [Jupyter Notebooks notebook](01_Jupyter_notebooks.ipynb)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Load packages\n",
"The `datacube` package is required to query the datacube database and load some data. \n",
"The `with_ui_cbk` function from `odc.ui` enables a progress bar when loading large amounts of data."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import datacube\n",
"from odc.ui import with_ui_cbk"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Connect to the datacube\n",
"The next step is to connect to the datacube database.\n",
"The resulting `dc` datacube object can then be used to load data.\n",
"The `app` parameter is a unique name used to identify the notebook that does not have any effect on the analysis."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"dc = datacube.Datacube(app=\"04_Loading_data\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Loading data using `dc.load()`\n",
"\n",
"Loading data from the datacube uses the [dc.load()](https://datacube-core.readthedocs.io/en/latest/api/indexed-data/generate/datacube.Datacube.load.html) function.\n",
"\n",
"The function requires the following minimum arguments:\n",
"\n",
"* `product`: The data product to load (to revise DEA products, see the [Products and measurements](03_Products_and_measurements.ipynb) notebook).\n",
"* `x`: The spatial region in the *x* dimension. By default, the *x* and *y* arguments accept queries in a geographical co-ordinate system WGS84, identified by the EPSG code *4326*.\n",
"* `y`: The spatial region in the *y* dimension. The dimensions ``longitude``/``latitude`` and ``x``/``y`` can be used interchangeably.\n",
"* `time`: The temporal extent. The time dimension can be specified using a tuple of datetime objects or strings in the \"YYYY\", \"YYYY-MM\" or \"YYYY-MM-DD\" format. \n",
"\n",
"For example, to load 2015 data from the [Landsat 8 NBAR-T annual geomedian product](https://explorer.dea.ga.gov.au/ga_ls8c_nbart_gm_cyear_3) for Moreton Bay in southern Queensland, use the following parameters:\n",
"\n",
"* `product`: `ga_ls8cls9c_gm_cyear_3`\n",
"* `x`: `(153.3, 153.4)`\n",
"* `y`: `(-27.5, -27.6)`\n",
"* `time`: `(\"2015-01-01\", \"2015-12-31\")`\n",
"\n",
"Run the following cell to load all datasets from the `ga_ls8cls9c_gm_cyear_3` product that match this spatial and temporal extent:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"
<xarray.Dataset> Size: 4MB\n", "Dimensions: (time: 1, y: 424, x: 384)\n", "Coordinates:\n", " * time (time) datetime64[ns] 8B 2015-07-02T11:59:59.999999\n", " * y (y) float64 3kB -3.156e+06 -3.156e+06 ... -3.168e+06\n", " * x (x) float64 3kB 2.067e+06 2.067e+06 ... 2.079e+06 2.079e+06\n", " spatial_ref int32 4B 3577\n", "Data variables:\n", " nbart_blue (time, y, x) int16 326kB 469 471 475 480 ... 313 277 257 269\n", " nbart_green (time, y, x) int16 326kB 510 513 518 524 ... 489 431 363 366\n", " nbart_red (time, y, x) int16 326kB 232 235 238 241 ... 376 332 311 322\n", " nbart_nir (time, y, x) int16 326kB 94 94 95 96 ... 2691 2437 2132 2237\n", " nbart_swir_1 (time, y, x) int16 326kB 56 57 56 57 ... 1432 1177 1018 1059\n", " nbart_swir_2 (time, y, x) int16 326kB 46 46 46 48 47 ... 716 579 490 496\n", " sdev (time, y, x) float32 651kB 0.003487 0.003262 ... 0.001991\n", " edev (time, y, x) float32 651kB 133.7 127.6 125.6 ... 176.0 169.4\n", " bcdev (time, y, x) float32 651kB 0.09439 0.09096 ... 0.0404 0.03889\n", " count (time, y, x) int16 326kB 16 16 16 16 16 15 ... 12 12 12 12 12\n", "Attributes:\n", " crs: EPSG:3577\n", " grid_mapping: spatial_ref
<xarray.DataArray 'nbart_nir' (time: 1, y: 424, x: 384)> Size: 326kB\n", "array([[[ 94, 94, 95, ..., 79, 87, 90],\n", " [ 94, 92, 93, ..., 84, 85, 118],\n", " [ 93, 90, 91, ..., 79, 82, 136],\n", " ...,\n", " [3174, 2840, 2626, ..., 2070, 2375, 2466],\n", " [2776, 2905, 2660, ..., 2076, 2284, 2489],\n", " [2516, 2828, 2621, ..., 2437, 2132, 2237]]], dtype=int16)\n", "Coordinates:\n", " * time (time) datetime64[ns] 8B 2015-07-02T11:59:59.999999\n", " * y (y) float64 3kB -3.156e+06 -3.156e+06 ... -3.168e+06 -3.168e+06\n", " * x (x) float64 3kB 2.067e+06 2.067e+06 ... 2.079e+06 2.079e+06\n", " spatial_ref int32 4B 3577\n", "Attributes:\n", " units: 1\n", " nodata: -999\n", " crs: EPSG:3577\n", " grid_mapping: spatial_ref
<xarray.Dataset> Size: 983kB\n", "Dimensions: (time: 1, y: 424, x: 384)\n", "Coordinates:\n", " * time (time) datetime64[ns] 8B 2015-07-02T11:59:59.999999\n", " * y (y) float64 3kB -3.156e+06 -3.156e+06 ... -3.168e+06 -3.168e+06\n", " * x (x) float64 3kB 2.067e+06 2.067e+06 ... 2.079e+06 2.079e+06\n", " spatial_ref int32 4B 3577\n", "Data variables:\n", " nbart_red (time, y, x) int16 326kB 232 235 238 241 ... 376 332 311 322\n", " nbart_green (time, y, x) int16 326kB 510 513 518 524 ... 489 431 363 366\n", " nbart_blue (time, y, x) int16 326kB 469 471 475 480 ... 313 277 257 269\n", "Attributes:\n", " crs: EPSG:3577\n", " grid_mapping: spatial_ref
<xarray.Dataset> Size: 3MB\n", "Dimensions: (time: 1, y: 423, x: 259)\n", "Coordinates:\n", " * time (time) datetime64[ns] 8B 2015-07-02T11:59:59.999999\n", " * y (y) float64 3kB -3.156e+06 -3.156e+06 ... -3.168e+06\n", " * x (x) float64 2kB 2.069e+06 2.069e+06 ... 2.077e+06 2.077e+06\n", " spatial_ref int32 4B 3577\n", "Data variables:\n", " nbart_blue (time, y, x) int16 219kB 462 462 459 456 ... 382 392 375 370\n", " nbart_green (time, y, x) int16 219kB 476 475 470 469 ... 449 453 442 438\n", " nbart_red (time, y, x) int16 219kB 213 211 208 208 ... 221 221 214 209\n", " nbart_nir (time, y, x) int16 219kB 82 82 80 78 78 79 ... 79 77 80 75 75\n", " nbart_swir_1 (time, y, x) int16 219kB 49 48 48 47 45 47 ... 37 35 37 35 34\n", " nbart_swir_2 (time, y, x) int16 219kB 41 40 39 37 36 37 ... 28 27 28 27 28\n", " sdev (time, y, x) float32 438kB 0.003445 0.00351 ... 0.005286\n", " edev (time, y, x) float32 438kB 94.66 94.49 87.37 ... 105.6 111.0\n", " bcdev (time, y, x) float32 438kB 0.07609 0.07581 ... 0.08184 0.08964\n", " count (time, y, x) int16 219kB 17 17 17 17 17 17 ... 12 12 13 12 12\n", "Attributes:\n", " crs: EPSG:3577\n", " grid_mapping: spatial_ref
<xarray.Dataset> Size: 47kB\n", "Dimensions: (time: 1, y: 45, x: 40)\n", "Coordinates:\n", " * time (time) datetime64[ns] 8B 2015-07-02T11:59:59.999999\n", " * y (y) float64 360B 6.958e+06 6.958e+06 ... 6.947e+06 6.947e+06\n", " * x (x) float64 320B 5.296e+05 5.299e+05 ... 5.391e+05 5.394e+05\n", " spatial_ref int32 4B 32756\n", "Data variables:\n", " nbart_blue (time, y, x) int16 4kB 455 445 437 429 424 ... 428 418 393 385\n", " nbart_green (time, y, x) int16 4kB 466 445 425 415 400 ... 488 481 459 448\n", " nbart_red (time, y, x) int16 4kB 206 193 183 176 169 ... 246 242 233 220\n", " nbart_nir (time, y, x) int16 4kB 78 76 77 73 74 75 ... 85 80 86 84 81 81\n", " nbart_swir_1 (time, y, x) int16 4kB 47 44 44 41 41 42 ... 42 38 44 41 38 40\n", " nbart_swir_2 (time, y, x) int16 4kB 39 36 36 33 34 34 ... 34 31 35 32 30 32\n", " sdev (time, y, x) float32 7kB 0.003344 0.003487 ... 0.006524\n", " edev (time, y, x) float32 7kB 96.69 95.62 91.36 ... 120.7 134.8\n", " bcdev (time, y, x) float32 7kB 0.07422 0.07294 ... 0.1174 0.119\n", " count (time, y, x) int16 4kB 17 16 16 16 16 16 ... 13 14 14 13 13 13\n", "Attributes:\n", " crs: EPSG:32756\n", " grid_mapping: spatial_ref
<xarray.Dataset> Size: 63kB\n", "Dimensions: (time: 1, y: 51, x: 47)\n", "Coordinates:\n", " * time (time) datetime64[ns] 8B 2015-07-02T11:59:59.999999\n", " * y (y) float64 408B -3.156e+06 -3.156e+06 ... -3.168e+06\n", " * x (x) float64 376B 2.067e+06 2.068e+06 ... 2.079e+06 2.079e+06\n", " spatial_ref int32 4B 3577\n", "Data variables:\n", " nbart_blue (time, y, x) int16 5kB 476 469 474 478 487 ... 542 354 305 284\n", " nbart_green (time, y, x) int16 5kB 519 514 524 532 542 ... 697 427 478 500\n", " nbart_red (time, y, x) int16 5kB 236 232 241 253 259 ... 610 241 398 367\n", " nbart_nir (time, y, x) int16 5kB 95 89 84 86 86 ... 334 166 1869 2902\n", " nbart_swir_1 (time, y, x) int16 5kB 56 52 47 49 50 ... 76 150 76 918 1262\n", " nbart_swir_2 (time, y, x) int16 5kB 46 42 37 39 40 41 ... 55 98 46 457 585\n", " sdev (time, y, x) float32 10kB 0.002588 0.002426 ... 0.00074\n", " edev (time, y, x) float32 10kB 123.2 112.8 111.7 ... 166.6 190.7\n", " bcdev (time, y, x) float32 10kB 0.09107 0.08658 ... 0.04773 0.02964\n", " count (time, y, x) int16 5kB 16 16 15 15 15 15 ... 11 10 10 10 10 12\n", "Attributes:\n", " crs: EPSG:3577\n", " grid_mapping: spatial_ref
<xarray.Dataset> Size: 47kB\n", "Dimensions: (time: 1, y: 45, x: 40)\n", "Coordinates:\n", " * time (time) datetime64[ns] 8B 2015-07-02T11:59:59.999999\n", " * y (y) float64 360B 6.958e+06 6.958e+06 ... 6.947e+06 6.947e+06\n", " * x (x) float64 320B 5.296e+05 5.299e+05 ... 5.391e+05 5.394e+05\n", " spatial_ref int32 4B 32756\n", "Data variables:\n", " nbart_blue (time, y, x) int16 4kB 458 448 436 428 424 ... 421 420 397 385\n", " nbart_green (time, y, x) int16 4kB 470 451 424 414 401 ... 484 481 462 449\n", " nbart_red (time, y, x) int16 4kB 209 198 182 176 169 ... 251 243 234 221\n", " nbart_nir (time, y, x) int16 4kB 79 78 76 73 74 75 ... 76 77 85 85 81 81\n", " nbart_swir_1 (time, y, x) int16 4kB 47 45 43 41 41 42 ... 36 37 43 42 38 39\n", " nbart_swir_2 (time, y, x) int16 4kB 39 37 35 33 34 34 ... 28 30 34 33 30 31\n", " sdev (time, y, x) float32 7kB 0.003474 0.003705 ... 0.006441\n", " edev (time, y, x) float32 7kB 98.57 97.29 94.86 ... 120.2 131.2\n", " bcdev (time, y, x) float32 7kB 0.07532 0.07276 ... 0.1148 0.117\n", " count (time, y, x) int16 4kB 17 17 16 16 16 16 ... 13 14 14 13 13 13\n", "Attributes:\n", " crs: EPSG:32756\n", " grid_mapping: spatial_ref
<xarray.Dataset> Size: 4MB\n", "Dimensions: (time: 1, y: 424, x: 384)\n", "Coordinates:\n", " * time (time) datetime64[ns] 8B 2015-07-02T11:59:59.999999\n", " * y (y) float64 3kB -3.156e+06 -3.156e+06 ... -3.168e+06\n", " * x (x) float64 3kB 2.067e+06 2.067e+06 ... 2.079e+06 2.079e+06\n", " spatial_ref int32 4B 3577\n", "Data variables:\n", " nbart_blue (time, y, x) int16 326kB 469 471 475 480 ... 313 277 257 269\n", " nbart_green (time, y, x) int16 326kB 510 513 518 524 ... 489 431 363 366\n", " nbart_red (time, y, x) int16 326kB 232 235 238 241 ... 376 332 311 322\n", " nbart_nir (time, y, x) int16 326kB 94 94 95 96 ... 2691 2437 2132 2237\n", " nbart_swir_1 (time, y, x) int16 326kB 56 57 56 57 ... 1432 1177 1018 1059\n", " nbart_swir_2 (time, y, x) int16 326kB 46 46 46 48 47 ... 716 579 490 496\n", " sdev (time, y, x) float32 651kB 0.003487 0.003262 ... 0.001991\n", " edev (time, y, x) float32 651kB 133.7 127.6 125.6 ... 176.0 169.4\n", " bcdev (time, y, x) float32 651kB 0.09439 0.09096 ... 0.0404 0.03889\n", " count (time, y, x) int16 326kB 16 16 16 16 16 15 ... 12 12 12 12 12\n", "Attributes:\n", " crs: EPSG:3577\n", " grid_mapping: spatial_ref
<xarray.Dataset> Size: 47kB\n", "Dimensions: (time: 1, y: 45, x: 40)\n", "Coordinates:\n", " * time (time) datetime64[ns] 8B 2015-07-02T11:59:59.999999\n", " * y (y) float64 360B 6.958e+06 6.958e+06 ... 6.947e+06 6.947e+06\n", " * x (x) float64 320B 5.296e+05 5.299e+05 ... 5.391e+05 5.394e+05\n", " spatial_ref int32 4B 32756\n", "Data variables:\n", " nbart_blue (time, y, x) int16 4kB 455 445 437 429 424 ... 428 418 393 385\n", " nbart_green (time, y, x) int16 4kB 466 445 425 415 400 ... 488 481 459 448\n", " nbart_red (time, y, x) int16 4kB 206 193 183 176 169 ... 246 242 233 220\n", " nbart_nir (time, y, x) int16 4kB 78 76 77 73 74 75 ... 85 80 86 84 81 81\n", " nbart_swir_1 (time, y, x) int16 4kB 47 44 44 41 41 42 ... 42 38 44 41 38 40\n", " nbart_swir_2 (time, y, x) int16 4kB 39 36 36 33 34 34 ... 34 31 35 32 30 32\n", " sdev (time, y, x) float32 7kB 0.003344 0.003487 ... 0.006524\n", " edev (time, y, x) float32 7kB 96.69 95.62 91.36 ... 120.7 134.8\n", " bcdev (time, y, x) float32 7kB 0.07422 0.07294 ... 0.1174 0.119\n", " count (time, y, x) int16 4kB 17 16 16 16 16 16 ... 13 14 14 13 13 13\n", "Attributes:\n", " crs: EPSG:32756\n", " grid_mapping: spatial_ref
<xarray.Dataset> Size: 47kB\n", "Dimensions: (time: 1, y: 45, x: 40)\n", "Coordinates:\n", " * time (time) datetime64[ns] 8B 2015-07-02T11:59:59.999999\n", " * y (y) float64 360B 6.958e+06 6.958e+06 ... 6.947e+06 6.947e+06\n", " * x (x) float64 320B 5.296e+05 5.299e+05 ... 5.391e+05 5.394e+05\n", " spatial_ref int32 4B 32756\n", "Data variables:\n", " nbart_blue (time, y, x) int16 4kB 439 411 388 386 394 ... 352 351 337 333\n", " nbart_green (time, y, x) int16 4kB 481 450 414 407 407 ... 454 450 442 419\n", " nbart_red (time, y, x) int16 4kB 248 225 206 204 205 ... 268 260 255 238\n", " nbart_nir (time, y, x) int16 4kB 150 137 134 141 142 ... 122 124 130 135\n", " nbart_swir_1 (time, y, x) int16 4kB 79 76 77 79 84 80 ... 84 80 63 63 68 75\n", " nbart_swir_2 (time, y, x) int16 4kB 67 65 64 71 70 67 ... 73 71 51 53 53 65\n", " sdev (time, y, x) float32 7kB 0.006507 0.007285 ... 0.01002\n", " edev (time, y, x) float32 7kB 249.9 212.5 171.2 ... 145.0 145.2\n", " bcdev (time, y, x) float32 7kB 0.2078 0.2035 ... 0.1341 0.1357\n", " count (time, y, x) int16 4kB 12 12 12 12 12 12 ... 11 11 9 9 10 10\n", "Attributes:\n", " crs: EPSG:32756\n", " grid_mapping: spatial_ref
<xarray.Dataset> Size: 47kB\n", "Dimensions: (time: 1, y: 45, x: 40)\n", "Coordinates:\n", " * time (time) datetime64[ns] 8B 2015-07-02T11:59:59.999999\n", " * y (y) float64 360B 6.958e+06 6.958e+06 ... 6.947e+06 6.947e+06\n", " * x (x) float64 320B 5.296e+05 5.299e+05 ... 5.391e+05 5.394e+05\n", " spatial_ref int32 4B 32756\n", "Data variables:\n", " nbart_blue (time, y, x) int16 4kB 439 411 388 386 394 ... 352 351 337 333\n", " nbart_green (time, y, x) int16 4kB 481 450 414 407 407 ... 454 450 442 419\n", " nbart_red (time, y, x) int16 4kB 248 225 206 204 205 ... 268 260 255 238\n", " nbart_nir (time, y, x) int16 4kB 150 137 134 141 142 ... 122 124 130 135\n", " nbart_swir_1 (time, y, x) int16 4kB 79 76 77 79 84 80 ... 84 80 63 63 68 75\n", " nbart_swir_2 (time, y, x) int16 4kB 67 65 64 71 70 67 ... 73 71 51 53 53 65\n", " sdev (time, y, x) float32 7kB 0.006507 0.007285 ... 0.01002\n", " edev (time, y, x) float32 7kB 249.9 212.5 171.2 ... 145.0 145.2\n", " bcdev (time, y, x) float32 7kB 0.2078 0.2035 ... 0.1341 0.1357\n", " count (time, y, x) int16 4kB 12 12 12 12 12 12 ... 11 11 9 9 10 10\n", "Attributes:\n", " crs: PROJCS["WGS 84 / UTM zone 56S",GEOGCS["WGS 84",DATUM["WGS_...\n", " grid_mapping: spatial_ref
<xarray.Dataset> Size: 21MB\n", "Dimensions: (time: 5, y: 424, x: 384)\n", "Coordinates:\n", " * time (time) datetime64[ns] 40B 2013-07-02T11:59:59.999999 ... 20...\n", " * y (y) float64 3kB -3.156e+06 -3.156e+06 ... -3.168e+06\n", " * x (x) float64 3kB 2.067e+06 2.067e+06 ... 2.079e+06 2.079e+06\n", " spatial_ref int32 4B 3577\n", "Data variables:\n", " nbart_blue (time, y, x) int16 2MB 473 474 473 469 470 ... 329 302 287 284\n", " nbart_green (time, y, x) int16 2MB 522 525 524 519 524 ... 505 437 383 384\n", " nbart_red (time, y, x) int16 2MB 241 246 245 242 247 ... 383 342 329 333\n", " nbart_nir (time, y, x) int16 2MB 85 89 86 82 83 ... 2839 2514 2333 2412\n", " nbart_swir_1 (time, y, x) int16 2MB 45 48 48 43 46 ... 1366 1099 992 1079\n", " nbart_swir_2 (time, y, x) int16 2MB 36 38 37 34 35 ... 667 636 509 455 487\n", " sdev (time, y, x) float32 3MB 0.003244 0.003646 ... 0.00157\n", " edev (time, y, x) float32 3MB 114.4 108.7 109.3 ... 210.3 218.2\n", " bcdev (time, y, x) float32 3MB 0.09557 0.08731 ... 0.04308 0.04944\n", " count (time, y, x) int16 2MB 9 9 9 9 9 9 9 ... 14 14 14 13 13 13 13\n", "Attributes:\n", " crs: EPSG:3577\n", " grid_mapping: spatial_ref