Introduction to Jupyter notebooks

Sign up to the DEA Sandbox to run this notebook interactively from a browser
Compatibility: Notebook currently compatible with both the NCI and DEA Sandbox environments
Prerequisites:
- There is no prerequisite learning required, as this document is designed for a novice user of the Jupyter environment

Background

Implementations of the Open Data Cube such as Digital Earth Australia and Digital Earth Africa are accessed using Python code and Jupyter Notebooks. The Jupyter Notebook (referred to simply as a notebook from here onwards) is an interactive web application for viewing, creating and documenting live code. Common uses of notebooks include data transformation, visualisation, modelling and machine learning. JupyterLab is the default web interface for accessing notebooks on both the National Computational Infrastructure (NCI) and the DEA Sandbox.

Description

This notebook is designed to introduce users to the basics of using Python code in Jupyter Notebooks via JupyterLab.

Topics covered include:

- How to run (execute) a Jupyter Notebook cell
- The different types of Jupyter Notebook cells
- Stopping a process or restarting a Jupyter Notebook
- Saving and exporting your work
- Starting a new Jupyter Notebook

Getting started

Running (executing) a cell

Jupyter Notebooks allow code to be separated into sections that can be executed independently of one another. These sections are called “cells”.

Python code is written into individual cells. To run a cell, place the cursor in it and press Shift-Enter on the keyboard, or select the ► “Run the selected cells and advance” button in the ribbon at the top of the notebook. Either option runs a single cell at a time.

To automatically run all cells in a notebook, navigate to the “Run” tab of the menu bar at the top of JupyterLab and select “Run All Cells” (or the option that best suits your needs). When a cell is run, the cell’s content is executed. Any output produced from running the cell will appear directly below it.

Run the cell below:

[ ]:

print("I ran a cell!")
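Cells in the same notebook share a single Python session (the kernel), so a name defined in one cell remains available to cells that are run afterwards. The sketch below mimics two cells within one script; the variable names `greeting` and `message` are invented for illustration.

```python
# --- Cell 1: define a variable (running this cell produces no output) ---
greeting = "I ran a cell!"

# --- Cell 2: reuse the variable defined by the earlier cell ---
message = greeting.upper()
print(message)
```

If the second cell is run before the first in a freshly started notebook, Python raises a `NameError` because `greeting` does not exist yet.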

Cell status

The [ ]: symbol to the left of each Code cell describes the state of the cell:

- [ ]: means that the cell has not been run yet.
- [*]: means that the cell is currently running.
- [1]: means that the cell has finished running and was the first cell run.

The number indicates the order in which the cells were run.

Note: To check whether a cell is currently executing in a Jupyter notebook, inspect the small circle in the top-right of the window. The circle will turn grey (“Kernel busy”) when the cell is running, and return to empty (“Kernel idle”) when the process is complete.
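The numbering follows the order of execution, not the position of a cell in the notebook. The toy model below illustrates this; `ToyNotebook` is a hypothetical class written for this example and is not how Jupyter itself is implemented.

```python
class ToyNotebook:
    """Toy model of notebook cell prompts (illustrative only)."""

    def __init__(self, n_cells):
        self.counter = 0                  # shared execution counter
        self.prompts = [None] * n_cells   # None means "not run yet"

    def run(self, index):
        """Mark the cell at `index` as run and record its execution number."""
        self.counter += 1
        self.prompts[index] = self.counter

    def labels(self):
        """Return the prompt shown beside each cell, from top to bottom."""
        return ["[ ]:" if p is None else f"[{p}]:" for p in self.prompts]


nb = ToyNotebook(n_cells=3)
nb.run(2)   # run the third cell first
nb.run(0)   # then run the first cell
print(nb.labels())   # ['[2]:', '[ ]:', '[1]:']
```

The first cell is labelled [2]: because it was the second cell to be run, while the middle cell keeps its empty prompt.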

Jupyter notebook cell types

Jupyter notebooks can include Code and Markdown cells. This designation can be changed using the ribbon at the top of the notebook.

Code cells

All code operations are performed in Code cells. Code cells can be used to edit and write new code, and perform tasks like loading data, plotting data and running analyses.

Click on the cell below. Note that the ribbon at the top of the notebook describes it as a Code cell.

[ ]:

print("This is a code cell")

Markdown cells

Place the cursor in this cell by double clicking.

The cell format has changed to allow for editing. Note that the ribbon at the top of the notebook describes this as a Markdown cell.

Run this cell to return the formatted version.

Markdown cells provide the narrative to a notebook. They are used for text and are useful to describe the code operations in the following cells. To see some of the formatting options for text in a Markdown cell, navigate to the “Help” tab of the menu bar at the top of JupyterLab and select “Markdown Reference”. Here you will see a wide range of text formatting options including headings, dot points, italics, hyperlinking and creating tables.
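Behind the interface, a notebook file records the type of each cell. The sketch below builds a minimal two-cell notebook as a Python dictionary, assuming the version 4 notebook file layout; JupyterLab normally writes this structure for you, so it is shown only to illustrate the difference between the two cell types.

```python
import json

# Minimal notebook structure, assuming the nbformat 4 layout
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            # Markdown cells hold narrative text and have no outputs
            "cell_type": "markdown",
            "metadata": {},
            "source": "## A heading\nSome *narrative* text.",
        },
        {
            # Code cells also record outputs and an execution count
            "cell_type": "code",
            "execution_count": None,   # displayed as [ ]: until the cell is run
            "metadata": {},
            "outputs": [],
            "source": 'print("This is a code cell")',
        },
    ],
}

cell_types = [cell["cell_type"] for cell in notebook["cells"]]
print(cell_types)

# The structure serialises to plain JSON text
as_text = json.dumps(notebook, indent=1)
```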

Stopping a process or restarting a Jupyter Notebook

Sometimes it can be useful to stop a cell execution before it finishes (e.g. if a process is taking too long to complete, or if the code needs to be modified before running the cell). To interrupt a cell execution, click the ■ “stop” button (“Interrupt the kernel”) in the ribbon above the notebook.

To test this, run the following code cell, which takes 10 seconds to complete. While it is running, press the ■ “stop” button. The notebook should stop executing the cell.

[ ]:

import time
# Pause for 10 seconds to simulate a long-running process
time.sleep(10)

If the approach above does not work (e.g. if the notebook has frozen or refuses to respond), try restarting the notebook's kernel. To do this, navigate to the “Kernel” tab of the menu bar, then select “Restart Kernel”. Alternatively, click the ↻ “Restart the kernel” button in the ribbon above the notebook. Restarting clears any variables held in memory, so cells need to be run again afterwards.

Restarting a notebook can also be useful for testing whether code will work correctly the first time a new user tries to run the notebook. To restart and then run every cell in a notebook, navigate to the “Kernel” tab, then select “Restart Kernel and Run All Cells” (the exact wording may differ between JupyterLab versions).
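With the standard Python kernel, interrupting a cell raises a `KeyboardInterrupt` inside the running code, which is why an interrupted cell usually ends with a traceback. A minimal sketch of handling this gracefully is shown below; the function name `wait` and its `sleep` parameter are hypothetical and exist only for this example.

```python
import time


def wait(seconds, sleep=time.sleep):
    """Pause for `seconds` and report whether the pause was interrupted."""
    try:
        sleep(seconds)
    except KeyboardInterrupt:
        # Raised in the running code when the kernel is interrupted
        return "interrupted"
    return "finished"


# A short pause that is left to complete normally
print(wait(0.1))
```

To try this in a notebook, call `wait(10)` and press the ■ “stop” button while the cell is running. The call should return "interrupted" rather than display a traceback.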

Saving and exporting your work

Modifications to Jupyter Notebooks are automatically saved every few minutes. To actively save the notebook, navigate to “File” in the menu bar, then select “Save Notebook”. Alternatively, click the 💾 “save” icon on the left of the ribbon above the notebook.

Starting a new notebook

To create a new notebook, use JupyterLab’s file browser to navigate to the directory you would like the notebook to be created in (if the file browser is not visible, re-open it by clicking on the 📁 “File browser” icon at the top-left of the screen).

Once you have navigated to the desired location, press the ✚ “New Launcher” button above the browser. This will bring up JupyterLab’s “Launcher” page which allows you to launch a range of new files or utilities. Below the heading “Notebook”, click the large “Python 3” button. This will create a new notebook entitled “Untitled.ipynb” in the chosen directory.

To rename this notebook to something more useful, right-click on it in the file browser and select “Rename”.
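A notebook is stored on disk as a JSON text file with the “.ipynb” extension, so an empty one can also be written from Python. The sketch below assumes the version 4 notebook file layout and uses a temporary directory in place of the directory chosen in the file browser; the file name is hypothetical.

```python
import json
import tempfile
from pathlib import Path

# An empty notebook, assuming the nbformat 4 layout. No kernel is recorded,
# so JupyterLab may ask which kernel to use when the file is opened.
empty_notebook = {"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 4}

# A temporary directory stands in for the chosen directory
with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "My_first_notebook.ipynb"   # hypothetical file name
    path.write_text(json.dumps(empty_notebook, indent=1), encoding="utf-8")

    # Read the file back to confirm that it is plain JSON
    saved = json.loads(path.read_text(encoding="utf-8"))
    print(path.name, "contains", len(saved["cells"]), "cells")
```

For everyday use the Launcher remains the simpler route, since it also selects a kernel for the new notebook.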

Note: The dea-notebooks repository provides a template notebook containing a consistent structure and style that is recommended for all DEA Jupyter notebooks. To use this template rather than start a notebook from scratch, open the “DEA_notebooks_template.ipynb” notebook, then click “File” and “Save Notebook As…” to create a copy of the template in your desired location.

Recommended next steps

For more advanced information about working with Jupyter Notebooks or JupyterLab, see the JupyterLab documentation.

To continue with this beginner’s guide, work through its remaining notebooks, which are designed to be completed in order.

Once you have worked through the beginner’s guide, you can join advanced users by exploring:

- The “DEA_products” directory in the repository, where you can explore DEA products in depth.
- The “How_to_guides” directory, which contains a recipe book of common techniques and methods for analysing DEA data.
- The “Real_world_examples” directory, which provides more complex workflows and analysis case studies.

Additional information

License: The code in this notebook is licensed under the Apache License, Version 2.0. Digital Earth Australia data is licensed under the Creative Commons by Attribution 4.0 license.

Contact: If you need assistance, please post a question on the Open Data Cube Discord chat or on the GIS Stack Exchange using the open-data-cube tag (previously asked questions can be found under the same tag). If you would like to report an issue with this notebook, you can file one on GitHub.

Last modified: December 2023