Introduction to Jupyter notebooks

  • Compatibility: Notebook currently compatible with both the NCI and DEA Sandbox environments

  • Prerequisites:

    • There is no prerequisite learning required, as this document is designed for a novice user of the Jupyter environment

Background

Access to implementations of the Open Data Cube such as Digital Earth Australia and Digital Earth Africa is achieved through the use of Python code and Jupyter Notebooks. The Jupyter Notebook (referred to as a notebook from here onwards) is an interactive web application that allows for the viewing, creation and documentation of live code. Common uses of notebooks include data transformation, visualisation, modelling and machine learning. The default web interface to access notebooks when using either the National Computational Infrastructure (NCI) or the DEA Sandbox is JupyterLab.

Description

This notebook is designed to introduce users to the basics of using Python code in Jupyter Notebooks via JupyterLab.

Topics covered include:

  1. How to run (execute) a Jupyter Notebook cell

  2. The different types of Jupyter Notebook cells

  3. Stopping a process or restarting a Jupyter Notebook

  4. Saving and exporting your work

  5. Starting a new Jupyter Notebook


Getting started

Running (executing) a cell

Jupyter Notebooks allow code to be separated into sections that can be executed independently of one another. These sections are called “cells”.

Python code is written into individual cells. A cell can be executed by placing the cursor in the cell and pressing Shift-Enter on the keyboard, or by selecting the ► “Run the selected cells and advance” button in the ribbon at the top of the notebook. Both options run a single cell at a time.

To automatically run all cells in a notebook, navigate to the “Run” tab of the menu bar at the top of JupyterLab and select “Run All Cells” (or the option that best suits your needs). When a cell is run, the cell’s content is executed. Any output produced from running the cell will appear directly below it.

Run the cell below:

[1]:
print("I ran a cell!")
I ran a cell!

Cell status

The [ ]: symbol to the left of each Code cell describes the state of the cell:

  • [ ]: means that the cell has not been run yet.

  • [*]: means that the cell is currently running.

  • [1]: means that the cell has finished running and was the first cell run.

The number indicates the order in which the cells were run.

Note: To check whether a cell is currently executing in a Jupyter notebook, inspect the small circle in the top-right of the window. The circle will turn grey (“Kernel busy”) when the cell is running, and return to empty (“Kernel idle”) when the process is complete.

Jupyter notebook cell types

Cells are identified as either Code, Markdown, or Raw. This designation can be changed using the ribbon at the top of the notebook.

Code cells

All code operations are performed in Code cells. Code cells can be used to edit and write new code, and perform tasks like loading data, plotting data and running analyses.

Click on the cell below. Note that the ribbon at the top of the notebook describes it as a Code cell.

[2]:
print("This is a code cell")
This is a code cell
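
All Code cells in a notebook share the same kernel, so a variable defined in one cell remains available to any cell run afterwards. The minimal sketch below illustrates this; the variable names and values are hypothetical, and the two commented sections stand in for two separate Code cells.

```python
# First cell: define a variable (hypothetical example value)
message = "defined in an earlier cell"

# Second cell: the variable is still available because both cells
# run in the same kernel session
combined = f"This value was {message}"
print(combined)
```

If the kernel is restarted, variables such as `message` are cleared and the first cell must be run again before the second.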

Markdown cells

Place the cursor in this cell by double-clicking.

The cell format has changed to allow for editing. Note that the ribbon at the top of the notebook describes this as a Markdown cell.

Run this cell to return the formatted version.

Markdown cells provide the narrative to a notebook. They are used for text and are useful to describe the code operations in the following cells. To see some of the formatting options for text in a Markdown cell, navigate to the “Help” tab of the menu bar at the top of JupyterLab and select “Markdown Reference”. Here you will see a wide range of text formatting options including headings, dot points, italics, hyperlinking and creating tables.
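
As an illustration of these formatting options, the sketch below holds a short sample of Markdown source in a Python string so that it can be printed from a Code cell. The sample text is hypothetical; to see it rendered, copy the printed text into a Markdown cell and run that cell.

```python
# Hypothetical sample of Markdown source covering a heading, dot points,
# italics, bold text, a hyperlink and a small table
markdown_source = """\
## A heading

- a dot point
- *italic text* and **bold text**
- [a hyperlink](https://jupyter.org)

| Column A | Column B |
|----------|----------|
| 1        | 2        |
"""

# Printing shows the raw Markdown, not the rendered version
print(markdown_source)
```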

Raw cells

Information in Raw cells is stored in the notebook metadata and can be used to render different code formats into HTML or LaTeX. There are a range of available Raw cell formats that differ depending on how they are to be rendered. Raw cells are rarely used by the authors of this beginner’s guide and are not required by most notebook users.

There is a Raw cell associated with the Tags section of this notebook below. As this cell is in the “ReStructured Text” format, its contents are neither visible nor executed in any way. This cell is used by the authors to store information tags that are relevant to the notebook in the metadata, and to create an index of tags on the Digital Earth Australia user guide.
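
The cell type is also recorded in the notebook file itself: an .ipynb file is a JSON document in which each cell carries a `cell_type` of `code`, `markdown` or `raw`. The sketch below counts the cell types in a small, hypothetical notebook held in memory as a Python dictionary; the cell contents are made up for illustration and are not taken from this notebook's file.

```python
from collections import Counter

# Hypothetical, minimal notebook content laid out like an .ipynb file
# (notebook format version 4)
notebook = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 1,
            "outputs": [],
            "source": ['print("This is a code cell")'],
        },
        {"cell_type": "raw", "metadata": {}, "source": ["Tags: beginner"]},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

# Tally how many cells of each type the notebook contains
cell_type_counts = Counter(cell["cell_type"] for cell in notebook["cells"])
print(dict(cell_type_counts))
```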

Stopping a process or restarting a Jupyter Notebook

Sometimes it can be useful to stop a cell execution before it finishes (e.g. if a process is taking too long to complete, or if the code needs to be modified before running the cell). To interrupt a cell execution, click the ■ “stop” button (“Interrupt the kernel”) in the ribbon above the notebook.

To test this, run the following code cell. This will run a piece of code that will take 20 seconds to complete. To interrupt this code, press the ■ “stop” button. The notebook should stop executing the cell.

[3]:
import time
time.sleep(20)

If the approach above does not work (e.g. if the notebook has frozen or refuses to respond), try restarting the entire notebook. To do this, navigate to the “Kernel” tab of the menu bar, then select “Restart Kernel”. Alternatively, click the ↻ “Restart the kernel” button in the ribbon above the notebook.

Restarting a notebook can also be useful for testing whether code will work correctly the first time a new user tries to run the notebook. To restart and then run every cell in a notebook, navigate to the “Kernel” tab, then select “Restart and Run All Cells”.
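
When a running cell is interrupted with the ■ “stop” button, Python raises a `KeyboardInterrupt` exception inside that cell. The sketch below shows one way a long-running task could catch this exception and stop cleanly; the function name `slow_task` and its parameters are hypothetical and are not part of any DEA library.

```python
import time


def slow_task(total_seconds=20, step=0.5):
    """Sleep in short steps and return the number of seconds completed.

    If the kernel is interrupted part-way through, report how far the
    task got instead of showing a traceback.
    """
    completed = 0.0
    try:
        while completed < total_seconds:
            time.sleep(step)
            completed += step
    except KeyboardInterrupt:
        print(f"Interrupted after about {completed:.1f} seconds")
    return completed
```

Calling `slow_task()` in a Code cell and then pressing the ■ “stop” button should print the “Interrupted” message rather than an error. Sleeping in short steps is a design choice that keeps the reported progress reasonably accurate.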

Saving and exporting your work

Modifications to Jupyter Notebooks are automatically saved every few minutes. To actively save the notebook, navigate to “File” in the menu bar, then select “Save Notebook”. Alternatively, click the 💾 “save” icon on the left of the ribbon above the notebook.

Exporting Jupyter Notebooks to Python scripts

The standard file extension for a Jupyter Notebook is .ipynb.

There are a range of export options that allow you to save your work for access outside of the Jupyter environment. For example, Python code can easily be saved as .py Python scripts by navigating to the “File” tab of the menu bar in JupyterLab and selecting “Export Notebook As” followed by “Export Notebook To Executable Script”.
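
To give a sense of what such an export involves, the sketch below collects the source of every Code cell in a notebook into a single script string. It relies only on the fact that an .ipynb file is a JSON document with a list of cells; the function name `code_cells_to_script` and the sample notebook are hypothetical, and this is not how JupyterLab implements its own export.

```python
def code_cells_to_script(notebook):
    """Join the source of all Code cells into one Python script string."""
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip Markdown and Raw cells
        source = cell.get("source", "")
        # Cell source may be stored as a list of lines or as one string
        if isinstance(source, list):
            source = "".join(source)
        if source.strip():
            chunks.append(source.rstrip())
    return "\n\n".join(chunks) + "\n"


# Hypothetical notebook content, laid out like a parsed .ipynb file
sample_notebook = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Getting started"]},
        {"cell_type": "code", "source": ['print("I ran a cell!")']},
        {"cell_type": "code", "source": ["import time\n", "time.sleep(20)"]},
    ],
    "nbformat": 4,
    "nbformat_minor": 4,
}

script = code_cells_to_script(sample_notebook)
print(script)
```

For a notebook saved on disk, the same dictionary can be obtained with `json.load`. From a terminal, the `jupyter nbconvert --to script` command offers a comparable export.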

Starting a new notebook

To create a new notebook, use JupyterLab’s file browser to navigate to the directory you would like the notebook to be created in (if the file browser is not visible, re-open it by clicking on the 📁 “File browser” icon at the top-left of the screen).

Once you have navigated to the desired location, press the ✚ “New Launcher” button above the browser. This will bring up JupyterLab’s “Launcher” page which allows you to launch a range of new files or utilities. Below the heading “Notebook”, click the large “Python 3” button. This will create a new notebook entitled “Untitled.ipynb” in the chosen directory.

To rename this notebook to something more useful, right-click on it in the file browser and select “Rename”.
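
A new notebook can also be created from code, because an .ipynb file is a JSON document. The sketch below writes an empty notebook to a temporary directory; the function name `create_empty_notebook` and the file name are hypothetical. As the sketch records no kernel in the file, JupyterLab may ask which kernel to use when the notebook is first opened.

```python
import json
import tempfile
from pathlib import Path


def create_empty_notebook(path):
    """Write a notebook with no cells (notebook format version 4) to path."""
    notebook = {"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 4}
    path = Path(path)
    path.write_text(json.dumps(notebook, indent=1), encoding="utf-8")
    return path


# Write to a temporary directory so the example leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    new_path = create_empty_notebook(Path(tmp_dir) / "my_analysis.ipynb")
    reloaded = json.loads(new_path.read_text(encoding="utf-8"))
    created_ok = new_path.suffix == ".ipynb" and reloaded["cells"] == []

print("Notebook created:", created_ok)
```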

Note: The dea-notebooks repository provides a template notebook containing a consistent structure and style that is recommended for all DEA Jupyter notebooks. To use this template rather than start a notebook from scratch, click this link to open the “DEA_notebooks_template.ipynb” notebook, then click “File” and “Save Notebook As…” to create a copy of the template in your desired location.


Additional information

License: The code in this notebook is licensed under the Apache License, Version 2.0. Digital Earth Australia data is licensed under the Creative Commons Attribution 4.0 license.

Contact: If you need assistance, please post a question on the Open Data Cube Slack channel or on the GIS Stack Exchange using the open-data-cube tag (you can view previously asked questions here). If you would like to report an issue with this notebook, you can file one on GitHub.

Last modified: December 2019

Tags

Browse all available tags on the DEA User Guide’s Tags Index

Tags: sandbox compatible, NCI compatible, beginner, export, jupyterlab, jupyter notebook, code cells, markdown cells, raw cells