Jupyter

Interactive Notebooks for Python, Julia, and R

Matthew DeHaven

April 6, 2024

Jupyter Background

Project Jupyter was spun off from IPython in 2014.

Its name is a reference to the three core programming languages it supports:

  • Julia
  • Python
  • R

Today it supports many more languages (SQL, Ruby, …).

Jupyter Products

Jupyter Notebooks

  • Interactive notebooks with code + markdown for many languages
  • Browser-based editor

JupyterLab

  • Newer, improved browser-based editor for notebooks

JupyterHub

  • Multi-user server for hosting Jupyter notebooks, typically in the cloud

Setup and Installation

Jupyter Notebooks

Requires:

  • Python installation
  • Python package: pip install jupyter

To use in VS Code, you need the “Jupyter” extension.

Using an Environment

I recommend setting up a “.venv” environment before installing jupyter.

In VS Code,

  • Command Palette > “Python: Create Environment…”
  • Select “venv”
  • then pip install jupyter in the terminal
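
The same setup can also be scripted with Python's standard-library venv module. This is only a minimal sketch, not what VS Code does internally: the helper names create_env, env_python, and install_jupyter are hypothetical, and the install step assumes network access.

```python
import subprocess
import sys
import venv
from pathlib import Path


def create_env(env_dir=".venv", with_pip=True):
    """Create a virtual environment in env_dir (like `python -m venv .venv`)."""
    venv.create(env_dir, with_pip=with_pip)
    return Path(env_dir)


def env_python(env_dir=".venv"):
    """Return the path of the environment's own Python interpreter."""
    on_windows = sys.platform == "win32"
    folder = "Scripts" if on_windows else "bin"
    exe = "python.exe" if on_windows else "python"
    return Path(env_dir) / folder / exe


def install_jupyter(env_dir=".venv"):
    """Run `pip install jupyter` inside the environment (needs network access)."""
    subprocess.run(
        [str(env_python(env_dir)), "-m", "pip", "install", "jupyter"],
        check=True,
    )
```

Calling create_env() and then install_jupyter() from the project folder mirrors the three steps above.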

Creating a Notebook

Jupyter notebooks have a unique extension: “.ipynb”

  • short for “IPython notebook” (IPython = interactive Python)

Simply create an empty file with that extension and VS Code will recognize it as a Jupyter notebook.

Kernels

Jupyter Notebooks execute code by sending it to one of many possible “kernels”.

You can choose as your kernel:

  • Python
  • Julia
  • R
  • other languages you set up.

Using a language as a kernel requires some setup for each language.

Python Kernel

Python can be used as a kernel once the jupyter package is installed.

I recommend using your Python environment “.venv” as your kernel.

  • should be where you installed jupyter
  • keeps your packages self-contained

Julia Kernel

To use Julia as a kernel, you first need to install

  • IJulia package

Here it is probably easiest to just install IJulia system-wide.

R Kernel

To use R as a kernel, you first need to install

  • IRkernel package

Here it is probably easiest to just install IRkernel system-wide.

You should also run the following in R to finish the setup: IRkernel::installspec()

Choosing a Kernel

Whenever you open a Jupyter Notebook you will be able to choose the kernel you want to use.

  • Your choice will be saved
  • You can always change kernels later
    • though that would probably break your code
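
The saved choice lives in the notebook file itself, under the top-level metadata key "kernelspec" (nbformat 4 layout, as far as I know). Here is a small sketch of reading it with the standard library. The function name saved_kernel and the sample notebook are made up for illustration.

```python
import json


def saved_kernel(notebook_text):
    """Return (name, display_name) of the kernel stored in a notebook, or None."""
    nb = json.loads(notebook_text)
    spec = nb.get("metadata", {}).get("kernelspec")
    if spec is None:
        return None  # no kernel has been chosen yet
    return spec.get("name"), spec.get("display_name")


# Hypothetical notebook, written out as the JSON text found in an .ipynb file.
example = json.dumps({
    "cells": [],
    "metadata": {
        "kernelspec": {
            "name": "python3",
            "display_name": "Python 3",
            "language": "python",
        }
    },
    "nbformat": 4,
    "nbformat_minor": 4,
})

print(saved_kernel(example))  # ('python3', 'Python 3')
```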

Notebook Cells

Jupyter notebooks have two types of cells:

  • Markdown Cells
  • Code Cells

Markdown Cells

Markdown cells allow you to write and render markdown.

You can actually do this without any kernel attached.

All of the usual markdown formatting is allowed (headers, links, bullets, etc.)

Code Cells

Code cells are where you write code.

Each code cell can be executed individually.

Output, errors, and warnings are displayed after the individual code cell.

The .ipynb files

The “interactive python notebook” files are actually just JSON files.

JSON is a common file format.

  • stores data as arrays and key:value pairs

You can always open up a Jupyter notebook with a basic text editor.

  • You will be able to see each “cell”
  • But it will be messy
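
To make the structure concrete, here is a sketch that builds a tiny notebook as a Python dictionary and walks over its cells. The keys follow the nbformat 4 layout as I understand it. The cell contents and the helper name describe_cells are invented for illustration.

```python
import json

# A hand-written two-cell notebook (illustrative content only).
notebook = {
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# A Header\n", "Some text"],
        },
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "source": ["print(2 + 2)"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["4\n"]}
            ],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}


def describe_cells(nb):
    """Return one (cell_type, first line of source) pair per cell."""
    summary = []
    for cell in nb["cells"]:
        source = "".join(cell["source"])  # source is stored as a list of lines
        first_line = source.splitlines()[0] if source else ""
        summary.append((cell["cell_type"], first_line))
    return summary


# Round-trip through JSON text, as if the file had been saved and reopened.
text = json.dumps(notebook, indent=1)
print(describe_cells(json.loads(text)))
# [('markdown', '# A Header'), ('code', 'print(2 + 2)')]
```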

Output Included

A key feature of Jupyter notebooks is

  • the output of code cells is included in the JSON file

This means

  • you can send your file to someone else, they can open it, and see your results, without having to run the notebook
  • the notebook files can get very large
  • git diffs are a big mess
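
One common way to tame file size and diffs is to remove the saved output before committing. The sketch below does this on a notebook that has already been loaded as a dictionary. It assumes the nbformat 4 keys "outputs" and "execution_count"; the function name strip_outputs and the sample notebook are hypothetical.

```python
import copy
import json


def strip_outputs(nb):
    """Return a copy of a notebook dict with all code-cell output removed."""
    clean = copy.deepcopy(nb)  # leave the caller's notebook untouched
    for cell in clean.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return clean


# Hypothetical one-cell notebook with a saved result.
nb = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {},
            "source": ["print('hello')"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["hello\n"]}
            ],
        }
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

stripped = strip_outputs(nb)
# The stripped notebook serializes to less JSON text than the original.
print(len(json.dumps(stripped)) < len(json.dumps(nb)))  # True
```

The trade-off is that a stripped notebook no longer shows results to someone who opens it without running it.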

Editing and Running Jupyter Notebooks

Two options for editing Jupyter Notebooks:

  1. in the browser

  2. in VS Code

Editing Jupyter Notebooks in the browser

This is the default built in to the jupyter package.

In your workspace run:

terminal
jupyter notebook

This will launch an HTTP server in the terminal.

  • This terminal must stay open while you are using Jupyter!

And it will open a window in your browser with the editor.

Editing Jupyter Notebooks in VS Code

Once you have installed the Jupyter VS Code extension

  • you can edit and run Jupyter Notebooks within VS Code

Behind the scenes, VS Code launches the kernel as a separate process, sends the code to it, and brings back the results.

Quarto

Quarto vs. Jupyter Notebooks

Both have code and markdown “chunks” (cells).

Jupyter Notebooks

  • focused on interactivity
  • outputs a JSON file with markdown + code + output

Quarto

  • focused on output document types: html, pdf, slides, etc.
  • “.qmd” files are markdown + code only

Compile Jupyter Notebooks in Quarto

Quarto can compile a Jupyter notebook into any of its output formats.

terminal
quarto render example.ipynb --to html
quarto render example.ipynb --to docx

This allows you to quickly turn your Jupyter Notebooks into pdf reports, or website pages, etc.
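
If you render the same notebook to several formats, the commands above can be wrapped in a short Python script. This is a sketch that assumes Quarto is installed and on the PATH. The helper names render_command and render_all are hypothetical, and only the "quarto render FILE --to FORMAT" form shown above is used.

```python
import subprocess


def render_command(notebook, to):
    """Build the `quarto render` command for one notebook and output format."""
    return ["quarto", "render", notebook, "--to", to]


def render_all(notebook, formats):
    """Render one notebook to several formats (requires Quarto on the PATH)."""
    for fmt in formats:
        subprocess.run(render_command(notebook, fmt), check=True)


print(render_command("example.ipynb", "html"))
# ['quarto', 'render', 'example.ipynb', '--to', 'html']
```

For example, render_all("example.ipynb", ["html", "docx"]) would run both commands from the slide in turn.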

Use Jupyter as a Quarto Engine - Python

Alternatively, you can use Jupyter as the engine for Quarto (instead of knitr, the R-based engine).

---
format: pdf
jupyter: python3
---

Some text

```{python}
x = 2 + 2
print(x)
```

## A Header

Some text

Use Jupyter as a Quarto Engine - Julia

You can also switch to using the Julia kernel for Jupyter.

---
format: pdf
jupyter: julia-1.10
---

Some text

```{julia}
x = 2 + 2
println(x)
```

## A Header

Some text

Interactive Notebooks

List of Interactive Notebooks

  • rmarkdown: R
  • Pluto.jl: Julia
  • Quarto: Julia, Python, R
  • Jupyter Notebooks: Julia, Python, R

Jupyter’s big difference: output is included in the notebook file.

Interactive Notebook Pros

  • Documentation / thoughts right next to code
  • Make reports / slides / exciting output
  • Easy to share the output with others
  • Documents are never out of sync with the code

Interactive Notebook Cons

  • Often harder to maintain code environments
  • Introduces a lot of dependencies
    • can be hard for others to run your code
  • Encourages “single file” linear coding
    • rather than separate scripts and functions
  • Harder to run unit tests, debuggers, other software engineering tools

When to use Interactive Notebooks?

I love to use Quarto (and before that rmarkdown).

  • websites and presentations for class or Macro Breakfast
  • trying out new ideas for my research projects
    • I make PDFs of the results for my advisor and me to look at

But I think your research project should not be in a notebook.

  • You want to be able to run your project end-to-end with only the necessary dependencies
  • You want others to be able to run it easily
  • You want to be able to test/debug/optimize the code

Jupyter Summary

  • Interactive notebook
    • kernels for Julia, Python, R, and more
  • Just a JSON file
    • Output is saved in the file
  • Edit in
    • VS Code
    • Browser
  • Can use as a Quarto engine
    • supports Python and Julia in Quarto