Crash course in Python and Jupyter Notebooks
March 31, 2025
The . operator has multiple meanings.
Python comes preinstalled on many operating systems, but to get the latest version…
You want to always have the latest version, but this leads to many installations of Python on the same computer.
Each of these installations is referred to as an “interpreter”.
VS Code allows you to choose a Python Interpreter when editing a “.py” file.
You can run a whole file of Python code by clicking the play arrow.
Or you can run a single line or selection with Shift+Enter.
Both of these will execute the Python code in a terminal.
There are two main options for setting up Python environments:
venv
Comes with recent Python installations. Similar to renv for R, it creates a folder that holds the environment's own packages and a link to (or copy of) the Python interpreter.
conda
Part of the Anaconda/miniconda world.
Can be used both as a package manager and for environments.
Conda manages Python installations, packages, and environments from outside Python.
Venv manages Python environments from within Python.
Anaconda is a distribution of (1) a Python installation, (2) Conda environments, (3) a bunch of default packages.
Miniconda is a distribution of (1) a Python installation and (2) Conda environments.
You should be using environments for any language, but especially for Python.
For venv, the command to create a virtual environment is…
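A minimal sketch of creating a venv environment, assuming the standard-library venv module (the same module that the terminal command python -m venv uses). The folder name .venv and the temporary parent folder are illustrative choices.

```python
import tempfile
import venv
from pathlib import Path

# Illustrative location: a throwaway parent folder containing ".venv".
project_dir = Path(tempfile.mkdtemp())
env_dir = project_dir / ".venv"

# Terminal equivalent (run inside your project folder): python -m venv .venv
# with_pip=False keeps this sketch fast; the terminal command installs pip by default.
venv.create(env_dir, with_pip=False)

# Every virtual environment contains a pyvenv.cfg file describing its base interpreter.
print((env_dir / "pyvenv.cfg").exists())
```

In everyday use you would normally run the terminal command once per project rather than call the module from code.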
Luckily VS Code’s Python Extension makes handling these environments easy.
Open the Command Palette (Shift+Cmd+P)
Search for “Python: Create Environment…”
Select either “venv” or “conda”
Select the Python interpreter (version) to use
Now whenever you launch a terminal for this workspace, it will use the environment you created.
Packages are how you can import functions.
You will sometimes see “Modules”. A package could have one or many modules within it.
Packages can be installed using
pip: built in to Python
conda
pip stands for “pip installs packages”.
It installs Python packages hosted on the Python Package Index (PyPI).
Unlike in R or Julia, the install command (pip install numpy) is executed in a terminal, not in Python code itself.
conda installs packages hosted on the Anaconda repository.
It can also install other software, like R.
You can also use pip install to install packages in a conda environment, but this can cause conflicts, so you should use conda install by default in this situation.
Once a package is installed (to your environment), you can…
numpy is a package for numerical computation (e.g., better arrays and linear algebra).
The first option is what is recommended and used most often.
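A short sketch of the common import forms. It uses the standard-library math module as a stand-in so that it runs without installing anything; the same syntax applies to numpy, which is conventionally imported as np. The alias m is chosen only for this example.

```python
# Import the package under a short alias (for numpy: import numpy as np).
import math as m

# Import the package under its full name.
import math

# Import only specific names from the package.
from math import sqrt, pi

print(m.sqrt(16))       # 4.0
print(math.floor(pi))   # 3
print(sqrt(9))          # 3.0
```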
Python uses a single = for assignment.
Math is not that different:
Logic operators are written out instead of symbols.
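A small sketch of assignment, arithmetic, and logic operators; the variable names and values are arbitrary.

```python
# Assignment uses a single equals sign.
x = 7
y = 2

# Arithmetic operators.
print(x + y)    # 9
print(x - y)    # 5
print(x * y)    # 14
print(x / y)    # 3.5  (true division returns a float)
print(x // y)   # 3    (floor division)
print(x % y)    # 1    (remainder)
print(x ** y)   # 49   (exponentiation uses **, not ^)

# Logic operators are the words and, or, not.
a = True
b = False
print(a and b)  # False
print(a or b)   # True
print(not a)    # False

# Comparison uses a double equals sign.
print(x == 7)   # True
```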
Lists are constructed as comma-separated elements in square brackets
Tuples are constructed as comma-separated elements in parentheses.
Tuples are immutable; lists are mutable.
Sets are constructed as elements in curly braces.
Dictionaries are constructed as key:value pairs in curly braces.
{'RI': 'Rhode Island', 'MA': 'Massachusetts', 'VT': 'Vermont'}
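The four container types above can be sketched side by side. The variable names are illustrative, and the dictionary reuses the states shown above.

```python
states_list = ["RI", "MA", "VT"]        # list: square brackets, mutable
states_tuple = ("RI", "MA", "VT")       # tuple: parentheses, immutable
states_set = {"RI", "MA", "VT", "RI"}   # set: curly braces, duplicates are dropped
states_dict = {"RI": "Rhode Island", "MA": "Massachusetts", "VT": "Vermont"}

states_list[0] = "CT"                   # allowed: lists can be changed in place
print(states_list)                      # ['CT', 'MA', 'VT']

try:
    states_tuple[0] = "CT"              # not allowed: tuples cannot be changed
except TypeError as err:
    print("TypeError:", err)

print(len(states_set))                  # 3
print(states_dict["MA"])                # Massachusetts
```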
Python starts indexing from 0.
For some people, this is the mark of a true programming language.
One of the most common errors when switching between R and Python.
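A quick illustration of zero-based indexing, using an arbitrary list of letters.

```python
letters = ["a", "b", "c", "d"]

print(letters[0])    # a  (the first element is at index 0, not 1)
print(letters[3])    # d  (the last index is len(letters) - 1)
print(letters[-1])   # d  (negative indices count from the end)
print(letters[1:3])  # ['b', 'c']  (slices include the start and exclude the stop)

try:
    letters[4]       # there is no index 4 in a list of length 4
except IndexError as err:
    print("IndexError:", err)
```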
Python is whitespace-sensitive.
For example, a for loop requires the looped lines to be indented by at least one space (the convention is four spaces, which most editors insert when you press Tab).
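A minimal sketch of how indentation marks the body of a for loop; the variable names are arbitrary.

```python
total = 0
for i in range(1, 4):               # i takes the values 1, 2, 3
    total = total + i               # indented: runs on every iteration
    print("running total:", total)
print("final total:", total)        # not indented: runs once, after the loop
```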
Python is an object-oriented programming language.
Example: a list is an object.
Objects have
Classes define objects; an object is an actual instance of a class.
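As a sketch of the class/object distinction, here is a hypothetical Dog class (not from the slides) with one attribute and one method.

```python
class Dog:
    """A hypothetical class used only for illustration."""

    def __init__(self, name):
        self.name = name          # attribute: data stored on the object

    def speak(self):              # method: a function attached to the object
        return self.name + " says woof"


rex = Dog("Rex")                  # rex is an object: one instance of the class Dog
print(type(rex).__name__)         # Dog
print(rex.name)                   # Rex
print(rex.speak())                # Rex says woof
print(type([1, 2, 3]).__name__)   # list  (a list is an object of class list)
```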
Methods are a key feature of Python.
Methods are functions attached to an object that operate on the object.
Remember: methods are a type of function.
You can think of methods as functions that always take as an input the object they are defined for.
They may take other inputs as well.
Every type of object has its own methods. For example, the sort() method is defined for lists but not for tuples.
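A short sketch of methods in action, checking which types define sort(); the values are arbitrary.

```python
x = [3, 1, 2]
x.sort()                     # a list method: sorts the list in place
print(x)                     # [1, 2, 3]

t = (3, 1, 2)
print(hasattr(x, "sort"))    # True
print(hasattr(t, "sort"))    # False: tuples have no sort() method

# Methods can take inputs besides the object itself.
print(x.count(2))            # 1  (how many times 2 appears in x)

# The built-in function sorted() works on both and returns a new list.
print(sorted(t))             # [1, 2, 3]
```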
If you wanted to append an element to a list…
append()
clear()
copy()
count()
extend()
index()
insert()
pop()
remove()
reverse()
sort()
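Several of the list methods above, applied in sequence to one small list as an illustration.

```python
x = [1, 2, 3]

x.append(4)          # add one element to the end
print(x)             # [1, 2, 3, 4]

x.extend([5, 6])     # add every element of another list
print(x)             # [1, 2, 3, 4, 5, 6]

x.insert(0, 0)       # insert the value 0 at index 0
print(x)             # [0, 1, 2, 3, 4, 5, 6]

last = x.pop()       # remove and return the last element
print(last, x)       # 6 [0, 1, 2, 3, 4, 5]

x.remove(3)          # remove the first occurrence of the value 3
x.reverse()          # reverse in place
print(x)             # [5, 4, 2, 1, 0]
print(x.index(4))    # 1
```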
Some functions that you would expect to be methods are not.
For example, len() returns the length of an object.
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[33], line 2
      1 x = [1, 2, 3]
----> 2 x.len()

AttributeError: 'list' object has no attribute 'len'
But len() is not a method of a list, even though you might expect it.
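A small sketch contrasting the len() function with the nonexistent len() method; the example values are arbitrary.

```python
x = [1, 2, 3]

print(len(x))                # 3
print(len("Rhode Island"))   # 12
print(len({"RI": "Rhode Island", "MA": "Massachusetts"}))  # 2 (number of keys)

try:
    x.len()                  # lists have no len() method
except AttributeError as err:
    print("AttributeError:", err)
```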
Python functions are defined with the keyword def and indentation:
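A minimal sketch of a function definition. The function rectangle_area and its default argument are hypothetical, chosen only to show the syntax.

```python
def rectangle_area(width, height=1.0):
    """Return the area of a rectangle (a hypothetical example function)."""
    area = width * height    # indented lines form the function body
    return area


print(rectangle_area(3, 4))                  # 12
print(rectangle_area(3))                     # 3.0  (uses the default height)
print(rectangle_area(width=2, height=2.5))   # 5.0  (arguments passed by name)
```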
pandas is a package that implements DataFrames in Python.
First, you need to install pandas, then import it.
The construction of a DataFrame is based on dictionary objects.
Each column of a DataFrame is a Series.
DataFrames have a lot of their own methods.
describe() will summarize all numerical columns.
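A sketch of building a DataFrame from a dictionary, assuming pandas is installed in your environment. The column names are illustrative, and the numeric column is simply the length of each state name.

```python
import pandas as pd

# A dictionary maps each column name to a list of values.
names = ["Rhode Island", "Massachusetts", "Vermont"]
data = {
    "abbr": ["RI", "MA", "VT"],
    "name": names,
    "name_length": [len(n) for n in names],  # 12, 13, 7
}
df = pd.DataFrame(data)
print(df)

# Each column is a Series.
print(isinstance(df["name"], pd.Series))  # True

# describe() summarizes the numerical columns (here only name_length).
summary = df.describe()
print(summary)
```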
A crucial step is reading and writing data.
Writing it out is a method:
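A sketch of a CSV round trip, assuming pandas is installed. The file name states.csv, the temporary folder, and the column names are illustrative.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"abbr": ["RI", "MA", "VT"], "id": [1, 2, 3]})

# Illustrative path inside a temporary folder.
path = os.path.join(tempfile.mkdtemp(), "states.csv")

# Writing is a DataFrame method.
df.to_csv(path, index=False)   # index=False leaves out the row labels

# Reading is a function in the pandas package.
df_back = pd.read_csv(path)
print(df_back)
```

The asymmetry is worth noticing: reading has no DataFrame to attach to yet, so it is a package-level function, while writing operates on an existing DataFrame.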
All of the data science operations we saw with dplyr and data.table are possible with Python and pandas.
numpy: vectors, arrays, numerical analysis
pandas: DataFrames
matplotlib: plotting
seaborn: plotting with pandas DataFrames
plotly: interactive plots
Scikit-Learn: machine learning
TensorFlow: Neural Nets
PyTorch: Neural Nets, but using GPUs
BeautifulSoup: web scraping
numpy package
Jupyter was developed in 2014.
Its name is a reference to the three core programming languages it supports: Julia, Python, and R.
Today it supports many more languages (SQL, Ruby, …).
Jupyter Notebooks
JupyterLab
JupyterHub
Requires:
pip install jupyter
To use in VS Code, you need the “Jupyter” extension.
If you are using a “.venv” environment, you’ll have to install jupyter in that environment.
In VS Code, run pip install jupyter in the terminal.
Jupyter notebooks have a unique extension: “.ipynb”
Simply create an empty file with that extension and VS Code will recognize it as a Jupyter notebook.
Jupyter Notebooks execute code by sending it to one of many possible “kernels”.
You can choose as your kernel:
Using a language as a kernel requires some setup for each language.
Python can be used as a kernel once the jupyter package is installed.
I recommend using your Python environment “.venv” as your kernel.
To use Julia as a kernel, you first need to install the IJulia package.
Here it is probably easiest to just install IJulia system-wide.
To use R as a kernel, you first need to install the IRkernel package.
Here it is probably easiest to just install IRkernel system-wide.
You should also run the following in R to finish the setup: IRkernel::installspec()
Whenever you open a Jupyter Notebook you will be able to choose the kernel you want to use.
Jupyter notebooks have two types of cells: Markdown cells and code cells.
Markdown cells allow you to write and render markdown.
You can actually do this without any kernel attached.
All of the usual markdown formatting is allowed (headers, links, bullets, etc.)
Code cells are where you write code.
Each code cell can be executed individually.
Output, errors, and warnings are displayed after the individual code cell.
.ipynb files
The “interactive Python notebook” files are actually just JSON files.
JSON is a common file format.
You can always open up a Jupyter notebook with a basic text editor.
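To make the JSON point concrete, here is a hand-written, simplified notebook parsed with Python's standard json module. The field names follow the version 4 notebook format as I understand it, and a real file saved by Jupyter carries more metadata than this sketch.

```python
import json

# A hand-written, simplified notebook with one markdown cell and one code cell.
notebook_text = """
{
  "cells": [
    {"cell_type": "markdown", "metadata": {}, "source": ["# My analysis"]},
    {
      "cell_type": "code",
      "execution_count": 1,
      "metadata": {},
      "source": ["1 + 1"],
      "outputs": [
        {
          "output_type": "execute_result",
          "execution_count": 1,
          "data": {"text/plain": ["2"]},
          "metadata": {}
        }
      ]
    }
  ],
  "metadata": {},
  "nbformat": 4,
  "nbformat_minor": 4
}
"""

notebook = json.loads(notebook_text)   # a notebook file parses as ordinary JSON

for cell in notebook["cells"]:
    print(cell["cell_type"], "->", "".join(cell["source"]))

# Saved output lives inside the code cell itself.
code_cell = notebook["cells"][1]
print(code_cell["outputs"][0]["data"]["text/plain"])  # ['2']
```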
A key feature of Jupyter notebooks is
This means
Two options for editing Jupyter Notebooks:
in the browser
in VS Code
This is the default built in to the jupyter package.
In your workspace, run:
jupyter notebook
This will launch an HTTP server in the terminal.
And it will open a window in your browser with the editor.
Once you have installed the Jupyter VS Code extension
Behind the scenes, VS Code will launch the kernel as an HTTP server, send the code to it, and bring back the results.
Both have code and markdown “chunks” (cells).
Jupyter Notebooks
Quarto
Quarto can compile a Jupyter notebook into any of its output formats.
This allows you to quickly turn your Jupyter Notebooks into PDF reports, website pages, etc.
Alternatively, you can use Jupyter as the engine for Quarto (instead of R).
rmarkdown: R
Pluto.jl: Julia
Quarto: Julia, Python, R
Jupyter Notebooks: Julia, Python, R
Jupyter’s big difference: output is included in the notebook file.
I love to use Quarto (and before that rmarkdown).
But I think your research project should not be in a notebook.