Hello Class!
Crash course in Python and Jupyter Notebooks
March 31, 2025
. operator has multiple meanings
Python comes with many operating systems, but to get the latest version…
You want to always have the latest version, but this leads to many installations of Python on the same computer.
Each of these installations is referred to as an “interpreter”.
VS Code allows you to choose a Python Interpreter when editing a “.py” file.
You can run a whole file of Python code by clicking the play arrow.
Or you can run a single line or selection…
Shift+Enter
Both of these will execute the Python code in a terminal.
There are two main options for setting up Python environments:
venv: Comes with the latest Python installations; similar to renv for R; creates a folder with symlinks to the packages.
conda: Part of the Anaconda/Miniconda world.
Can be used both as a package manager and for environments.
Conda manages Python installations, packages, and environments, all from outside Python.
Venv manages Python environments from within Python.
Anaconda is a distribution of (1) a Python installation, (2) Conda environments, (3) a bunch of default packages.
Miniconda is a distribution of (1) a Python installation and (2) Conda environments.
You should be using environments for any language, but especially for Python.
For venv the command to create a virtual environment is…
Luckily VS Code’s Python Extension makes handling these environments easy.
Open the Command Palette Shift+Cmd+P
Search for “Python: Create Environment…”
Select either “venv” or “conda”
Select the Python interpreter (version) to use
Now whenever you launch a terminal for this workspace, it will use the environment you created.
Packages are how you can import functions.
You will sometimes see “Modules”. A package could have one or many modules within it.
Packages can be installed using
pip (built in to Python)
conda
pip stands for “pip installs packages”.
It installs Python packages hosted on the Python Package Index (PyPI).
pip install numpy is executed in a terminal, not in Python code itself, unlike R or Julia.
conda installs packages hosted on the Anaconda repository.
Can also install other software, like R.
You can also use pip install to install packages in a conda environment, but this can cause conflicts, so you should use conda install by default in this situation.
Once a package is installed (to your environment), you can…
numpy is a package for numerical computation (ex. better arrays and linear algebra).
The first option is what is recommended and used most often.
Python uses a single = for assignment
Math is not that different:
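As a small sketch of assignment and arithmetic (the variable names and values are only illustrative, not from the original slides):

```python
# Assignment uses a single "=".
x = 7
y = 2

total = x + y        # addition
product = x * y      # multiplication
quotient = x / y     # true division always returns a float
floor_q = x // y     # floor division drops the fractional part
remainder = x % y    # modulo
power = x ** y       # exponentiation is **, not ^

print(total, product, quotient, floor_q, remainder, power)
```

The main surprise coming from R is that ^ is not exponentiation in Python.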
Logic operators are written out instead of symbols.
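A quick illustration of the written-out logic operators and, or, and not (the variable names are made up for this example):

```python
is_weekend = False
has_homework = True

both = is_weekend and has_homework    # True only if both are True
either = is_weekend or has_homework   # True if at least one is True
negated = not is_weekend              # flips True/False

# Comparisons still use symbols: "==" tests equality, "=" assigns.
same_value = (3 == 3.0)
different = (3 != 4)

print(both, either, negated, same_value, different)
```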
Lists are constructed as comma-separated elements in square brackets
Tuples are constructed as comma-separated elements in parentheses.
Tuples are immutable; lists are mutable.
Sets are constructed as elements in curly braces.
Dictionaries are constructed as key:value pairs in curly braces.
{'RI': 'Rhode Island', 'MA': 'Massachusetts', 'VT': 'Vermont'}
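The four containers above can be sketched side by side. The dictionary is the one from the slide; the other values ("NH", "CT") are illustrative additions:

```python
states_list = ["RI", "MA", "VT"]     # list: square brackets
states_tuple = ("RI", "MA", "VT")    # tuple: parentheses
states_set = {"RI", "MA", "RI"}      # set: curly braces, duplicates are dropped
states_dict = {'RI': 'Rhode Island', 'MA': 'Massachusetts', 'VT': 'Vermont'}

# Lists are mutable, so adding an element works.
states_list.append("NH")

# Tuples are immutable, so item assignment raises a TypeError.
try:
    states_tuple[0] = "CT"
    tuple_error = None
except TypeError as err:
    tuple_error = err

print(states_list)
print(len(states_set))
print(states_dict["MA"])
print(type(tuple_error).__name__)
```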
Python starts indexing from 0.
For some people, this is the mark of a true programming language.
One of the most common errors when switching between R and Python.
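A short sketch of what 0-based indexing looks like in practice (the list is just an example):

```python
letters = ["a", "b", "c", "d"]

first = letters[0]         # the first element is at index 0, not 1
second = letters[1]
last = letters[-1]         # negative indices count from the end
first_two = letters[0:2]   # slices include the start and exclude the stop

print(first, second, last, first_two)
```

In R, letters[1] would be the first element; in Python it is the second.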
Python is whitespace sensitive.
For example, a for loop requires the looped lines to be indented by at least one space (the convention is 4 spaces, which most editors insert when you press Tab).
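A minimal sketch of how indentation marks the body of a for loop (names are illustrative):

```python
squares = []
for n in range(3):
    # These indented lines run once per iteration.
    squares.append(n ** 2)
    print("inside the loop:", n)

# This line is not indented, so it runs once, after the loop finishes.
print("after the loop:", squares)
```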
Python is an object-oriented programming language.
Example: a list is an object.
Objects have
Classes define objects; an object is an actual instance of a class.
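To make the class/object distinction concrete, here is a sketch with a hypothetical Student class (not from the original slides):

```python
class Student:
    """A class is the blueprint."""

    def __init__(self, name):
        self.name = name  # data stored on each instance

    def greet(self):
        return f"Hello, {self.name}!"


# Objects are instances created from the class.
alice = Student("Alice")
bob = Student("Bob")
print(alice.greet())
print(type(bob).__name__)

# Built-in values are objects too: a list is an instance of the class list.
numbers = [1, 2, 3]
print(type(numbers).__name__)
```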
Methods are a key feature of Python.
Methods are functions attached to an object that operate on the object.
Remember: methods are a type of function.
You can think of methods as functions that always take as an input the object they are defined for.
They may take other inputs as well.
Each type of object has its own methods. For example, the sort() method is not defined for tuples.
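A small sketch of a method that exists for lists but not for tuples (the values are illustrative):

```python
scores = [3, 1, 2]
scores.sort()            # list method: sorts the list in place
print(scores)

point = (3, 1, 2)
tuple_has_sort = hasattr(point, "sort")   # tuples have no sort() method
print(tuple_has_sort)

# The built-in function sorted() works on any iterable and returns a new list.
sorted_point = sorted(point)
print(sorted_point)
```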
If you wanted to append an element to a list…
List methods: append(), clear(), copy(), count(), extend(), index(), insert(), pop(), remove(), reverse(), sort()
Some functions that you would expect to be methods are not.
For example, len() returns the length of an object.
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[33], line 2
      1 x = [1, 2, 3]
----> 2 x.len()

AttributeError: 'list' object has no attribute 'len'
But len() is not a method of a list, even though you might expect it.
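Putting the two cases side by side: append() is a method called on the list, while len() is a function that takes the list as its argument. A short sketch reusing the x from the error message:

```python
x = [1, 2, 3]

x.append(4)        # method: object.method(arguments)
length = len(x)    # function: function(object)

print(x, length)

# len() works the same way on other objects that have a length.
print(len("abc"), len({"RI": "Rhode Island"}))
```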
Python functions are defined with the keyword def and indentation:
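A minimal sketch of a function definition; the function mean() here is a made-up example, not something from the slides:

```python
def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    total = sum(values)
    return total / len(values)   # the indented lines form the function body


result = mean([1, 2, 3, 4])
print(result)
```

The body ends where the indentation ends; there are no braces or "end" keyword.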
Pandas is a package that implements DataFrames in Python.
First, you need to install pandas, then import it.
A DataFrame can be constructed from a dictionary object, with one key per column.
Each column of a DataFrame is a Series.
DataFrames have a lot of their own methods.
describe() will summarize all numerical columns.
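A sketch of these ideas, assuming pandas is installed in your environment. The column names are illustrative, and the numeric column is simply the character count of each state name:

```python
import pandas as pd

names = ["Rhode Island", "Massachusetts", "Vermont"]

# Each dictionary key becomes a column; each value is that column's data.
data = {
    "abbr": ["RI", "MA", "VT"],
    "name": names,
    "n_chars": [len(name) for name in names],
}
df = pd.DataFrame(data)
print(df)

# Selecting a single column returns a Series.
name_column = df["name"]
print(type(name_column))

# describe() is a DataFrame method; here it summarizes the one numeric column.
summary = df.describe()
print(summary)
```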
A crucial step is reading and writing data.
Writing it out is a method:
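One way this can look, assuming pandas is installed. This sketch writes CSV to an in-memory buffer rather than a real file so that it runs anywhere; in practice you would pass a file path instead:

```python
import io

import pandas as pd

df = pd.DataFrame({
    "abbr": ["RI", "MA", "VT"],
    "name": ["Rhode Island", "Massachusetts", "Vermont"],
})

# Writing is a method of the DataFrame.
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Reading is a function in the pandas package.
buffer.seek(0)
df_again = pd.read_csv(buffer)
print(df_again)
```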
All of the data science operations we saw with dplyr and data.table are possible with Python and pandas.
numpy: vectors, arrays, numerical analysis
pandas: DataFrames
matplotlib: plotting
seaborn: plotting with pandas DataFrames
plotly: interactive plots
Scikit-Learn: machine learning
TensorFlow: neural nets
PyTorch: neural nets, but using GPUs
BeautifulSoup: web scraping
Jupyter was developed in 2014.
Its name is a reference to the three core programming languages it supports: Julia, Python, and R.
Today it supports many more languages (SQL, Ruby, …).
Jupyter Notebooks
JupyterLab
JupyterHub
Requires:
pip install jupyter
To use in VS Code, you need the “Jupyter” extension.
If you are using a “.venv” environment, you’ll have to install jupyter in that environment.
In VS Code,
pip install jupyter in the terminal
Jupyter notebooks have a unique extension: “.ipynb”
Simply create an empty file with that extension and VS Code will recognize it as a Jupyter notebook.
Jupyter Notebooks execute code by sending it to one of many possible “kernels”.
You can choose as your kernel:
Using a language as a kernel requires some setup for each language.
Python can be used as a kernel once the jupyter package is installed.
I recommend using your Python environment “.venv” as your kernel.
To use Julia as a kernel, you first need to install
IJulia package
Here it is probably easiest to just install IJulia system-wide.
To use R as a kernel, you first need to install
IRkernel package
Here it is probably easiest to just install IRkernel system-wide.
You should also run the following in R to finish the setup: IRkernel::installspec()
Whenever you open a Jupyter Notebook you will be able to choose the kernel you want to use.
Jupyter notebooks have two types of cells:
Markdown cells allow you to write and render markdown.
You can actually do this without any kernel attached.
All of the usual markdown formatting is allowed (headers, links, bullets, etc.)
Code cells are where you write code.
Each code cell can be executed individually.
Output, errors, and warnings are displayed after the individual code cell.
.ipynb files
The “interactive python notebook” files are actually just JSON files.
JSON is a common file format.
You can always open up a Jupyter notebook with a basic text editor.
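To see what "just JSON" means, here is a sketch that builds a hand-written, minimal notebook-like structure and reads it back with Python's json module. The cell contents are made up, and real notebooks written by Jupyter carry more metadata than this:

```python
import json

# A minimal notebook: a list of cells plus some format information.
notebook = {
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# My analysis"],
        },
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["print('Hello')"],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

# This text is what would be stored in the .ipynb file.
text = json.dumps(notebook, indent=1)

# Reading it back gives ordinary Python dictionaries and lists.
loaded = json.loads(text)
cell_types = [cell["cell_type"] for cell in loaded["cells"]]
print(cell_types)
```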
A key feature of Jupyter notebooks is
This means
Two options for editing Jupyter Notebooks:
in the browser
in VS Code
This is the default built in to the jupyter package.
In your workspace run:
This will launch an HTTP server in the terminal.
And it will open a window in your browser with the editor.
Once you have installed the Jupyter VS Code extension
Behind the scenes, VS Code will launch the kernel as an HTTP server, send the code to it, and bring back the results.
Both have code and markdown “chunks” (cells).
Jupyter Notebooks
Quarto
Quarto can compile a Jupyter notebook into any of its output formats.
This allows you to quickly turn your Jupyter Notebooks into pdf reports, or website pages, etc.
Alternatively, you can use Jupyter (instead of R) as the engine for Quarto.
rmarkdown: R
Pluto.jl: Julia
Quarto: Julia, Python, R
Jupyter Notebooks: Julia, Python, R
Jupyter’s big difference: output is included in the notebook file.
I love to use Quarto (and before that rmarkdown).
But I think your research project should not be in a notebook.
jupyter notebook