Julia

Introduction to Julia

Matthew DeHaven

January 13, 2025

Julia Overview

  • Similar to R and Python in syntax
  • Very fast
  • Built-in package and environment manager
  • Unicode support
  • Broadcasting

Julia Background

A relatively new language.

Development began in 2009. Version 1.0 was launched in 2018.

Developed by a group at MIT: “Why We Created Julia”

Designed to be:

  • As readable as R and Python
  • As fast as C

(C is very fast, but hard to read and write)

Julia Installs

Installing Julia

Julia can be installed with…

Running Julia in VS Code

You can run a whole file of Julia code by clicking the play arrow.

Or you can run a single line or selection…

  • hitting Shift+Enter

Both of these will execute the Julia code in a terminal session called a REPL (read-eval-print loop).

Hello World Example

x = "Hello Class!"
print(x)
Hello Class!
a = 0.1
b = 0.1
c = 0.1

print(a + b + c == 0.3)
false
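
The false above comes from floating-point rounding: 0.1 has no exact binary representation, so the sum is very slightly off from 0.3. A minimal sketch of the usual workaround, comparing within a tolerance with isapprox (the variable names total, exact_match, and approx_match are only for illustration):

```julia
a = 0.1
b = 0.1
c = 0.1

total = a + b + c

# Exact comparison fails because of rounding in the last digits.
exact_match = total == 0.3

# isapprox (also available as the infix operator ≈) compares within a tolerance.
approx_match = isapprox(total, 0.3)

println(exact_match)   # false
println(approx_match)  # true
```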

Julia Environments and Packages

Pkg.jl

Julia has a built-in package and environment manager: Pkg.jl

It has its own REPL, which can be launched from Julia by typing a right square bracket: ]

But all of its commands can also be run from within Julia.

  • ex: Pkg.add("DataFrames")

I will mark commands that should be run in the Pkg REPL with: pkg>

Creating an Environment

  • In a Julia REPL, activate the Pkg REPL with: ]

Then you can create and activate an environment:

pkg>
activate myenv

This will create an environment in a new folder, “myenv”. (The folder itself is written to disk once you add a package.)

If you want to use your current folder, run

pkg>
activate .
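
The same activation can be done from regular Julia code with Pkg.activate. The sketch below assumes a temporary folder in place of “myenv” so that it leaves nothing behind; env_dir and project_file are illustrative names, not part of Pkg.

```julia
import Pkg

# A throwaway folder stands in for "myenv" in this sketch.
env_dir = mktempdir()

# Same effect as `pkg> activate <folder>`.
Pkg.activate(env_dir)

# Path of the Project.toml that Pkg commands will now read and write.
project_file = Base.active_project()
println(project_file)
```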

Adding a Package

In Julia, you can install packages either system-wide

  • i.e. in the default installation environment (called something like “v1.10”)

or in an environment you have created.

To add the DataFrames.jl package:

pkg>
add DataFrames

This is equivalent to running the following in Julia:

import Pkg
Pkg.add("DataFrames")

Pkg TOMLs

Once you have a package added to your environment, you will see two files:

  1. “Project.toml”

Lists the packages you have installed for this environment.

  2. “Manifest.toml”

Lists the

  • Julia version
  • Installed package versions
  • Every package dependency version

With these two files, Julia can always recreate your environment (by running instantiate in the Pkg REPL).

Connect Environment to VS Code Workspace

In order to have your VS Code workspace automatically use the correct Julia environment, you have to connect it. There are two ways:

  1. Open the Command Palette (Shift+Cmd+P on macOS, Ctrl+Shift+P on Windows/Linux), then search “Julia: Change Current Environment”
  2. Click on “Julia env:” at the bottom left of the VS Code window

After selecting your environment, VS Code will add a “.vscode/settings.json” file to the folder. This file just contains a path to the environment. You don’t need to commit it with Git.

Using Packages

Once you have a package installed, you can:

  1. Load the entire namespace:
using DataFrames
DataFrame(a = 1:2, b = 3:4)
  2. Import the package and use the . to reference functions
import DataFrames
DataFrames.DataFrame(a = 1:2, b = 3:4)
  3. Import the package and rename it
import DataFrames as DF
DF.DataFrame(a = 1:2, b = 3:4)

Julia Basics

Variable Assignment

Julia uses a single = for assignment

x = 42
42

Just like R, functions can be assigned to new variables:

a = sqrt
a(10)
3.1622776601683795

Basic Math

Math is not that different:

3 + 3
6
3 - 3
0
3 * 3
9
3 / 3
1.0
3 ^ 3 # Power
27
3 % 3  # remainder
0
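
Note that / returned 1.0 above: division with / always produces a floating-point number. A short sketch of the alternatives for integer results (the variable names are only for illustration):

```julia
quotient = 7 / 2        # 3.5, floating point even when both inputs are integers
whole = 7 ÷ 2           # 3, integer division (type \div then Tab)
same_whole = div(7, 2)  # 3, the function form of ÷
remainder = 7 % 2       # 1

println(quotient)
println(whole)
println(same_whole)
println(remainder)
println(typeof(3 / 3))  # Float64
```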

Logic

The logical “and” and “or” operators are written with doubled symbols, && and ||; negation is written with !

1 > 2
false
1 < 2
true
1 > 2 && 1 < 2
false
1 > 2 || 1 < 2
true
!false
true
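
Both && and || short-circuit: the right-hand side is only evaluated when it is needed. A small sketch (x and the result names are illustrative):

```julia
x = -4

# && evaluates its right side only when the left side is true,
# so sqrt(x) is never called with a negative number here.
is_large_root = x >= 0 && sqrt(x) > 1

# || evaluates its right side only when the left side is false.
is_nonpositive_or_small = x <= 0 || sqrt(x) < 1

println(is_large_root)            # false
println(is_nonpositive_or_small)  # true
```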

Julia Data Types

  • Booleans
  • Numbers: integers, floating point
  • Strings
  • Collections:
    • Arrays, Tuples

Making Arrays

A basic array is constructed as comma-separated elements in square brackets

my_array = [1, 5, 2, 8]
4-element Vector{Int64}:
 1
 5
 2
 8

Single dimension arrays are referred to as Vectors.

Array Types

Arrays infer their element type from the elements given at construction.

int_array = [1, 5, 2, 8]
4-element Vector{Int64}:
 1
 5
 2
 8

If even one of the elements is a float at construction, every element is converted to a float:

float_array = [1, 5.2, 2, 8]
4-element Vector{Float64}:
 1.0
 5.2
 2.0
 8.0

We cannot store a float with a fractional part in an existing Int array:

int_array[2] = 9.1
LoadError: InexactError: Int64(9.1)

Stacktrace:
 [1] Int64
   @ ./float.jl:912 [inlined]
 [2] convert
   @ ./number.jl:7 [inlined]
 [3] setindex!(A::Vector{Int64}, x::Float64, i1::Int64)
   @ Base ./array.jl:1021
 [4] top-level scope
   @ In[20]:1
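
The error arises because 9.1 cannot be converted to an Int64 without losing information. A sketch of two ways around it, assuming a Float64 copy of the array is acceptable:

```julia
int_array = [1, 5, 2, 8]

# A float with no fractional part converts exactly, so this assignment works.
int_array[2] = 9.0

# To store 9.1, build a Float64 copy first (the dot applies Float64 to each element).
float_array = Float64.(int_array)
float_array[2] = 9.1

println(int_array)    # [1, 9, 2, 8]
println(float_array)  # [1.0, 9.1, 2.0, 8.0]
```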

Arrays of Any Type

But at construction, arrays can be very flexible on the types included:

a = [1, 3.14, "hi", false]
4-element Vector{Any}:
     1
     3.14
      "hi"
 false

Julia will default to a general type “Any” to allow an integer, float, string, and boolean to exist in the same array.

Enforcing an Array Type

You can be explicit about the type for an array by listing it before the square brackets.

int_array = Int[1, 2, 3]
3-element Vector{Int64}:
 1
 2
 3

If we wanted this to be a Float array:

fl_array = Float64[1, 2, 3]
3-element Vector{Float64}:
 1.0
 2.0
 3.0

Empty Arrays

This is most useful when creating an empty array—which defaults to type Any.

empty_array = []
Any[]

Empty array with a type:

string_array = String[]
String[]
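
A typed empty array is typically filled later with push!. The sketch below shows that the declared element type is enforced as elements are added (the values and the name rejected are illustrative):

```julia
string_array = String[]

push!(string_array, "alpha")
push!(string_array, "beta")
println(string_array)  # ["alpha", "beta"]

# Pushing an integer onto a String array raises an error.
rejected = try
  push!(string_array, 1)
  false
catch
  true
end

println(rejected)  # true
```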

Array Dimensions

You can make…

  • arrays of multiple dimensions (2d: matrices, Nd).
m = [[1, 2, 3] [4, 5, 6]]
3×2 Matrix{Int64}:
 1  4
 2  5
 3  6
  • Arrays of arrays:
aa = [[1, 2, 3], [4, 5, 6]]
2-element Vector{Vector{Int64}}:
 [1, 2, 3]
 [4, 5, 6]
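
Matrices can also be written directly, with spaces separating columns and semicolons separating rows. A brief sketch comparing that syntax with the side-by-side form used above:

```julia
# Spaces separate columns and semicolons separate rows.
m = [1 2 3; 4 5 6]

println(size(m))   # (2, 3)
println(m[2, 3])   # 6 (row 2, column 3)

# Column vectors separated by a space are placed side by side.
side_by_side = [[1, 2, 3] [4, 5, 6]]

println(size(side_by_side))                # (3, 2)
println(side_by_side == permutedims(m))    # true
```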

Tuples

Tuples are constructed as comma-separated elements in parentheses.

my_tuple = (1, "hi", 2.2, false)
(1, "hi", 2.2, false)

Tuples are immutable; arrays are mutable.

  • i.e. you cannot change values of tuples once created, or add additional elements
my_tuple[1] = 4
LoadError: MethodError: no method matching setindex!(::Tuple{Int64, String, Float64, Bool}, ::Int64, ::Int64)

Stacktrace:
 [1] top-level scope
   @ In[29]:1
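
Since a tuple cannot be changed in place, the usual approach is to build a new one. A minimal sketch (new_tuple, first_val, and second_val are illustrative names):

```julia
my_tuple = (1, "hi", 2.2, false)

# Build a new tuple with a different first element; my_tuple itself is untouched.
new_tuple = (4, my_tuple[2:end]...)

# Tuples can be unpacked into separate variables.
first_val, second_val = new_tuple

println(my_tuple)    # (1, "hi", 2.2, false)
println(new_tuple)   # (4, "hi", 2.2, false)
println(first_val)   # 4
println(second_val)  # hi
```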

Julia Indexing

Julia starts indexing from 1.

x = ["a", "b", "c"]
x[1]
"a"

For Loops

Julia is not indentation sensitive.

Instead of relying on indentation, a for loop is closed with the end keyword.

for i in 1:10
  print(i)
end
12345678910

Leaving out the indentation is fine (though ugly).

for i in 1:10
print(i)
end
12345678910
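
A for loop can iterate over any collection, not only ranges. The sketch below wraps an accumulating loop in a function (sum_to is a hypothetical helper) and uses enumerate to get both the position and the value:

```julia
# Hypothetical helper: add up the integers from 1 to n with a loop.
function sum_to(n)
  total = 0
  for i in 1:n
    total += i
  end
  return total
end

println(sum_to(10))  # 55

# enumerate yields (index, value) pairs, with the index starting at 1.
for (index, letter) in enumerate(["a", "b", "c"])
  println(index, ": ", letter)
end
```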

Julia Functions

Julia functions are defined with the keywords function and end:

function hi()
  print("Hello!")
end

hi()
Hello!

You can define arguments and return values.

function my_function(a, b)
  c = a ^ 2 + b ^ 2
  return c
end

my_function(3, 4)
25

Short Function Declaration

For one-line functions, you can define them with:

f(x, y) = x + y * x

f(2, 3)
8

Fun Feature: Unicode Support

Julia has built-in Unicode support.

Unicode is a text encoding standard that supports characters from all major writing systems and emojis!

print("Hi! 😃")
Hi! 😃

But what’s really fun is that you can use emojis as variables too…

😃 = 4
💛 = 2


f(😃, 💛)
12

Practical Unicode

Where this is actually convenient is in writing Greek letters for equations:

β = 0.99
γ = 1.2

U(C) = β * (1 / γ) * C ^ γ

U(10)
13.075368837804186

It is a lot tidier than writing out “beta” and “gamma”.

Writing out Unicode

In VS Code, to get one of these Unicode symbols, you type:

\:smiley: for 😃

Once you start typing \:smi… you can press Tab to autocomplete.

Similarly,

\beta for β

Vectorization in Julia

Julia can vectorize (broadcast) any function, which is incredibly useful.

  • By default, Julia functions aren’t vectorized:
abs([-7, 2, -5])
MethodError: no method matching abs(::Vector{Int64})

Closest candidates are:
  abs(::Bool)
   @ Base bool.jl:153
  abs(::Pkg.Resolve.VersionWeight)
   @ Pkg /opt/homebrew/Cellar/julia/1.10.2/share/julia/stdlib/v1.10/Pkg/src/Resolve/versionweights.jl:32
  abs(::Missing)
   @ Base missing.jl:101
  ...


Stacktrace:
 [1] top-level scope
   @ In[39]:1
  • To apply the abs function to each element, add the . operator:
abs.([-7, 2, -5])
3-element Vector{Int64}:
 7
 2
 5

Broadcasting Your Own Functions

Broadcasting with the . operator works for any function, including your own:

f(x) = x ^ 2 + 7 - x

f.(1:6)
6-element Vector{Int64}:
  7
  9
 13
 19
 27
 37
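
Broadcasting also works with several arguments. In the sketch below, weighted is a hypothetical two-argument function; a scalar argument is reused for every element, and the @. macro adds the dots to a whole expression at once.

```julia
# Hypothetical two-argument function, used only for illustration.
weighted(x, w) = w * x ^ 2

# A scalar argument is reused for every element of the range.
with_scalar = weighted.(1:3, 10)
println(with_scalar)   # [10, 40, 90]

# Two collections of the same length are paired element by element.
paired = weighted.(1:3, [1, 2, 3])
println(paired)        # [1, 8, 27]

# @. puts a dot on every call and operator in the expression.
xs = [1, 2, 3]
dotted = @. xs ^ 2 + 7 - xs
println(dotted)        # [7, 9, 13]
```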

Broadcasting with Math Operators

By default, Julia won’t multiply two vectors element by element:

vec1 = [1, 4, 3]
vec2 = [1, 2, 7]

vec1 * vec2
MethodError: no method matching *(::Vector{Int64}, ::Vector{Int64})

Closest candidates are:
  *(::Any, ::Any, ::Any, ::Any...)
   @ Base operators.jl:587
  *(::LinearAlgebra.Adjoint{<:Number, <:AbstractVector}, ::AbstractVector{<:Number})
   @ LinearAlgebra /opt/homebrew/Cellar/julia/1.10.2/share/julia/stdlib/v1.10/LinearAlgebra/src/adjtrans.jl:462
  *(::Union{LinearAlgebra.Adjoint{<:Any, <:StridedMatrix{T}}, LinearAlgebra.Transpose{<:Any, <:StridedMatrix{T}}, StridedMatrix{T}}, ::StridedVector{S}) where {T<:Union{Float32, Float64, ComplexF64, ComplexF32}, S<:Real}
   @ LinearAlgebra /opt/homebrew/Cellar/julia/1.10.2/share/julia/stdlib/v1.10/LinearAlgebra/src/matmul.jl:50
  ...


Stacktrace:
 [1] top-level scope
   @ In[42]:4

Instead, use broadcasting to get this behavior:

vec1 .* vec2
3-element Vector{Int64}:
  1
  8
 21
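
The plain * is reserved for linear-algebra products, which is why it is not defined for two column vectors. As a sketch of the difference, the inner (dot) product from the LinearAlgebra standard library equals the sum of the elementwise products:

```julia
using LinearAlgebra  # standard library, provides dot

vec1 = [1, 4, 3]
vec2 = [1, 2, 7]

elementwise = vec1 .* vec2   # [1, 8, 21]
inner = dot(vec1, vec2)      # 1*1 + 4*2 + 3*7 = 30

println(elementwise)
println(inner)
println(sum(elementwise) == inner)  # true
```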

DataFrames

Working with Data

DataFrames is a package that implements dataframes in Julia.

First, you need to install DataFrames, then import it.

pkg>
add DataFrames

And then in your Julia code:

using DataFrames

DataFrame Example

Constructing a dataframe by hand:

df = DataFrame(
  a = [1, 5, 6, 8],
  b = ["Y", "Y", "N", "N"],
  c = [true, false, false, true]
)
4×3 DataFrame
 Row │ a      b       c
     │ Int64  String  Bool
─────┼──────────────────────
   1 │     1  Y        true
   2 │     5  Y       false
   3 │     6  N       false
   4 │     8  N        true

The “Bang!” Convention

By convention, Julia functions that modify their arguments have names ending with a bang: !

For example, if we want to add a row to the DataFrame:

push!(df, [2, "Y", false])
5×3 DataFrame
 Row │ a      b       c
     │ Int64  String  Bool
─────┼──────────────────────
   1 │     1  Y        true
   2 │     5  Y       false
   3 │     6  N       false
   4 │     8  N        true
   5 │     2  Y       false
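
The bang convention is not specific to DataFrames. A small sketch with the built-in sort and sort! shows the difference between returning a new value and modifying the input:

```julia
v = [3, 1, 2]

sorted_copy = sort(v)  # returns a new sorted vector; v is unchanged
println(v)             # [3, 1, 2]
println(sorted_copy)   # [1, 2, 3]

sort!(v)               # sorts v in place
println(v)             # [1, 2, 3]
```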

Data Science with DataFrames

All of the data science operations we saw with dplyr and data.table are possible with Julia and DataFrames.jl.

  • group by
  • summarize
  • adding new columns
  • etc.

For reading and writing CSVs, use the CSV.jl package.
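
As a rough sketch of what these operations can look like, the example below reuses the hand-built dataframe from earlier, adds a column, and summarizes by group. It assumes DataFrames is installed in the active environment; the column names a_doubled, a_mean, and n are chosen only for illustration.

```julia
using DataFrames
using Statistics  # standard library, provides mean

df = DataFrame(
  a = [1, 5, 6, 8],
  b = ["Y", "Y", "N", "N"],
  c = [true, false, false, true]
)

# Add a new column by broadcasting over an existing one.
df.a_doubled = df.a .* 2

# Group by column b, then summarize column a within each group.
summary_df = combine(groupby(df, :b), :a => mean => :a_mean, nrow => :n)

println(df)
println(summary_df)
```

Here groupby plays the role of group_by in dplyr, and combine plays the role of summarize.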

Summary

Julia Overview

  • Similar to R and Python in syntax
  • Very fast
  • Built-in package and environment manager
  • Unicode support
  • Broadcasting