Github Actions

Running our code and tests on Github

Matthew DeHaven

March 31, 2024

Lecture Summary

  • What are Github Actions?
  • Workflows
    • jobs
    • steps
    • events
    • runners

Github Actions

Github Actions

Automate, customize, and execute your software development workflows right in your repository with GitHub Actions. You can discover, create, and share actions to perform any job you’d like, including CI/CD, and combine actions in a completely customized workflow. - Github Action documentation

Github Actions in summary

Allow you to execute code on a remote server hosted by Github.

  • Can be configured to execute on certain events (e.g. whenever you push to Github)
  • Can execute code in your project
  • Or can execute prebuilt “actions” from Github or other parties
  • The servers can run Linux, Windows, or MacOS

Github Actions Tab

There is a tab for Github Actions for every repository.

Github Actions Billing

You are running code on someone else’s server, so there is a limit to how much you can run. Github Action Billing.

But, Github Actions are free for public repositories.

And, you should have 2,000 minutes of run time per month for private repositories on a free account.

  • You will have more if you sign up for the free Student Developer Pack.

So practically, you can run most things without a worry.

Github Action Compute Power

By default, Github Actions will run on a server with

  • 3-4 CPU cores
  • 16 GB memory
  • 16 GB hard drive storage

Which is to say, these server resources are not super big. Probably your computer will be faster at everything.

But they also should be big enough to run most projects.

Upgrading Github Action Compute

You can upgrade to Github Actions running on servers that allocate more resources to you.

Up to

  • 64 CPU cores
  • 256 GB memory
  • 2 TB hard drive

But this will start costing actual money to do (Github Larger Runners). You should probably be looking at running your code on Brown’s HPC if you need something close to this size.

Github Actions Overview

Github Actions is a service that allows you to run workflows.

A Workflow Diagram:

from the Github Actions documentation

Events

Events are what trigger a workflow to run.

  • Could be a Git event
    • push to Github
    • pull request, etc.
  • Can be triggered manually
  • Or set to run at specified time intervals (e.g. once a day)
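
As a rough sketch, the three kinds of events above could be combined in a single workflow file like this (the syntax is walked through later in the lecture; the time of day chosen here is just an assumption for illustration):

```yaml
on:
  push:                  # Git event: someone pushes to the repository
  pull_request:          # Git event: a pull request is opened or updated
  workflow_dispatch:     # manual trigger from the repository's Actions tab
  schedule:
    - cron: '0 12 * * *' # once a day at 12:00 UTC
```

Any one of these events on its own is enough to start the workflow.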

Runners

A runner is a server—hosted by Github—that will run your jobs.

There is always one runner for each job.

They are virtual machines that can have Ubuntu Linux, Microsoft Windows, or macOS operating systems.

They default to a small amount of computing power, but can be upgraded.

Jobs

Jobs are a set of steps to be run.

One job gets assigned to each runner.

Jobs can be run in parallel (default) or in sequence.

A workflow could have one or more jobs.
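
A minimal sketch of parallel versus sequential jobs, assuming three made-up job names (build, test, report). Jobs run in parallel unless one lists another under needs:, in which case it waits.

```yaml
on: [push]
jobs:
  build:                   # starts right away
    runs-on: ubuntu-latest
    steps:
      - run: echo "building"
  test:                    # also starts right away, in parallel with build
    runs-on: ubuntu-latest
    steps:
      - run: echo "testing"
  report:                  # waits until build and test have both succeeded
    needs: [build, test]
    runs-on: ubuntu-latest
    steps:
      - run: echo "runs after build and test"
```

Each of the three jobs still gets its own runner.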

Steps

Steps are the actual commands given to the runner (the server).

Steps can be either:

  • A shell script (i.e. commands sent to the command line)
  • an action

Steps are where we will say “run this code” or “execute this R script”

Actions

This is not to be confused with Github Actions which is the name of the whole service.

An action is…

  • a custom application that performs a complex and frequently repeated task

Basically, an action performs many steps (kind of like a function call).

Github provides some default actions, and you can use actions written by other users.

Example Github Workflow

name: learn-github-actions
run-name: ${{ github.actor }} is learning GitHub Actions
on: [push]
jobs:
  check-bats-version:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '14'
      - run: npm install -g bats
      - run: bats -v

We are going to look at each piece of this in turn.

I am not trying to have you learn how to write one of these. In general, you will be copying and maybe lightly editing ones that already exist. The goal is to be able to read one.

Example Github Workflow

name: learn-github-actions
  • Optional
  • Sets the name of the workflow
  • This is how it will appear in your Github repository’s “Actions” tab

Example Github Workflow

run-name: ${{ github.actor }} is learning GitHub Actions
  • Also optional
  • This is the name shown for each individual run of the workflow
  • The ${{ github.actor }} is grabbing your Github username as a variable
  • In general, whenever you see ${{ }}, that is referencing a variable
    • defined by the workflow
    • defined by default by Github (i.e. your username)
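
A small sketch of both kinds of variable in one place. GREETING and the job name are made up for illustration; github.actor and runner.os are values Github provides.

```yaml
on: [push]
env:
  GREETING: hello   # hypothetical variable defined by this workflow
jobs:
  say-hello:
    runs-on: ubuntu-latest
    steps:
      # env.GREETING comes from the workflow; github.actor and runner.os come from Github
      - run: echo "${{ env.GREETING }} ${{ github.actor }}, this is running on ${{ runner.os }}"
```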

Example Github Workflow

on: [push]
  • Event trigger
  • This workflow will trigger whenever there is a “push” to this Github repository

You could instead run the workflow every day at 5:30 am (scheduled times are in UTC):

on:
  schedule:
    - cron:  '30 5 * * *'

You could also trigger the workflow only when certain files are edited, or when the repository is forked. There are many options.
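
For instance, a sketch of a trigger that only fires when a pushed commit touches certain files. The file patterns here are assumptions chosen for an R project.

```yaml
on:
  push:
    paths:
      - '**.r'       # any file ending in .r, in any folder
      - 'renv.lock'  # or the renv lockfile
```

A push that only edits, say, the README would not start this workflow.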

Example Github Workflow

jobs:
  check-bats-version:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '14'
      - run: npm install -g bats
      - run: bats -v
  • This workflow has only one job
  • So it will be run on only one runner
  • The job is named “check-bats-version”

Example Github Workflow

jobs:
  check-bats-version:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '14'
      - run: npm install -g bats
      - run: bats -v
  • The job runs-on “ubuntu-latest”
  • “ubuntu-latest” = linux
  • “macos-latest” = macos
  • “windows-latest” = windows

You can also choose to run on a specific prior version of an OS.
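
A sketch of pinning a specific version. The label ubuntu-22.04 is one that Github has offered, but the set of available labels changes over time, so check the current list before copying this.

```yaml
jobs:
  check-bats-version:
    runs-on: ubuntu-22.04   # a fixed Ubuntu release rather than whatever "latest" currently means
```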

Example Github Workflow

steps:
  - uses: actions/checkout@v4
  - uses: actions/setup-node@v4
    with:
      node-version: '14'
  - run: npm install -g bats
  - run: bats -v

This job has 4 steps (count the dashes).

Remember, the steps are the actual commands being run on the server.

Example Github Workflow

  - uses: actions/checkout@v4

The uses shows that this is running an action

actions/checkout@v4 is a default Github action

  • It checks out the current Github repository
    • i.e. this clones your repository onto the server’s virtual machine
    • Most workflows will start with this

Example Github Workflow

  - uses: actions/setup-node@v4
    with:
      node-version: '14'

The next step is also an action.

actions/setup-node@v4 is a default Github action

  • It installs and sets up “node.js” which allows you to run javascript
    • the with: keyword sets an option for the action
    • here, it is specifying the version of “node.js” to install

When runners launch a virtual machine for us, they only have the operating system installed (and some other basics). Anything else we want to use we have to install with an action or install ourselves.

Example Github Workflow

  - run: npm install -g bats
  - run: bats -v

These last two steps are not actions, but just shell commands. You can see that by the run keyword.

npm install -g bats is a command telling npm, the “node.js” package manager, to install the bats package.

bats -v is a command asking for the version of the package bats.

Example Github Workflow

jobs:
  check-bats-version:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '14'
      - run: npm install -g bats
      - run: bats -v

Now we know how this defines a Github workflow.

Let’s see where we put this in Github.

Setting Up a Github Workflow

Github Workflows live in a specific folder in your repository:

“.github/workflows/”

Each workflow is defined by a “yaml” file.

  • “main.yml”
  • “test.yml”

Once you create this folder and put a “yaml” file in it, Github will run the workflow defined by that file whenever its trigger event occurs.

Setting up a Github Workflow

For your first workflow, it may be easier to define it through Github.

This is how you will do your assignment and also what is detailed in the coding example.

Github Actions for R

Using Github Actions with R

We have seen how Github Actions can run code for us on virtual machines.

How do we get it to run R code?

We first have to tell it to install R, then give it R code to run.

Example Github Action for R

This is a very basic Github workflow that runs print("hello world") in R.

on: [push]
name: Run R code example

jobs:
  run-some-R-code:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - uses: r-lib/actions/setup-r@v2

      - run: Rscript -e 'print("hello world")'

Example Github Action for R

First, we had to install R on the virtual machine.

Thankfully, there are a set of actions defined by the R community that make this easy and fast.

  - uses: r-lib/actions/setup-r@v2

Example Github Action for R

Then, we simply executed our R command.

  • The Rscript -e 'command' allows you to run any one-line R command
  - run: Rscript -e 'print("hello world")'

Example Github Action for R

on: [push]
name: Run R code example

jobs:
  run-some-R-code:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - uses: r-lib/actions/setup-r@v2

      - run: Rscript -e 'print("hello world")'

But what if we want to run more than one line?

Example Github Action for R script

Usually we will have R scripts written in our repository that we want to run.

Let’s assume we have a “main.r” file. We can run it with:

on: [push]
name: Run R code example

jobs:
  runMain:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - uses: r-lib/actions/setup-r@v2

      - name: Run main.r
        run: |
          Rscript main.r

Example Github Action for R script

  - name: Run main.r
    run: |
      Rscript main.r

This is still a single step.

The name: line just names the step and is optional.

The run: line is broken up into multiple lines with the | symbol.

And then an R script can be run by calling Rscript name-of-file.r.
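
Because of the | symbol, a single step can hold several commands, one per line. A sketch, assuming a second, made-up script called “clean-data.r” sits next to “main.r”:

```yaml
      - name: Run two scripts
        run: |
          Rscript clean-data.r
          Rscript main.r
```

The commands run in order, and the step fails if one of them fails. File names are case sensitive on the Linux runners, so the name in the workflow has to match the file exactly.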

What about Packages?

Remember, these virtual machines come with nothing installed.

Which means we don’t have access to any packages.

A couple of options:

  1. Write a script to install all the packages by calling install.packages() or renv::install()
  2. Use renv to create a lockfile, and then simply run renv::restore()
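
Option 1 might look something like the step below. The package names are placeholders, and the CRAN mirror is spelled out here as an assumption so the command does not depend on any default.

```yaml
      - name: Install packages
        run: |
          Rscript -e 'install.packages(c("dplyr", "ggplot2"), repos = "https://cloud.r-project.org")'
```

This installs whatever the latest versions are on the day the workflow runs, which is the main weakness of this option.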

Option 2 is far better than option 1. In fact, there is an r-lib action that will restore an renv environment for us.

Example Github Action for R + renv

on: [push]
name: Run Main.r

jobs:
  RunMain:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - uses: r-lib/actions/setup-r@v2

      - uses: r-lib/actions/setup-renv@v2

      - name: Run main.r
        run: |
          Rscript main.r

This will restore our environment to whatever set of packages is listed in the lockfile.

It goes beyond that and will cache the packages, so the next time our workflow runs, it’s much faster.

More Pre Built Actions

Pre-built Github Actions for R

  • with all the “r-lib” community actions.

They also have a set of example workflows:

Example Github Action Workflows for R

Continuous Integration (CI)

Continuous integration (CI) is a programming practice / framework.

The idea is that a team of developers write their own sections of code separately, but continually integrate their code to a common repository.

  • This common repository then automatically builds and tests the code.

This is in contrast to a system where developers write their code on their own machine, then everyone merges their code together at the end and tries to fix any errors then.

You will sometimes see CI/CD, for Continuous Integration/Continuous Deployment, which adds that the main repository of code is automatically shipped/deployed so customers/other people can use it.

CI with Github Actions

Github allows users to effectively have a CI practice for their code.

If working with multiple users, they can all share a common repository.

And Github Actions can build the code and run tests automatically.

CI for Academic Research Projects

How could CI be useful for us?

  • Run tests automatically
  • “build” your entire research project automatically
    • i.e. run the “main.r” file, build the results from scratch
  • Enforces reproducibility
    • checks multiple operating systems and starts from a machine with nothing installed
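
One way to check several operating systems is a matrix, which launches one copy of the job per entry. This is a sketch that assumes the project has a “main.r” file and an renv lockfile; the job name is made up.

```yaml
on: [push]
name: Run main.r on several operating systems

jobs:
  run-main:
    runs-on: ${{ matrix.os }}   # filled in once per entry in the matrix below
    strategy:
      fail-fast: false          # let the other systems finish even if one fails
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
    steps:
      - uses: actions/checkout@v4
      - uses: r-lib/actions/setup-r@v2
      - uses: r-lib/actions/setup-renv@v2
      - run: Rscript main.r
```

This produces three jobs, each on its own fresh runner.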

Github Actions I Have Used

Here are a few Github Actions I have used for projects:

  • Run devtools::check() on an R package I was writing
  • Run unit tests for a research project
  • Check the coverage of unit tests for a package
  • Run “main.r” for a project
  • Compile rmarkdown documents into pdfs
  • Run styler and lintr, which check your code’s style and formatting
  • Build websites for Github Pages (we will see this later)
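
As one illustration, running unit tests for a research project could be sketched like this. It assumes the tests are written with testthat, live in a folder called “tests”, and that testthat is recorded in the renv lockfile; none of that is specific to the projects listed above.

```yaml
on: [push]
name: Run unit tests

jobs:
  run-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: r-lib/actions/setup-r@v2
      - uses: r-lib/actions/setup-renv@v2
      # stop_on_failure = TRUE makes a failing test fail the step, and so the workflow
      - run: Rscript -e 'testthat::test_dir("tests", stop_on_failure = TRUE)'
```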

Summary

Lecture Summary

  • Github Actions