R Packages

Documenting and storing your functions

Matthew DeHaven

March 12, 2025


Lecture Outline

  • Storing Functions
    • Single file
    • Multiple files
    • In a package
  • Packages
    • Development
    • Sharing

Storing Functions

Function Review

We can declare new functions in R with the following structure:

myfunc <- function(a, b) {

  ## Do something
  result <- a + b * a
  
  return(result)
}
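As a quick check of how this function behaves, the sketch below calls myfunc() with a scalar and with a vector. The input values are arbitrary; the point is that b * a is evaluated before the addition, and that the arithmetic is applied element by element.

```r
myfunc <- function(a, b) {
  result <- a + b * a
  return(result)
}

# Scalar inputs: 2 + 3 * 2
single <- myfunc(2, 3)

# Vector input for a: each element is transformed separately
several <- myfunc(1:3, 2)

print(single)
print(several)
```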

Functions Review

We saw last class

  • How to write functions
  • How to apply functions to vectors or lists using purrr::map()
  • How to parallelize functions using furrr::future_map()

At the end of class, we briefly saw how to source() a file of functions.

Sourcing Code in R

We can always split one R file into multiple files.

This is a convenient way to structure your code and keep it organized.

We can then use source() to load the code from the other files.
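A minimal sketch of what source() does, using a temporary file as a stand-in for a script such as code/helpers.R. The function name add_one() is hypothetical and only used for illustration.

```r
# Write a small script to a temporary file (stand-in for code/helpers.R)
helper_path <- tempfile(fileext = ".R")
writeLines(
  c(
    "add_one <- function(x) {",
    "  x + 1",
    "}"
  ),
  helper_path
)

# source() runs every line of the file in the current session,
# so any functions declared in it become available afterwards
source(helper_path)

answer <- add_one(41)
print(answer)
```

Because source() simply runs the file, anything else in the script (loading data, printing, plotting) would run too, which is one reason to keep helper files limited to function definitions.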

A Possible Folder Structure

Here is a possible folder structure for a project:

Project
|-- Readme.md
|-- main.R
|-- renv.lock
|-- code
|   |-- clean_data.R
|   |-- analysis.R
|   `-- helpers.R
|-- raw_data
`-- output

A Possible Folder Structure

Our “main.R” script could then look something like this:

main.R
renv::restore()

## Load custom helper functions
source("code/helpers.R")

source("code/clean_data.R")

source("code/analysis.R")

This is a good way to keep your code organized and easy to read.

“helpers.R” File

Our “helpers.R” file could look something like this:

helpers.R
## This is a file with all of our custom helper functions

## A function that does x and y
myfunc1 <- function(a, b) {
  result <- a + b * a
  return(result)
}

## Another function that does z
myfunc2 <- function(a, b) {
  result <- a * b
  return(result)
}

## A third function that does w
myfunc3 <- function(a, b) {
  result <- a / b
  return(result)
}

Problems with a Single Helper File

If you have a lot of custom functions, the helpers.R file can get really long.

This makes finding and editing functions harder.

And it makes your code less readable.

A possible solution:

  • Split your functions into multiple files.

A Folder of Functions

Now we can split each of our helper functions into their own files.

Project
|-- Readme.md
|-- main.R
|-- renv.lock
|-- code
|   |-- clean_data.R
|   |-- analysis.R
|   `-- helpers
|       |-- myfunc1.R
|       |-- myfunc2.R
|       `-- myfunc3.R
|-- raw_data
`-- output
main.R
renv::restore()

## Load custom helper functions
source("code/helpers.R")

source("code/clean_data.R")

source("code/analysis.R")

But how do we load all of these scripts into “main.R”?

A Folder of Functions

This is easy with list.files() and lapply().

main.R
renv::restore()

## Load all custom helper functions
lapply(list.files("code/helpers", full.names = TRUE), source)

source("code/clean_data.R")

source("code/analysis.R")

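list.files() returns a character vector of the files at a location. With full.names = TRUE, it returns paths that can be passed directly to source(). It also has options for filtering by file name (pattern) and for listing files in subfolders (recursive).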
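The filtering and recursive options can be sketched as follows. The folder and the functions double_it() and halve_it() are made up for this example and live in a temporary directory rather than in a real project.

```r
# Build a throwaway folder that mimics code/helpers
helpers_dir <- file.path(tempdir(), "helpers_demo")
dir.create(file.path(helpers_dir, "extra"), recursive = TRUE, showWarnings = FALSE)

writeLines("double_it <- function(x) x * 2", file.path(helpers_dir, "double_it.R"))
writeLines("halve_it <- function(x) x / 2", file.path(helpers_dir, "extra", "halve_it.R"))
writeLines("Notes, not code.", file.path(helpers_dir, "notes.txt"))

# Keep only files ending in .R, and also look inside subfolders
r_files <- list.files(
  helpers_dir,
  pattern = "\\.R$",
  full.names = TRUE,
  recursive = TRUE
)
print(basename(r_files))

# Source each file; invisible() hides the list that lapply() returns
invisible(lapply(r_files, source))

print(double_it(4))
print(halve_it(4))
```

Filtering with pattern avoids accidentally calling source() on a file that is not R code, such as the notes.txt file here.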

Live Coding Example

  • Set up the folder structure
  • See how source() works
  • See how list.files() works

Packages

What are Packages?

We have used many different R packages throughout the course.

Packages provide

  • R functions
  • documentation for those functions
  • possibly some sample data

We have seen how you can install packages remotely from CRAN using renv::install() or install.packages().

Other Package Sources

You can also install packages that are stored either

  • on GitHub
  • or locally, on your machine.

This can be great for installing packages in development, or for packages you write.

Writing Your Own Package

Why would you want to write a package?

  • additional documentation
  • usable across multiple projects
  • sharable with others.

The Best Resource for Writing R Packages

If you want to write a package, read R Packages (2e) by Hadley Wickham and Jennifer Bryan.

It starts with a simple example package.

Then goes into detail about every element of an R package.

I will try to give you the highlights today, but we won’t cover everything.

Packages

  1. Package Development
  2. Using Your Package
  3. Sharing Your Package

We will be developing the package in a different folder/project than the project where we install/load the package.

Package Development: devtools

First, we will install the devtools package.

renv::install("devtools")

This package has a bunch of useful “tools” for “developing” packages.

It will also install a package, usethis, which helps create some templates for us.

Package Development: Create a Package

First, navigate to where you would like your package to live.

For creating a subfolder package,

usethis::create_package("packageName")

or for creating a package for your current folder/repository,

usethis::create_package(".")

"." is a filepath that means “here”, which is your current working directory (folder).

Package Development: Default Files

The create_package() function will create some folders and files needed to structure a package.

  • .Rbuildignore lists files to ignore when building the package
  • .gitignore
  • DESCRIPTION file for metadata about your package
  • NAMESPACE file listing dependencies and functions exported
  • R/ a folder where we will put all of our functions
  • packageName.Rproj for RStudio projects, we won’t use it

Package Development: Adding Functions

To add a function to our package, we simply declare it in the R/ folder in a new R script.

myfunc.R
myfunc <- function(a, b) {
  result <- a + b * a
  return(result)
}

In general, you should have one .R file for each function, or at least for each family of very similar functions.

Package Development: Loading the Package

As you are writing your functions, you will want to load and test your package.

You can do that by loading the package with devtools::load_all() in the R console.

Then you can run and test your functions.

Later, we will look at installing and loading your package into a separate project, but this load_all() is for testing during development.

Package Development: roxygen2

roxygen2 takes the comments we write just before our function and translates them into documentation files for us.

This is a lot easier than writing the documentation files ourselves.

After we’ve added the comments to myfunc.R, we simply call

devtools::document()

Package Development: Documenting Functions

crra.R
#' CRRA Utility
#' 
#' Function to calculate constant
#' relative risk aversion utility.
#'
#' @param c consumption
#' @param gamma relative risk aversion.
#' @param beta discount factor.
#' Default is 0.99.
#'
#' @return A numerical vector
#' @export
#'
#' @examples
#' crra(1:10, 0.5)
crra <- function(c, gamma, beta = 0.99) {
  utility <-
    beta * c ^ (1 - gamma) / (1 - gamma)
  return(utility)
}

Here is an example of roxygen2 style documentation.

We will look at each part separately.

Package Development: Documenting Title

The first comment line in crra.R, #' CRRA Utility, is the title of the function. It should be one line and short.

Package Development: Documenting Description

The paragraph after the title in crra.R ("Function to calculate constant relative risk aversion utility.") is the description of the function. It can be one line, or many.

Package Development: Documenting Parameters

Each @param line in crra.R documents one parameter: the parameter name, followed by its description. The description can be one line or many.

Often used to note expected input types (data.frame, numeric, etc.).

Package Development: Documenting Return Values

The @return line in crra.R describes the value the function returns.

Usually notes the data type of the object.

Package Development: Documenting Export

The @export line in crra.R tells roxygen2 to export the function to the package’s namespace.

If not included, this function will not be available when the package is installed and loaded.

Some functions are not exported because they are only used internally.
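A sketch of what an internal function might look like. The helper check_consumption() is hypothetical and not part of the crra.R example; it has no @export tag, and it uses the roxygen2 tag @noRd so that no help page is generated for it.

```r
# Internal helper: documented for developers, but not exported,
# so it would not be available to users who load the package.

#' Check that consumption is positive
#'
#' @param c consumption
#' @noRd
check_consumption <- function(c) {
  if (any(c <= 0)) {
    stop("Consumption must be positive.")
  }
  invisible(c)
}

# Valid input passes through silently
check_consumption(c(1, 2, 3))

# Invalid input raises an error; here we capture its message
message_text <- tryCatch(
  check_consumption(c(1, -2)),
  error = function(e) conditionMessage(e)
)
print(message_text)
```

An exported function such as crra() could call a helper like this at the top of its body, while users of the package only ever see crra().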

Package Development: Documenting Examples

The lines under @examples in crra.R show example use of the code. Make sure they work when run!

Can have multiple examples.
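For illustration, here is a documentation block with more than one example. The function log_utility() is a hypothetical companion to crra(), invented for this sketch; plain # comments inside the @examples section are a common way to label each example.

```r
#' Log Utility
#'
#' Hypothetical function used only to illustrate
#' a documentation block with several examples.
#'
#' @param c consumption. A positive numeric vector.
#'
#' @return A numeric vector the same length as c.
#' @export
#'
#' @examples
#' # A single value
#' log_utility(1)
#'
#' # A vector of values
#' log_utility(c(1, 2, 10))
log_utility <- function(c) {
  utility <- log(c)
  return(utility)
}

# Running the examples by hand is a quick way to confirm they work
print(log_utility(1))
print(log_utility(c(1, 2, 10)))
```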

Package Development: Adding Metadata

Especially if you are publishing your package for others, you will want to add some metadata for your package.

  • Edit the DESCRIPTION file

You can add things like,

  • description of what the package does
  • authors
  • package title

To add a license, call usethis::use_mit_license() (or a similar license function).
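To give a sense of what this metadata looks like, the sketch below writes a small DESCRIPTION-style file to a temporary folder with base R's write.dcf() and reads it back with read.dcf(). All field values are placeholders, and in practice you would edit the DESCRIPTION file that create_package() made rather than generate one like this.

```r
# Placeholder metadata; every value here is made up for illustration
metadata <- data.frame(
  Package = "packageName",
  Title = "Helper Functions for My Projects",
  Version = "0.0.0.9000",
  Description = "Utility functions shared across projects.",
  License = "MIT + file LICENSE",
  stringsAsFactors = FALSE
)

# DESCRIPTION files use the "field: value" DCF format
description_path <- file.path(tempdir(), "DESCRIPTION")
write.dcf(metadata, file = description_path)

# Show the file as it would appear in a text editor
cat(readLines(description_path), sep = "\n")

# Read the fields back into a character matrix
fields <- read.dcf(description_path)
print(fields[1, "Package"])
```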

Package Development: Dependencies

If you want to use another package’s functions in your package, first call usethis::use_package("thatPackage") in the R console.

  • This adds thatPackage to the “Imports” field of the DESCRIPTION file.

Then you can use that package’s functions in your script:

myfunc.R
myfunc <- function(a, b) {
  result <- a + b * a
  result <- thatPackage::thatFunction(result)
  return(result)
}

  • Always use the explicit function call (package::function()) when writing packages.
    • This ensures behavior is the same regardless of package loading order.
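The reason for explicit calls can be sketched with the stats package, which ships with R. Here a made-up function named sd() masks the real one, as could happen if another package or script defined the same name.

```r
values <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Suppose a script or another package defines its own sd()
sd <- function(x) "not the standard deviation"

# The bare name now finds the masking function
masked <- sd(values)

# The explicit call still reaches the stats package
explicit <- stats::sd(values)

print(masked)
print(explicit)

# Remove the masking function again
rm(sd)
```

Inside a package, writing stats::sd() means the result does not depend on what the user has loaded or defined in their session.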

Package Development: check()

CRAN has a lot of standards that packages have to pass.

You can (and you should) check to see if your package passes by calling devtools::check() in the R console.

It is good to check your package early and often.

Even if you do not plan to submit to CRAN, they are good standards to follow.

Package Development: Unit Tests Preview

Once you start writing packages, you should be writing “unit tests” for each function.

Unit Tests

  • test the behavior of a function by
    • providing specific input to a function
    • comparing the output to expected output
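The idea can be sketched with base R alone, using stopifnot() in place of a testing package. This is only an illustration of the input/expected-output pattern for the crra() function from earlier, not the testing setup covered in the later lecture.

```r
crra <- function(c, gamma, beta = 0.99) {
  utility <- beta * c ^ (1 - gamma) / (1 - gamma)
  return(utility)
}

# Specific input: c = 4, gamma = 0.5, default beta = 0.99
actual <- crra(4, 0.5)

# Expected output, worked out by hand: 0.99 * 4^0.5 / 0.5
expected <- 3.96

# all.equal() compares numbers with a small tolerance,
# which avoids false failures from floating point rounding
stopifnot(isTRUE(all.equal(actual, expected)))

# A second check: the output has one value per input
stopifnot(length(crra(1:10, 0.5)) == 10)
```

If a later edit to crra() changed its behavior, one of these checks would stop with an error, which is exactly the signal a unit test is meant to give.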

We will have a whole lecture later about writing unit tests and how they can be helpful for all programming projects.

Package Development: GitHub Actions Preview

We will also have a lecture on “GitHub Actions” as a form of “Continuous Integration”.

GitHub Actions can…

  • automatically run your unit tests whenever you push to GitHub
  • automatically run check() whenever you push to GitHub
  • automatically run any code you want when you push to GitHub

Packages

  1. Package Development
  2. Using Your Package
  3. Sharing Your Package

We will be developing the package in a different folder/project than the project where we install/load the package.

Accessing Our Package

We can install our package using renv.

renv::install("path-to-my-package/packageName")
library(packageName)

Using Our New Function

We just loaded our own package and our function crra()!

We can now use it in all of our scripts.

?crra

And we should be able to see the documentation we wrote!

Sharing Your Package

You can easily share your custom package with GitHub.

  • make your package a repository on GitHub.

Then, you (and others) can install your package using

renv::install("yourGitHubUsername/packageName")

If you want your package to be available on CRAN, follow this guide (it takes more work and maintenance).

Package Summary

If you write a package, first read R Packages (2e).

Packages are great for

  • documentation
  • flexible code used for multiple projects
  • sharing code with others

What I use Packages For

  1. “Helper” Package for most projects
    • easy to document and test functions
    • easier for coauthors to understand what my functions do
  2. “Analysis” Packages
    • for code that is flexible and reusable
    • e.g. a VAR identification package
  3. “Data” Packages

Summary

Lecture Summary

  • Storing Functions
    • Single file
    • Multiple files
    • In a package
  • Packages
    • Development
    • Sharing

Live Coding Example

  • Create a package locally
  • Add the crra() function
  • Add documentation
  • Run document() and check()
  • Install and load the package in another project

Coding Exercise

  • Create a temporary project
  • Set up the folder structure shown in these slides
  • Write a function in helpers.R
    • source() the function and use it in main.R
  • Change to a helpers/ folder, write two separate function scripts
    • Use list.files() and lapply() to source() all of the functions