Learning Objectives

If we only had one data set to analyze, it would probably be faster to load the file into a spreadsheet and use that to plot some simple statistics. But we have twelve files to check, and may have more in the future. In this lesson, we’ll learn how to write a function so that we can repeat several operations with a single command.

Defining a Function

Let’s start by defining a function convert_fahr_to_kelvin that converts temperatures from Fahrenheit to Kelvin:

convert_fahr_to_kelvin <- function(temp) {
  kelvin <- ((temp - 32) * (5 / 9)) + 273.15
  return(kelvin)
}

We define convert_fahr_to_kelvin by assigning it to the output of function. The list of argument names is contained within parentheses. Next, the body of the function (the statements that are executed when it runs) is contained within curly braces ({}). The statements in the body are indented by two spaces, which makes the code easier to read but does not affect how the code operates.

When we call the function, the values we pass to it are assigned to those variables so that we can use them inside the function. Inside the function, we use a return statement to send a result back to whoever asked for it.
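To illustrate how the values in a call are bound to the argument names, here is a minimal sketch. It redefines the function so that it runs on its own, then calls it once by position and once by name. The variable names arg_names, by_position and by_name are made up for this example; formals() is a base R function that lists a function's arguments.

```r
# Redefined here so the example runs on its own.
convert_fahr_to_kelvin <- function(temp) {
  kelvin <- ((temp - 32) * (5 / 9)) + 273.15
  return(kelvin)
}

# The argument names declared in the definition.
arg_names <- names(formals(convert_fahr_to_kelvin))

# Whether the value is passed by position or by name,
# it is assigned to temp inside the function.
by_position <- convert_fahr_to_kelvin(32)
by_name <- convert_fahr_to_kelvin(temp = 32)
```

Both calls should give the same result, because in each case 32 ends up in temp before the body runs.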


In R, it is not necessary to include the return statement. R automatically returns the value of the last expression evaluated in the body of the function. Since we are just learning, we will explicitly define the return statement.
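As a quick sketch of the difference, the two functions below perform the same conversion, one with an explicit return and one without. The names kelvin_explicit, kelvin_implicit and same_result are hypothetical and used only for this comparison.

```r
# Explicit return, the style used in this lesson.
kelvin_explicit <- function(temp) {
  kelvin <- ((temp - 32) * (5 / 9)) + 273.15
  return(kelvin)
}

# Implicit return: the value of the last evaluated expression
# becomes the result of the call.
kelvin_implicit <- function(temp) {
  ((temp - 32) * (5 / 9)) + 273.15
}

# Both versions compute the same value for the same input.
same_result <- identical(kelvin_explicit(212), kelvin_implicit(212))
```

One detail to be aware of: if the last line of the body is an assignment, the value is still returned, but R does not print it automatically when the function is called at the prompt.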

Let’s try running our function. Calling our own function is no different from calling any other function:

# freezing point of water
convert_fahr_to_kelvin(32)
## [1] 273.15
# boiling point of water
convert_fahr_to_kelvin(212)
## [1] 373.15

We’ve successfully called the function that we defined, and we have access to the value that we returned.

Composing Functions

Now that we’ve seen how to turn Fahrenheit into Kelvin, it’s easy to turn Kelvin into Celsius:

convert_kelvin_to_celsius <- function(temp) {
  celsius <- temp - 273.15
  return(celsius)
}

# absolute zero in Celsius
convert_kelvin_to_celsius(0)
## [1] -273.15

What about converting Fahrenheit to Celsius? We could write out the formula, but we don’t need to. Instead, we can compose the two functions we have already created:

fahr_to_celsius <- function(temp) {
  temp_k <- convert_fahr_to_kelvin(temp)
  result <- convert_kelvin_to_celsius(temp_k)
  return(result)
}

# freezing point of water in Celsius
x <- fahr_to_celsius(32.0)

This is our first taste of how larger programs are built: we define basic operations, then combine them in ever-larger chunks to get the effect we want. Real-life functions will usually be larger than the ones shown here (typically half a dozen to a few dozen lines), but they shouldn’t ever be much longer than that, or the next person who reads them won’t be able to understand what’s going on.
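Composition does not require a wrapper function: one call can be nested directly inside another. The sketch below redefines the three functions so that it is self-contained, then compares the nested form with the stepwise wrapper. The variable names nested and stepwise are illustrative only.

```r
convert_fahr_to_kelvin <- function(temp) {
  kelvin <- ((temp - 32) * (5 / 9)) + 273.15
  return(kelvin)
}

convert_kelvin_to_celsius <- function(temp) {
  celsius <- temp - 273.15
  return(celsius)
}

fahr_to_celsius <- function(temp) {
  temp_k <- convert_fahr_to_kelvin(temp)
  result <- convert_kelvin_to_celsius(temp_k)
  return(result)
}

# The inner call is evaluated first; its result becomes
# the argument of the outer call.
nested <- convert_kelvin_to_celsius(convert_fahr_to_kelvin(32))

# The wrapper does the same work in two named steps.
stepwise <- fahr_to_celsius(32)
```

The nested form is compact for two steps, while the wrapper gives the combined operation a name that can be reused and read at a glance.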

Challenge - Create a function

  • In the last lesson, we learned to concatenate elements into a vector using the c function, e.g. x <- c("A", "B", "C") creates a vector x with three elements. Furthermore, we can extend that vector again using c, e.g. y <- c(x, "D") creates a vector y with four elements. Write a function called fence that takes two vectors as arguments, called original and wrapper, and returns a new vector that has the wrapper vector at the beginning and end of the original:
best_practice <- c("Write", "programs", "for", "people", "not", "computers")
asterisk <- "***"  # R interprets a variable with a single value as a vector
                   # with one element.
fence(best_practice, asterisk)
  • If the variable v refers to a vector, then v[1] is the vector’s first element and v[length(v)] is its last (the function length returns the number of elements in a vector). Write a function called outside that returns a vector made up of just the first and last elements of its input:
dry_principle <- c("Don't", "repeat", "yourself", "or", "others")
outside(dry_principle)
## [1] "Don't"  "others"

The Call Stack

Let’s take a closer look at what happens when we call fahr_to_celsius(32). To make things clearer, we’ll start by putting the initial value 32 in a variable and storing the final result in one as well:

original <- 32
final <- fahr_to_celsius(original)

The diagram below shows what memory looks like after the first line has been executed:

Call Stack (Initial State)

When we call fahr_to_celsius, R doesn’t create the variable temp right away. Instead, it creates something called a stack frame to keep track of the variables defined by fahr_to_celsius. Initially, this stack frame only holds the value of temp:

Call Stack Immediately After First Function Call

When we call convert_fahr_to_kelvin inside fahr_to_celsius, R creates another stack frame to hold convert_fahr_to_kelvin’s variables:

Call Stack During First Nested Function Call

It does this because there are now two variables in play called temp: the argument to fahr_to_celsius, and the argument to convert_fahr_to_kelvin. Having two variables with the same name in the same part of the program would be ambiguous, so R (and every other modern programming language) creates a new stack frame for each function call to keep that function’s variables separate from those defined by other functions.
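A small sketch of this separation follows. It assumes a variable named temp also exists outside the function (the value 100 is arbitrary) and checks that the call neither changes it nor leaves the function's own variable kelvin behind. The names result, temp_unchanged and kelvin_visible are made up for the example.

```r
convert_fahr_to_kelvin <- function(temp) {
  kelvin <- ((temp - 32) * (5 / 9)) + 273.15
  return(kelvin)
}

# A variable called temp outside the function (arbitrary value).
temp <- 100

# Inside the call, temp refers to the argument (32), not to the 100 above.
result <- convert_fahr_to_kelvin(32)

# The outer temp is untouched by the call.
temp_unchanged <- identical(temp, 100)

# kelvin lived only in the function's stack frame, so it is not
# defined in the calling environment.
kelvin_visible <- exists("kelvin", inherits = FALSE)
```

Here temp_unchanged should be TRUE and kelvin_visible should be FALSE, since each call works with its own set of variables.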

When the call to convert_fahr_to_kelvin returns a value, R throws away convert_fahr_to_kelvin’s stack frame and creates a new variable in the stack frame for fahr_to_celsius to hold the temperature in Kelvin:

Call Stack After Return From First Nested Function Call
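The growth and shrinkage of the stack can also be observed from code. The sketch below uses the base R function sys.nframe(), which returns the number of the current frame; caller and callee are hypothetical helper functions written only for this illustration. The absolute numbers depend on how the code is run, so the example compares depths relative to each other.

```r
# Hypothetical helper: reports the frame number it runs in.
callee <- function() {
  sys.nframe()
}

# Hypothetical helper: records the depth before, during and after a nested call.
caller <- function() {
  depth_caller <- sys.nframe()  # frame created for caller()
  depth_callee <- callee()      # a new frame is created for callee()
  depth_after <- sys.nframe()   # callee()'s frame has been thrown away
  c(caller = depth_caller, callee = depth_callee, after = depth_after)
}

depths <- caller()
```

The callee entry should be one greater than the caller entry, and the after entry should equal the caller entry again, matching the picture of a frame being added for the nested call and discarded once it returns.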

It then calls convert_kelvin_to_celsius, which means it creates a stack frame to hold that function’s variables:

Call Stack During Call to Second Nested Function

Once again, R throws away that stack frame when convert_kelvin_to_celsius is done and creates the variable result in the stack frame for fahr_to_celsius: