Writing your own functions

Maybe you want to automate a process for which no existing function is available, or maybe you want to customize a function to your needs. R lets you write your own functions with function(), whose body defines the steps the function will perform. A function can be stored in an object like any other value, and the name of that object then serves as the name of your function.
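
As a minimal sketch of the syntax, the example below defines a one-step function and stores it in an object. The name square.it and its argument x are hypothetical, chosen only for illustration.

```r
# Hypothetical example: store a function in the object square.it
square.it <- function(x) {
  # the value of the last evaluated expression is returned
  x^2
}

# call the function through the name of the object that holds it
square.it(4)
# [1] 16
```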

Let’s create a function to calculate the median of a distribution. The median is the value that separates the sorted distribution in half. In a distribution with an odd number of elements, the median is the element at the center of the distribution once it is sorted (in either ascending or descending order).

odd.dist <- c(1, 3, 5, 7, 9)
length(odd.dist)/2
[1] 2.5
# because this is not an integer, we will round it upwards
odd.dist[ceiling(length(odd.dist)/2)]
[1] 5

This becomes a bit more difficult when the distribution has an even number of elements, because there is no single middle position. In that case, the median is the average of the two values on either side of the center. R's built-in median() function shows the expected result:

even.dist <- c(2, 4, 6, 8)
median(even.dist)  
[1] 5

Five is not an element of even.dist, but it is the mean of 4 and 6. Thus, a function to calculate the median will require four steps:

1. order the sequence
2. check whether the sequence has an even or odd number of elements
3a. if it is odd, select the number in the center position
3b. if it is even, calculate the mean of the numbers just before and after the center position

It is always useful to check that each step of your code runs correctly on its own. Let’s do this:

# Step 1: Order the sequence
new.seq <- c(1, 5, 3, 9, 7, 11)
sort(new.seq)
[1]  1  3  5  7  9 11
# Step 2: Odd or even?
length(even.dist)%%2 == 0  ## dividing the length of even.dist by 2 leaves no remainder
[1] TRUE
length(odd.dist)%%2 == 0
[1] FALSE
# Step 3a: if odd, select the number in the center position
odd.dist[ceiling(length(odd.dist)/2)]  ## round 2.5 up to position 3
[1] 5
# Step 3b: if even, calculate the mean between the numbers before 
## and after the center position
bef <- even.dist[length(even.dist)/2]
aft <- even.dist[1 + length(even.dist)/2]
(bef + aft)/2
[1] 5
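
Step 3a depends on rounding the center position upwards. The short sketch below illustrates why: when a fractional number is used as an index, R truncates it towards zero instead of rounding it, so position 2.5 is read as position 2. It reuses the odd.dist vector defined earlier.

```r
odd.dist <- c(1, 3, 5, 7, 9)

# a fractional index is truncated towards zero: 2.5 is treated as 2
odd.dist[length(odd.dist)/2]
# [1] 3

# ceiling() rounds 2.5 up to 3, the true center position
odd.dist[ceiling(length(odd.dist)/2)]
# [1] 5
```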

Now we can put all those steps together in a single function and use it to calculate the median of the new.seq and the even.dist objects.

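calculate.median <- function(x) {
  sorted <- sort(x)  ## Step 1: order the sequence (only once)
  n <- length(x)
  odd.even <- n%%2   ## Step 2: 0 if even, 1 if odd
  ## we are going to use if/else statements
  if (odd.even == 0) {
    ## Step 3b: mean of the two values around the center
    (sorted[n/2] + sorted[1 + n/2])/2
  } else {
    ## Step 3a: value in the center position
    sorted[ceiling(n/2)]
  }
}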

calculate.median(new.seq)
[1] 6
calculate.median(even.dist)
[1] 5
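
One way to gain confidence in a hand-written function is to compare it with a trusted reference. The sketch below repeats the function so that it runs on its own, then checks it against R's built-in median() on a few vectors. The test.cases list and the matches object are hypothetical names introduced here; the third vector is an unsorted odd-length case, and the fourth has a single element.

```r
calculate.median <- function(x) {
  sorted <- sort(x)
  n <- length(x)
  if (n%%2 == 0) {
    (sorted[n/2] + sorted[1 + n/2])/2
  } else {
    sorted[ceiling(n/2)]
  }
}

# hypothetical test vectors: even length, even length, unsorted odd length, single value
test.cases <- list(c(1, 5, 3, 9, 7, 11), c(2, 4, 6, 8), c(9, 1, 7, 3, 5), 42)

# TRUE wherever our function agrees with the built-in median()
matches <- sapply(test.cases, function(v) calculate.median(v) == median(v))
matches
# [1] TRUE TRUE TRUE TRUE
```

This comparison only covers numeric vectors without missing values; inputs containing NA are not handled by this simple version.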

Finally, you might want to save this function so that you can load it in another script without having to rewrite it. To save and load a function (or any other object), use:

# save() writes R's binary format, not a script, so a .RData extension is clearer than .R
save(calculate.median, file = "calculate.median.RData")

# and to load it in another session
load(file = "calculate.median.RData")
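
Note that save() stores objects in a binary format, so the resulting file cannot be read as a script. If a plain-text file is preferred, one possible alternative is the dump() and source() pair. The sketch below shows both round trips; it writes to temporary files created with tempfile() so that nothing is left in the working directory, and the object names rdata.file and script.file are hypothetical.

```r
calculate.median <- function(x) {
  sorted <- sort(x)
  n <- length(x)
  if (n%%2 == 0) {
    (sorted[n/2] + sorted[1 + n/2])/2
  } else {
    sorted[ceiling(n/2)]
  }
}

# Round trip 1: binary file with save() and load()
rdata.file <- tempfile(fileext = ".RData")
save(calculate.median, file = rdata.file)
rm(calculate.median)   # simulate a fresh session
load(rdata.file)
calculate.median(c(2, 4, 6, 8))
# [1] 5

# Round trip 2: plain-text R code with dump() and source()
script.file <- tempfile(fileext = ".R")
dump("calculate.median", file = script.file)  # dump() takes the object name as a string
rm(calculate.median)
source(script.file)
calculate.median(c(1, 3, 5, 7, 9))
# [1] 5
```

The file written by dump() contains ordinary R code, so it can be opened and edited in a text editor, whereas the file written by save() can only be restored with load().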