Gaston Sanchez
- Naming functions
- Technicalities of arguments
- Documenting a function
- Good practices
- Choose meaningful names of functions
- Preferably a verb
- Think about the users (who will use your functions)
- Be consistent with your naming style
Avoid this:
f <- function(x, y) {
x + y
}
This is better:
add <- function(x, y) {
x + y
}
Functions can have any number of arguments (even zero arguments)
# function with 2 arguments
add <- function(x, y) {
z <- x + y
return(z)
}
# function with no arguments
hi <- function() {
print("Hi there!")
}
What happens when you call hi()?
hi()
## [1] "Hi there!"
What happens when you pass an input to hi()?
# be careful
hi('hello')
## Error in hi("hello"): unused argument ("hello")
Sometimes it is better to give default values to arguments:
hey <- function(x = "") {
cat("Hey", x, "\nHow is it going?")
}
hey()
## Hey
## How is it going?
hey("Gaston")
## Hey Gaston
## How is it going?
If you specify an argument with no default value, you must give it a value every time you call the function; otherwise you’ll get an error:
sqr <- function(x) {
y <- x^2
return(y)
}
# be careful
sqr()
## Error in sqr(): argument "x" is missing, with no default
Sometimes you don’t want to give default values, but you also don’t want to cause an error. We can use missing() to see if an argument is missing:
abc <- function(a, b, c = 3) {
if (missing(b)) {
result <- a * 2 + c
} else {
result <- a * b + c
}
return(result)
}
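To see both branches of abc() in action, here is a small sketch of a few calls. The function is repeated so the snippet runs on its own; the input values are arbitrary choices for illustration.

```r
# abc() repeated from above so this snippet is self-contained
abc <- function(a, b, c = 3) {
  if (missing(b)) {
    result <- a * 2 + c
  } else {
    result <- a * b + c
  }
  return(result)
}

abc(1)         # b is missing:  1 * 2 + 3 = 5
abc(1, 4)      # b is supplied: 1 * 4 + 3 = 7
abc(1, c = 0)  # b is still missing: 1 * 2 + 0 = 2
```

Note that missing(b) only tells you whether the caller supplied b; it says nothing about the value itself.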
You can also set an argument value to NULL if you don’t want to specify a default value:
abcd <- function(a, b = 2, c = 3, d = NULL) {
if (is.null(d)) {
result <- a * b + c
} else {
result <- a * b + c * d
}
return(result)
}
Notice that the function abcd() can be written as:
abcd <- function(a, b = 2, c = 3, d = NULL) {
if (is.null(d)) {
return(a * b + c)
} else {
return(a * b + c * d)
}
}
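A short sketch of how the NULL default behaves when calling abcd(). The function is repeated so the snippet runs on its own, and the input values are arbitrary.

```r
# abcd() repeated from above so this snippet is self-contained
abcd <- function(a, b = 2, c = 3, d = NULL) {
  if (is.null(d)) {
    return(a * b + c)
  } else {
    return(a * b + c * d)
  }
}

abcd(1)           # d is NULL:   1 * 2 + 3 = 5
abcd(1, d = 2)    # d is given:  1 * 2 + 3 * 2 = 8
abcd(2, 3, 4, 5)  # all given:   2 * 3 + 4 * 5 = 26
```

Unlike missing(), the NULL approach makes the "not specified" state visible in the function signature, and a caller can also pass d = NULL explicitly.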
Consider the following plotting function myplot():
# arguments with and without default values
myplot <- function(x, y, col = "#3488ff", pch = 19) {
plot(x, y, col = col, pch = pch)
}
myplot(1:5, 1:5)
myplot() has four arguments:
- x and y have no default values
- col and pch have default values (but they can be changed)
# changing default values
myplot <- function(x, y, col = "#4286f4", pch = 20) {
plot(x, y, col = col, pch = pch)
}
myplot(1:5, 1:5)
Arguments can be classified into two main groups. The labels below are this tutorial’s informal shorthand; in a call, R lets you match any argument by name or by position:
- arguments with default values are known as named arguments.
- arguments with no default values are referred to as positional arguments.
Here’s an example of a function containing both types of arguments:
omg <- function(pos1, pos2, name1 = 1, name2 = 2) {
(pos1 + name1) * (pos2 + name2)
}
omg() has four arguments:
- pos1 is a positional argument
- pos2 is a positional argument
- name1 is a named argument
- name2 is a named argument
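The following sketch shows a few ways of calling omg(). The function is repeated so the snippet runs on its own; the input values and the variable name err are arbitrary choices for illustration.

```r
# omg() repeated from above so this snippet is self-contained
omg <- function(pos1, pos2, name1 = 1, name2 = 2) {
  (pos1 + name1) * (pos2 + name2)
}

omg(1, 2)                # defaults used: (1 + 1) * (2 + 2) = 8
omg(1, 2, name1 = 3)     # override one default: (1 + 3) * (2 + 2) = 16
omg(pos2 = 2, pos1 = 1)  # arguments without defaults can be named too: 8

# leaving out pos2 raises an error, because it has no default value;
# tryCatch() captures the error message instead of stopping the script
err <- tryCatch(omg(1), error = function(e) conditionMessage(e))
err
```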
Arguments can be matched positionally or by name:
values <- seq(-2, 1, length.out = 20)
# equivalent calls
mean(values)
mean(x = values)
mean(x = values, na.rm = FALSE)
mean(na.rm = FALSE, x = values)
mean(na.rm = FALSE, values)
Named arguments can also be partially matched:
# equivalent calls
seq(from = 1, to = 2, length.out = 5)
seq(from = 1, to = 2, length = 5)
seq(from = 1, to = 2, len = 5)
length.out is partially matched with length and len
mean(c(NA, 1:9), na.rm = TRUE)
# saving typing
mean(c(NA, 1:9), na.rm = T)
# saving typing but dangerous
mean(c(NA, 1:9), na = T)
# Generally you don't need to name all arguments
mean(x = c(NA, 1:9), na.rm = TRUE)
# unusual orders best avoided
mean(na.rm = TRUE, x = c(NA, 1:9))
mean(na = T, c(NA, 1:9))
# Don't need to supply defaults
mean(x = c(NA, 1:9), na.rm = FALSE)
# Need to remember too much about mean()
mean(x = c(NA, 1:9), , TRUE)
# Don't abbreviate too much
mean(c(NA, 1:9), n = T)
f <- function(a = 1, abcd = 1, abdd = 1) {
print(a)
print(abcd)
print(abdd)
}
# what will happen?
f(a = 5)
f(ab = 5)
## Error in f(ab = 5): argument 1 matches multiple formal arguments
f(abc = 5)
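To make the outcome of these calls easier to inspect, here is a sketch with a hypothetical variant g() that returns its arguments as a named vector instead of printing them. R tries exact name matches first, then unique partial matches, and only then matches by position.

```r
# g() is a hypothetical variant of f() that returns its arguments
g <- function(a = 1, abcd = 1, abdd = 1) {
  c(a = a, abcd = abcd, abdd = abdd)
}

g(a = 5)    # exact match on a:            a = 5, abcd = 1, abdd = 1
g(abc = 5)  # unique partial match (abcd): a = 1, abcd = 5, abdd = 1
g(abd = 5)  # unique partial match (abdd): a = 1, abcd = 1, abdd = 5

# "ab" is a prefix of both abcd and abdd, so the call is ambiguous
# and R raises an error; tryCatch() turns it into a simple flag here
ambiguous <- tryCatch(g(ab = 5), error = function(e) "error")
ambiguous
```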
Give meaningful names to arguments:
# Avoid this
area_rect <- function(x, y) {
x * y
}
This is better
area_rect <- function(length, width) {
length * width
}
Even better: give default values (whenever possible)
area_rect <- function(length = 1, width = 1) {
length * width
}
Avoid this:
# what does this function do?
ci <- function(p, r, n, ti) {
p * (1 + r/n)^(ti * n)
}
This is better:
compound_interest <-
function(principal, rate, periods, time) {
principal * (1 + rate/periods)^(time * periods)
}
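With descriptive argument names, a call documents itself. A minimal sketch, with the function repeated so the snippet runs on its own; the amounts and rates are made-up values for illustration.

```r
# compound_interest() repeated from above so this snippet is self-contained
compound_interest <- function(principal, rate, periods, time) {
  principal * (1 + rate/periods)^(time * periods)
}

# 100 at 10% compounded once per year for 2 years: 100 * 1.1^2, about 121
compound_interest(principal = 100, rate = 0.1, periods = 1, time = 2)

# 1000 at 5% compounded monthly for 10 years: roughly 1647
compound_interest(principal = 1000, rate = 0.05, periods = 12, time = 10)
```

Compare this with a call such as ci(1000, 0.05, 12, 10), where the reader has to look up the definition to know what each number means.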
There are two main functions for generating warnings and errors:
- stop()
- warning()
- There’s also the stopifnot() function
Use stop() to stop the execution (this will raise an error):
meansd <- function(x, na.rm = FALSE) {
if (!is.numeric(x)) {
stop("x is not numeric")
}
# output
c(mean = mean(x, na.rm = na.rm),
sd = sd(x, na.rm = na.rm))
}
Use warning() to show a warning message:
meansd <- function(x, na.rm = FALSE) {
if (!is.numeric(x)) {
warning("non-numeric input coerced to numeric")
x <- as.numeric(x)
}
# output
c(mean = mean(x, na.rm = na.rm),
sd = sd(x, na.rm = na.rm))
}
A warning is useful when you don’t want to stop the execution, but you still want to show potential problems.
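As a sketch of what a caller sees, the snippet below runs the warning() version of meansd() on character input. The function is repeated so the snippet runs on its own; the input vector and the names msg and res are arbitrary choices for illustration.

```r
# warning() version of meansd() repeated so this snippet is self-contained
meansd <- function(x, na.rm = FALSE) {
  if (!is.numeric(x)) {
    warning("non-numeric input coerced to numeric")
    x <- as.numeric(x)
  }
  c(mean = mean(x, na.rm = na.rm),
    sd = sd(x, na.rm = na.rm))
}

# tryCatch() intercepts the warning and returns its message
msg <- tryCatch(meansd(c("1", "2", "3")),
                warning = function(w) conditionMessage(w))
msg  # "non-numeric input coerced to numeric"

# suppressWarnings() silences the warning and lets the function finish
res <- suppressWarnings(meansd(c("1", "2", "3")))
res  # mean = 2, sd = 1
```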
stopifnot() ensures the truth of expressions:
meansd <- function(x, na.rm = FALSE) {
stopifnot(is.numeric(x))
# output
c(mean = mean(x, na.rm = na.rm),
sd = sd(x, na.rm = na.rm))
}
meansd('hello')
## Error: is.numeric(x) is not TRUE
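stopifnot() accepts several conditions at once, and stops at the first one that is not TRUE. The sketch below adds two extra checks that are assumptions for illustration, not part of the original example: that x has more than one value, and that na.rm is logical.

```r
# meansd() with several checks; the extra conditions are illustrative
meansd <- function(x, na.rm = FALSE) {
  stopifnot(is.numeric(x), length(x) > 1, is.logical(na.rm))
  c(mean = mean(x, na.rm = na.rm),
    sd = sd(x, na.rm = na.rm))
}

meansd(c(1, 2, 3))  # all checks pass: mean = 2, sd = 1

# each of these calls fails one check; tryCatch() turns the error into a flag
bad_type   <- tryCatch(meansd("hello"), error = function(e) "stopped")
bad_length <- tryCatch(meansd(5), error = function(e) "stopped")
```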
So far the examples that you’ve seen in these tutorials are fairly simple. Moreover, they appear in a somewhat raw format. However, you should strive to always include documentation for your functions. What does this mean? Documenting a function involves adding descriptions for the purpose of the function, the inputs it accepts, and the output it produces.
- Description: what the function does
- Input(s): what are the inputs or arguments
- Output: what is the output (returned value)
You can find some inspiration in the help() documentation when you search for a given function.
Documentation outside the function
# Description: calculates the area of a rectangle
# Inputs
# length: numeric value
# width: numeric value
# Output
# area value
area_rect <- function(length = 1, width = 1) {
length * width
}
Documentation inside the function’s body
area_rect <- function(length = 1, width = 1) {
# Description: calculates the area of a rectangle
# Inputs
# length: numeric value
# width: numeric value
# Output
# area value
length * width
}
Documentation with roxygen comments (good for packaging purposes)
#' @title Area of Rectangle
#' @description Calculates the area of a rectangle
#' @param length numeric value
#' @param width numeric value
#' @return area (i.e. product of length and width)
#' @examples
#' area_rect()
#' area_rect(length = 5, width = 2)
#' area_rect(width = 2, length = 5)
area_rect <- function(length = 1, width = 1) {
length * width
}
- Don’t write long functions
- Rewrite long functions by converting collections of related expressions into separate functions
- A function often corresponds to a verb of a particular step or task in a sequence of tasks
- Functions form the building blocks for larger tasks
- Write functions so that they can be reused in different settings.
- When writing a function, think about different scenarios and contexts in which it might be used
- Can you generalize it?
- Avoid hard coding values that the user might want to provide. Make them default values of new parameters.
- Make the actions of the function as few as possible, or allow the user to turn off some via logical parameters
- Always test the functions you’ve written
- Even better: let somebody else test them for you
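One lightweight way to test a function is to write down a few calls whose answers you know in advance and check them with stopifnot(). This is only a sketch using area_rect() from the earlier examples; the chosen test values are arbitrary.

```r
# area_rect() repeated from above so this snippet is self-contained
area_rect <- function(length = 1, width = 1) {
  length * width
}

# each condition must be TRUE, otherwise stopifnot() raises an error
stopifnot(
  area_rect() == 1,                        # default values
  area_rect(5, 2) == 10,                   # positional matching
  area_rect(width = 2, length = 5) == 10,  # matching by name
  area_rect(0, 7) == 0                     # edge case: zero length
)
```

If every check passes, the call returns silently; a failing check reports which expression was not TRUE.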
Separate small functions:
- are easier to reason about and manage
- clearly identify what they do
- are easier to test and verify they are correct
- are more likely to be reusable as they each do less and so you can pick the functions that do specific tasks
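As an illustration of splitting a task into small functions, the sketch below rescales a vector to the interval from 0 to 1. All three function names are hypothetical and chosen only for this example.

```r
# distance between the largest and smallest value
value_range <- function(x) {
  max(x) - min(x)
}

# shift values so that the smallest one becomes zero
shift_to_zero <- function(x) {
  x - min(x)
}

# combine the two small steps into the larger task
rescale01 <- function(x) {
  shift_to_zero(x) / value_range(x)
}

rescale01(c(2, 4, 6))  # 0.0 0.5 1.0
```

Each helper can be tested on its own, and value_range() or shift_to_zero() can be reused in other functions.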
- Make functions parameterizable
- Allow the user to specify values that might be computed in the function
- This facilitates testing and avoids recomputing the same thing in different calls
- Use a default value to do those computations that would be in the body of the function
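A sketch of this idea: in the hypothetical function standardize() below, the center and the spread are parameters whose default values perform the computation that would otherwise sit in the body. The function and argument names are assumptions made for this example.

```r
# defaults are evaluated inside the function, so they can refer to x
standardize <- function(x, center = mean(x), spread = sd(x)) {
  (x - center) / spread
}

# defaults compute the mean (2) and standard deviation (1) of x
standardize(c(1, 2, 3))  # -1 0 1

# the user supplies precomputed or alternative values instead
standardize(c(1, 2, 3), center = 0, spread = 1)  # 1 2 3
```

Supplying center and spread explicitly makes the function easy to test with known values, and avoids recomputing them when the same values are reused across calls.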