This post was featured on the R Weekly highlights podcast hosted by Eric Nantz and Mike Thomas.
I’m currently refactoring test files in a package. Beside some automatic refactoring, I am also manually updating lines of code. Here are some tips (or pet peeves, based on how I look at it / how tired I am 😁)
Prequel: please read the R packages book
The new edition of the R Packages book by Hadley Wickham and Jenny Bryan features three chapters on testing, all well worth a read. The “High-level principles for testing” in particular are good to keep in mind.
Use informative variable or function names
If your test is
test_that("function_maelle_does_not_know_well() works", {
q <- function_maelle_does_not_know_well(1:10)
ab <- function(x) {
y <- exp(x) + 42 - 7
round(log(y / 50))
}
expect_equal(q, ab(6))
})
you might want to rename q
and ab
to something more informative. Note that this is a fictional example, it does not make any mathematical sense.
test_that("function_maelle_does_not_know_well() works", {
transformed_vector <- function_maelle_does_not_know_well(1:10)
crude_transformation_approximation <- function(x) {
y <- exp(x) + 42 - 7
round(log(y / 50))
}
expect_equal(transformed_vector, crude_transformation_approximation(6))
})
Imagine it will be me reading your test file, and I know nothing about your use case. More seriously, that’s one of the numerous aspects code review can help with.
Vary variable names within tests
To prevent copy-pasting mistakes and reading fatigue,
test_that("awesome_function() works", {
x <- awesome_function(1:10, method = "easy")
expect_typeof(x, "double")
expect_gt(x, 5)
x <- awesome_function(1:10, method = "medium")
expect_typeof(x, "double")
expect_gt(x, 15)
x <- awesome_function(1:10, method = "hard")
expect_typeof(x, "double")
expect_gt(x, 25)
})
is in my opinion better as
test_that("awesome_function() works", {
easy_output <- awesome_function(1:10, method = "easy")
expect_typeof(easy_output, "double")
expect_gt(easy_output, 5)
medium_output <- awesome_function(1:10, method = "medium")
expect_typeof(medium_output, "double")
expect_gt(medium_output, 15)
hard_output <- awesome_function(1:10, method = "hard")
expect_typeof(hard_output, "double")
expect_gt(hard_output, 25)
})
I do not hate it when several tests use the same variable names though.
Put the object as close as possible to the related expectation(s)
Imagine I made the code chunk from above
test_that("awesome_function() works", {
easy_output <- awesome_function(1:10, method = "easy")
medium_output <- awesome_function(1:10, method = "medium")
hard_output <- awesome_function(1:10, method = "hard")
expect_typeof(easy_output, "double")
expect_gt(easy_output, 5)
expect_typeof(medium_output, "double")
expect_gt(medium_output, 15)
expect_typeof(hard_output, "double")
expect_gt(hard_output, 25)
})
I much prefer the version
test_that("awesome_function() works", {
easy_output <- awesome_function(1:10, method = "easy")
expect_typeof(easy_output, "double")
expect_gt(easy_output, 5)
medium_output <- awesome_function(1:10, method = "medium")
expect_typeof(medium_output, "double")
expect_gt(medium_output, 15)
hard_output <- awesome_function(1:10, method = "hard")
expect_typeof(hard_output, "double")
expect_gt(hard_output, 25)
})
because reading the expectations doesn’t require me to go too far back or to remember too much.
In some cases, if the object creation is simple, and it is used in one expectation only, why not put it inside the expectation rather than having to assign it to an object with an informative name?
the_log <- log(x)
expect_gt(the_log, 1)
vs.
expect_gt(log(x), 1)
Split the tests as needed, navigate them seamlessly
Splitting the tests (into several test_that()
) means each one is more focused, easier to read, easier to investigate when failing (tests tend to fail).
In RStudio IDE, each test_that()
calls is shown in the script outline (the table of contents on the right) so you can hop from one to the other at will.
Use a bug summary, not number, as test name
Instead of
test_that("Bug #666 is fixed", {
...
})
I prefer
test_that("awesome_function() does not remove the ordering", {
# <link-to-bug-report>
...
})
because:
- I can understand what the test is about without clicking the link;
- If your repository moved for instance from a monorepo to a regular repository, the hyperlink added by RStudio IDE might no longer be valid.
Conclusion
In this post I summarized my testing preferences du jour. Do you have any tips for refactoring test scripts?