Cover and modify, some tips for R package development
I’ve recently been dealing with legacy code refactoring both in theory and in practice: while continuing some work on the igraph R package, I’ve started reading Working Effectively with Legacy Code by Michael Feathers, which had been in my to-read pile for months. In this post, I’ll summarize some ideas from both the book and my work. “Cover and modify” with “characterization tests”: When you start modifying your rusty code, how do you ensure you do not inadvertently break existing and important behaviour?
Create and use a custom roxygen2 tag
This post was featured on the R Weekly highlights podcast hosted by Eric Nantz and Mike Thomas. You might know that it’s possible to extend roxygen2 to do all sorts of cool things including but not limited to: documenting your internal functions for developers only (that’s devtag by my cynkra colleague Antoine Fabri), recording your compliance with statistical software standards (that’s srr by my rOpenSci colleague Mark Padgham), writing tests from within R scripts (that’s roxytest by Mikkel Meyer Andersen).
Does my test really validate a bug fix? Check it with git cherry-pick
Earlier this year I wrote a post about git worktree, which allows you to load different versions of an R package at once on your computer. To keep with the “juggle between versions of a codebase with Git plant-related commands” theme, let me show you how I use cherry-pick to assess the quality of a unit test. Scenario: you fix a bug in a branch. In a perfect world, the bug you’re working on is paired with a ✨ reprex ✨, and you even start your work by adding a test case for it.
Get your codebase lint-free forever with lintr
This post was featured on the R Weekly highlights podcast hosted by Eric Nantz and Mike Thomas. Writing good code is hard. Some aspects get easier with experience – although I observe that I consistently forget some things. 🙈 Other aspects can be tackled through code review – although your reviewer’s time will be better spent on design questions than on nitpicks. 💅 Static code analysis can help with code quality.
The current introduction to my package development workshops
I somewhat regularly teach about package development. One recent example was a workshop for rOpenSci champions. I am improving my teaching over time (thankfully 😅) but one thing I have down by now is the intro, which is mostly my throwing together my favorite quotes about R package development! Let me write it up. Where I explain why people shouldn’t flee the workshop: After boasting a bit (a.k.a. sharing my package development creds to introduce myself), I answer three rhetorical questions:
The real reset button for local mess from tests: withr::deferred_run()
This post was featured on the R Weekly highlights podcast hosted by Eric Nantz and Mike Thomas. Following last week’s post on my testing workflow enhancements, Jenny Bryan kindly reminded me of the existence of an actual reset button when you’ve been interactively running tests that include some “local mess”: withr::local_envvar(), withr::local_dir(), usethis::local_project()… The reset button is withr::deferred_run(). It is documented in Jenny’s article about test fixtures: Since the global environment isn’t perishable, like a test environment is, you have to call deferred_run() explicitly to execute the deferred events.
Two recent enhancements to my testing workflow
I spend a lot of quality time with testthat, which sometimes deigns to praise my code with emojis and sometimes has to encourage me. No one gets it right on their first try apparently? Anyway, in honor of the testthat 3.2.0 release 🎉 👏, I’d like to mention two small things that improved my testing workflow a whole lot! Running one single test at a time: Under testthat 3.2.0 minor features lies a small gem:
How to become a better R code detective?
Huge thanks to Hannah Frick for her useful feedback on this post! Vielen Dank! This post was featured on the R Weekly podcast by Eric Nantz. When trying to fix a bug or add a feature to an R package, how do you go from viewing the code as a big messy ball of wool, to a logical diagram that you can bend to your will? In this post, I will share some resources and tips on getting better at debugging and reading code, written by someone else (or yourself but long enough ago to feel foreign!
Lintr Bot, lintr's Hester egg
Remember my blog post about automatic tools for improving R packages? One of these tools is Jim Hester’s lintr, a package that performs static code analysis. In my experience it mostly helps identify overly long code lines and missing spaces, although it’s a bit more involved than that. In any case, lintr helps you maintain good code style, and as mentioned in that now old post of mine, you can add a lintr unit test to your package which will ensure you don’t get lazy over time.
Now say your package has a lintr unit test and lives on GitHub. What happens if someone makes a pull request and writes looong code lines? Continuous integration builds will fail but not only that… The contributor will get to know Lintr Bot, lintr’s Hester (Easter) egg!
How to develop good R packages (for open science)
I was invited to an exciting ecology & R hackathon in my capacity as a co-editor for rOpenSci’s onboarding system of packages. It also worked well geographically since this hackathon was to take place in Ghent (Belgium), which is not too far away from my new city, Nancy (France). The idea was to have me talk about my “top tips on how to design and develop high-quality, user-friendly R software” in the context of open science, and then be a facilitator at the hackathon.
The talk topic sounded a bit daunting but as soon as I started preparing the talk I got all excited gathering resources – and as you may imagine since I was asked to talk about my tips I did not need to try & be 100% exhaustive. I was not starting from scratch obviously: we at rOpenSci already have well-formed opinions about such software, and I had given a talk about automatic tools for package improvement whose content was part of my top tips.
As I’ve done in the past with my talks, I chose to increase the impact/accessibility of my work by sharing it on this blog. I’ll also share this post on the day of the hackathon to provide my audience with a more structured document than my slides, in case they want to keep some trace of what I said (and it helped me build a good narrative for the talk!). Most of these tips will be useful for package development in general, and a few of them specific to scientific software.
What's in our internal chaimagic package at work
At my day job I’m a data manager and statistician for an epidemiology project called CHAI led by Cathryn Tonne. CHAI means “Cardio-vascular health effects of air pollution in Telangana, India” and you can find more about it in our recently published protocol paper. At my institute you could also find the PASTA and TAPAS projects so apparently epidemiologists are good at naming things, or obsessed with food… But back to CHAI! This week Sean Lopp from RStudio wrote an interesting blog post about internal packages. I liked reading it and feeling good because we do have an internal R package for CHAI! In this blog post, I’ll explain what’s in there, in the hope of maybe providing inspiration for your own internal package!
As posted in this tweet, this pic represents the Barcelona contingent of CHAI, a really nice group to work with! We have other colleagues in India obviously, but also in the US.
How I became a crolute, i.e. a user of the crul package
A few months ago rOpenSci’s Scott Chamberlain asked me for feedback about a new package of his called crul, an HTTP client like httr, so basically something you use for e.g. writing a package interfacing an API. He told me that a great thing about crul was that it supports asynchronous requests. I felt utterly uncool because I had no idea what this meant although I had already written quite a few API packages (for instance ropenaq, riem and opencage).
So I googled the concept, my mind was blown and I decided that I’d trust Scott’s skills (spoiler: you can always do that) and just replace the httr dependency of ropenaq with crul. Why? First of all note that Crul is a planet in Star Wars whose male inhabitants are called crolutes, which sounds quite cool (there are female ones as well, called gilliands, which doesn’t sound like the package name) and which I now use as a synonym for “user of the crul package”. But I had other reasons to switch… that was the subject of my lightning talk today at the French R conference in Anglet. In this blog post I’ll tell the story again, with a bit more detail, in the hope of making you curious about crul!
Pic by ThinkR, thanks Colin/Diane/Vincent!
Automatic tools for improving R packages
On Tuesday I gave a talk at a meetup of the R users group of Barcelona. I got to choose the topic of my talk, and decided I’d like to expand a bit on a recent tweet of mine. There are tools that help you improve your R packages, and some of them are not famous enough yet in my opinion, so I was happy to help spread the word! I published my slides online but thought that a blog post would be nice as well.