Remember my blog post about automatic tools for improving R packages? One of these tools is Jim Hester’s lintr, a package that performs static code analysis. In my experience it mostly helps identify overly long code lines and missing spaces, although it’s a bit more involved than that. In any case, lintr helps you maintain good code style, and as mentioned in that now old post of mine, you can add a lintr unit test to your package, which will ensure you don’t get lazy over time.
Now say your package has a lintr unit test and lives on GitHub. What happens if someone makes a pull request and writes looong code lines? Continuous integration builds will fail, but not only that… The contributor will get to know Lintr Bot, lintr’s Hester (Easter) egg!
Recently I needed to count lines of code for a project at work work (this is an expression of the person honored in this post), and happened to discover that Bob Rudis had started an R package wrapping the Perl CLOC script. Of course! He has packages for a lot of things! And he’s always ready to help: after I asked him a question about the package and made a pull request to update the CLOC script it wraps, he made it all pretty and ready to go!
He himself defined his Stack Overflow Driven-Development (SODD) workflow in a blog post: someone will ask him a question on Stack Overflow, and he’ll write a long answer that eventually becomes a package, which may or may not make it to CRAN… And that is the motivation for this blog post. How can I output a list of all the packages Bob has on GitHub?
Do you know Lucy? She is a very talented biostatistics PhD candidate whom I had the chance to e-meet thanks to R-Ladies. One maybe superficial reason to admire her, on top of her other achievements, is her emoji game in git commits. Looking at Lucy’s git history (find her on GitHub), one wants to start using version control because she makes it look fun!
In this post, I will download many of Lucy’s git commit messages from GitHub’s API via the gh package, and have a look at the emojis she uses most frequently.
When you do simulations, for instance in R, e.g. drawing samples from a distribution, it’s best to set a random seed via the function set.seed in order to have reproducible results. Its seed argument has no default value. I think I mostly use set.seed(1). Last week I received an R script from a colleague in which he used a weird number in set.seed (maybe a phone number? or maybe he let his fingers type randomly?), which made me curious about the usual seed values. As in my blog post about initial commit messages, I used the GitHub API via the gh package to get a very rough answer (an answer seedling from the question seed?).
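As a reminder of what the seed buys you, here is a minimal sketch using only base R. The seed value 1 and the variable names are arbitrary choices for illustration, not taken from anyone's script.

```r
# Draw the same "random" sample twice by resetting the seed before each draw.
set.seed(1)
first_draw <- rnorm(5)

set.seed(1)
second_draw <- rnorm(5)

# Same seed, same generator state, hence the same numbers.
same_after_reset <- identical(first_draw, second_draw)

# Without resetting the seed, the generator simply moves on,
# so the next draw is a different sample.
third_draw <- rnorm(5)
same_without_reset <- identical(first_draw, third_draw)

print(same_after_reset)
print(same_without_reset)
```

Any integer works as a seed; what matters for reproducibility is that the value is written down in the script, not which value it is.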
When I create a new git repository, my first commit message tends to be “1st commit”. I’ve been wondering what other people use as their initial commit message. Today I used the gh package to get the first commits of all repositories of the ropensci and ropenscilabs organizations.