Maëlle's R blog

Not a fish

Pets or livestock? Naming your RMarkdown chunks

Today I made a confession on Twitter: I told the world I had spent my whole career not naming chunks in RMarkdown documents, even though I had said one should name them when teaching RMarkdown. But it was also a tweet for showing off, since I was working on my first manuscript with named chunks and loving it. I got some interesting reactions to my tweet, including one that made me feel better about myself (sorry Thomas), and others that made me want to spell out why one should name RMarkdown chunks.

Hadley Wickham asked whether chunks were pets or livestock, as in his analogy for models. Livestock chunks are identified by numbers rather than names, that is, by their position in the document. I now think we have good reason to treat chunks as pets and here’s why…

Who are the Swedish radio P1 summer guests? Answer via Wikidata

This week, a very promising new R blog was launched, namely the blog of Eric Persson, a.k.a. expersso on Twitter. I had really been looking forward to this because expersso’s code screenshots have always been quite cool, so seeing him no longer limited to them is awesome! His first article series is about a game, you should really check it out. (PSA: if you post screenshots of R code on Twitter, have a look at Sean Kross’ codefinch package!)

Because I’m a nosy person I asked Eric whether he was Swedish, his last name being quite Swedish-looking in my opinion. He is, which made me wonder about Swedish blog topics, and I actually decided to use one I came up with: the summer guests of the Swedish radio P1! Every summer since 1959, P1 has selected a bunch of famous or interesting people and had each of them record a program a bit over an hour long, where they’re free to discuss what they want (important events of their life for instance) and to choose the musical breaks (which you don’t get to hear in full in the online version because of copyright stuff). The program is then broadcast in the summer, one guest a day from the end of June to the beginning of August. There’s even a winter version now but I’ll ignore it because it’s too hot here in Barcelona to even think about winter.

It’s a very cool radio program in my opinion. I discovered it at the end of my 5-month research stay in Gothenburg in 2010 and decided it’d be one way to keep my Swedish skills up to date (my other methods include listening to ABBA in Swedish and reading Camilla Läckberg’s novels). I haven’t listened to that many guests but I really enjoy it when I do, and I like how diverse the list of guests is. In this post, I’ll actually try to have a look at the occupations of the guests via Wikidata!

What's in our internal chaimagic package at work

At my day job I’m a data manager and statistician for an epidemiology project called CHAI, led by Cathryn Tonne. CHAI means “Cardio-vascular health effects of air pollution in Telangana, India” and you can find out more about it in our recently published protocol paper. At my institute you could also find the PASTA and TAPAS projects so apparently epidemiologists are good at naming things, or obsessed with food… But back to CHAI! This week Sean Lopp from RStudio wrote an interesting blog post about internal packages. I liked reading it and feeling good because we do have an internal R package for CHAI! In this blog post, I’ll explain what’s in there, in the hope of maybe providing inspiration for your own internal package!


As posted in this tweet, this pic represents the Barcelona contingent of CHAI, a really nice group to work with! We have other colleagues in India obviously, but also in the US.

How I became a crolute, i.e. a user of the crul package

A few months ago rOpenSci’s Scott Chamberlain asked me for feedback about a new package of his called crul, an HTTP client like httr, so basically something you use for e.g. writing a package interfacing an API. He told me that a great thing about crul was that it supports asynchronous requests. I felt utterly uncool because I had no idea what this meant, although I had already written quite a few API packages (for instance ropenaq, riem and opencage).

So I googled the concept, my mind was blown and I decided that I’d trust Scott’s skills (spoiler: you can always do that) and just replace the httr dependency of ropenaq with crul. Why? First of all, note that Crul is a planet in Star Wars whose male inhabitants are called crolutes, which sounds quite cool (there are female ones as well, called gilliands, which doesn’t sound like the package name) and which I now use as a synonym for “user of the crul package”. But I had other reasons to switch… that was the subject of my lightning talk today at the French R conference in Anglet. In this blog post I’ll tell the story again, in a bit more detail, in the hope of making you curious about crul!


Pic by ThinkR, thanks Colin/Diane/Vincent!

Automatic tools for improving R packages

On Tuesday I gave a talk at a meetup of the R users group of Barcelona. I got to choose the topic of my talk, and decided I’d like to expand a bit on a recent tweet of mine. There are tools that help you improve your R packages, and some of them are not famous enough yet in my opinion, so I was happy to help spread the word! I published my slides online but thought that a blog post would be nice as well.

Who is talking about the French Open?

I don’t think rOpenSci’s Jeroen Ooms can ever top the coolness of his magick package but I have to admit other things he’s developed are not bad at all. He’s recently been working on interfaces to Google’s Compact Language Detectors 2 and 3 (the latter being more experimental). I saw this cool use case and started thinking about other possible applications of the packages.

I was very sad when I realized it was too late to try and download tweets about the Eurovision Song Contest, but then I also remembered there’s this famous tennis tournament going on right now, about which people probably tweet in various languages. I don’t follow the French Open myself, but it seemed interesting to find out which languages were the most prevalent, whether the results from the cld2 and cld3 packages agree with each other, and whether they agree with the language detection results from Twitter itself.

Which science is all around? #BillMeetScienceTwitter

I’ll admit I didn’t really know who Bill Nye was before yesterday. His name sounds a bit like Bill Nighy’s, that’s all I knew. But well, science is all around, and quite often scientists on Twitter start interesting campaigns. Remember the #actuallylivingscientists, whose animals I dedicated a blog post to? This time, the Twitter campaign is the #BillMeetScienceTwitter hashtag, with which scientists introduce themselves to the famous science TV host Bill Nye. Here is a nice article about the movement.

Since I like surfing on Twitter trends, I decided to download a few of these tweets and to use my own R interface to the Monkeylearn machine learning API, monkeylearn (part of the rOpenSci project!), to classify the tweets in the hope of finding the most represented science fields. So, which science is all around?

How not to make an evergreen review graph

In this post I am inspired by two tweets, mainly this one and also this one. Since the total number of articles every year is increasing, no matter which subject you choose, the curve representing number of articles as a function of year of publication will probably look exponential, so one should not use such graphs to impress readers. At least I’m not impressed, I’m more amused by such graphs now that there’s a hashtag for them.

I shall use an rOpenSci package to get some data on the number of articles matching a query term, and to draw a graph that’s not an evergreen review graph!
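
To make the point about ever-growing totals concrete, here is a minimal sketch in base R. The numbers are invented for illustration only (they are not real publication counts), and the column names query_hits and all_hits are hypothetical. It shows one possible way to avoid the evergreen effect, namely looking at the share of articles matching the query instead of the raw count; it is not necessarily what the post itself does.

```r
# Toy numbers invented for illustration only, not real publication counts.
counts <- data.frame(
  year = 2010:2014,
  query_hits = c(100, 110, 121, 133, 146),        # articles matching a hypothetical query
  all_hits = c(10000, 11000, 12100, 13300, 14600) # all articles indexed that year
)

# The raw count goes up every single year, which is what produces
# the impressive-looking "evergreen" curve.
counts$raw_growth <- c(NA, diff(counts$query_hits))

# Dividing by the yearly total gives the share of articles about the query,
# which is the quantity that says something about the topic itself.
counts$share <- counts$query_hits / counts$all_hits

print(counts[, c("year", "query_hits", "raw_growth", "share")])
```

With these made-up numbers the raw count rises every year while the share stays flat, so the apparent boom is entirely explained by the growth of the literature as a whole.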

A tribute to Lucy D'Agostino McGowan's git commit emoji game

Do you know Lucy? She is a very talented biostatistics PhD candidate whom I had the chance to e-meet thanks to R-Ladies. One maybe superficial reason to admire her, on top of her other achievements, is her emoji game in git commits. Looking at Lucy’s git history (find her on GitHub), one wants to start using version control because she makes it look fun!

In this post, I will download many of Lucy’s git commit messages from GitHub’s API via the gh package, and have a look at the emojis she uses most frequently.