Maëlle's R blog

Showcase of my (mostly R) work/fun

Spookify: Halloween Name Generation in R

It’s October, time for spooky Twitter names! If you’re on this social media platform, you might have noticed some of your friends switching their names to something spooky and punny. Last year I was “Maelstrom Salmon”, which I find scary but is arguably not that funny. Anyhow, what if you want to switch your name but have no inspiration? In this post, we shall explore how R can help us with that, using webscraping, phonetic spelling and string distance algorithms, and the magic of randomness!

O'Reilly animals in trouble? Conservation status of book covers

What can a kaka, a kakapo, a European rabbit and a grey heron have in common? Well, they might cohabit on the bookshelf of an R user, since they’re all animals on the covers of popular R books: “R Packages”, “R for Data Science”, “Text mining with R” and “Efficient R programming”, respectively. Their publisher, O’Reilly, has based its brand on covers featuring beautiful engravings of animals.

Recently, while wondering what the name of the R for Data Science bird was again (I thought it was a kea!), I was thrilled to find the whole O’Reilly menagerie, i.e. a list of books and corresponding animals! The website also features a link to “A short history of the O’Reilly animals”, which was an amazing read. It noted that “The animals are in trouble.”, with a few examples of endangered species. It inspired me to actually try and assess the conservation status of O’Reilly animals using responsible webscraping, taxonomic name resolving and IUCN Redlist API querying…

ALLSTATisticians in decline? A polite look at ALLSTAT email Archives

Until recently I was subscribed to an email list, ALLSTAT, “A UK-based worldwide e-mail broadcast system for the statistical community, operated by ICSE for HEA Statistics.”, created in 1998. That’s how I saw the ad for my previous job in Barcelona! Now, I dislike emails more and more so I unsubscribed, but I’d still check out the archives any time I need a job, since many messages are related to openings. Nowadays, I probably identify more as a research software engineer or data scientist than a statistician… which made me wonder, when did ALLSTAT start featuring data scientist jobs? How does their frequency compare to that of statistician jobs?

In this post, I’ll webscrape and analyse metadata of ALLSTAT emails. It’ll also be the occasion for me to take the wonderful new polite package, which helps with respectful webscraping, for a ride!

Where to get help with your R question?

Last time I blogged, I offered my obnoxious helpful advice for blog content and promotion. Today, let me again be the agony aunt you didn’t even write to! Imagine you have an R question, i.e. a question related to how you can do something with R, and your search engine efforts haven’t been too successful: where should you ask it to increase your chances of getting it answered? You could see this post as my future answer to stray suboptimal Twitter R questions, or as a more general version of Scott Chamberlain’s excellent talk about how to get help related to rOpenSci software in the 2017-03-07 rOpenSci comm call.

I think that the general journey to getting answers to your R questions is first to try your best to find answers locally in the documentation of R, then to use a search engine, and then to post a well-formulated question somewhere. My post is aimed at helping you find that somewhere. Note that it’s considered best practice to ask in one somewhere at a time, and to then move on to another somewhere if you haven’t got any answer… or if someone kindly redirects you to a better venue!

Get on your soapbox! R blog content and promotion

This year I had the chance to speak at two R-Ladies meetups (I might have invited myself to those meetups to make the most of being in town: create your own happiness!), one in Cape Town in March, one in Seattle in May. It was a blast both times! I gave the same talk twice, and decided it was about time to write it up.

My talk in March was aimed at pairing, well tripleting, with Marie Dussault’s talk about setting up your blogdown website and Stephanie Kovalchik’s talk about sports blogging, so it is not about these topics. Instead, it is my non-data-driven, quite personal view on blog content and promotion, which hopefully features some useful tips for any wannabe R blogger!