First commit or initial commit?
When I create a new git repository, my first commit message tends to be “1st commit”. I’ve been wondering what other people use as their initial commit message. Today I used the gh package to get the first commits of all repositories of the ropensci and ropenscilabs organizations.
Bar bar plots but not Babar plots
You might have heard of the “bar bar plots” movement, whose goal is to prevent using (let’s use ggplot2 language, shall we) geom_bar when you could have used e.g. geom_boxplot or geom_histogram. The reason is that a bar plot hides the variability of the distributions you say you’re comparing, even if you add some sort of error bar. I whole-heartedly agree with this movement, but in this post I’d like to present some OK bar plots that represent counts of individuals in different categories. You might know them as geom_bar(blabla, stat = "identity") or geom_col. They’re still bar plots and they’re great, in particular when you make puns out of them, which is what I did with Miles McBain.
A plot against the CatterPlots complot
In these terrible times, we R people have more important subjects to debate and care about than ggplot2 vs. base R graphics (which isn’t even worth discussing anyway, since ggplot2 is clearly the best alternative). Or so I thought until I saw CatterPlots trending on Twitter this week and even being featured on the Revolutions blog. It was cool because plots with cats are cool, but looking more closely at the syntax of CatterPlots, I couldn’t help but realize it was probably a complot to make us all like base R graphics syntax again! So let me show you how to make a cute plot with the awesome ggplot2 extension emojifont.
Extracting notable deaths from Wikipedia
I like Wikipedia. My husband likes it even more: he included it in his PhD thesis acknowledgements! I appreciate the effort put into sharing knowledge, and also the apparently random stuff you can find on the website. In particular, I’ve been intrigued by the monthly lists of notable deaths such as this one. Who are the people (or dogs, yes, dogs) whose lives were deemed notable enough to be listed there? Also, using the numbers of such deaths, can I judge whether 2016 was really worse than previous years? The first step in answering these questions was to scrape the data. I’ll describe the process in this post. In another post I’ll have a look at my study population, and in a third post I’ll analyse the time series of death counts.
Were there more notable deaths than expected in 2016?
After exploring my study population of Wikipedia deaths, I want to analyse the time series of monthly counts of notable deaths. This is not a random interest of mine: my PhD thesis was about monitoring time series of counts, the application being weekly numbers of reported cases of various diseases.