I’m currently on a quest to better know and understand treesitter-based tooling for R. To make it short, treesitter is a tool for parsing code, for instance recognizing what is a function, an argument, a logical in a string of code. With tools built upon treesitter you can search, reformat, lint and fix, etc. your code. Exciting stuff, running locally and deterministically on your machine.
Speaking of “etc.”, Etienne Bacher helpfully suggested I also look at treesitter-based tooling for other languages to see what’s still missing in our ecosystem. This is how I stumbled upon difftastic by Wilfred Hughes, “a structural diff tool that understands syntax”. ✨ This means that difftastic doesn’t only compare lines or “words” but actual syntax, by looking at lines around the lines that changed (by default, 3). Even better, it understands R out of the box¹.
Many thanks to Etienne Bacher not only for making me discover difftastic but also for useful feedback on this post!
Installing difftastic
To install difftastic I downloaded a binary file for my system from the releases of the GitHub repository, as documented in the manual.
difftastic on two files
You can run difftastic on two files, a bit like you would use the waldo R package on two objects.
Let’s compare:
a <- gsub("bad", "good", x)
to
a <- stringr::str_replace(x, "bad", "good")
respectively saved in old.R and new.R.
The CLI is called difft not difftastic.
I use the “inline” display rather than the default two-column display in order to save horizontal space.
difft old.R new.R --display inline
We’d get to this nice looking diff:
The parentheses and the "bad" and "good" arguments are not highlighted, since difftastic treats them as unchanged.
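To try the comparison above yourself, here is a minimal sketch. It assumes a POSIX shell and works in a temporary directory (my choice, so that no real file gets overwritten). It only calls difft if the binary is found on the PATH.

```shell
# Work in a throwaway directory (an assumption, to avoid touching real files).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the two versions from the post.
printf '%s\n' 'a <- gsub("bad", "good", x)' > old.R
printf '%s\n' 'a <- stringr::str_replace(x, "bad", "good")' > new.R

# Run difftastic only if the difft binary is on the PATH.
if command -v difft >/dev/null 2>&1; then
  difft --display inline old.R new.R
else
  echo "difft not found; install it from the GitHub releases first"
fi
```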
We can also get the JSON version of this diff. This is an unstable feature, so using it requires setting an environment variable:
export DFT_UNSTABLE=yes
difft old.R new.R --display json
This gets us
{"aligned_lines":[[0,0],[1,1]],"chunks":[[{"lhs":{"line_number":0,"changes":[{"start":5,"end":9,"content":"gsub","highlight":"normal"},{"start":23,"end":24,"content":",","highlight":"normal"},{"start":25,"end":26,"content":"x","highlight":"normal"}]},"rhs":{"line_number":0,"changes":[{"start":5,"end":12,"content":"stringr","highlight":"normal"},{"start":12,"end":14,"content":"::","highlight":"keyword"},{"start":14,"end":25,"content":"str_replace","highlight":"normal"},{"start":26,"end":27,"content":"x","highlight":"normal"},{"start":27,"end":28,"content":",","highlight":"normal"}]}}]],"language":"R","path":"content/post/2026-03-26-difftastic/new.R","status":"changed"}
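To give an idea of what one could do with that output, here is a quick sketch that lists the highlighted tokens. It works on a saved copy of the JSON shown above, so it runs without difftastic. The grep pattern is a shortcut that relies on this particular output containing no escaped quotes; a JSON-aware tool such as jq would be more robust.

```shell
# The JSON printed in the post, saved to a temporary file.
# The quoted heredoc delimiter prevents any shell expansion.
json_file=$(mktemp)
cat > "$json_file" <<'EOF'
{"aligned_lines":[[0,0],[1,1]],"chunks":[[{"lhs":{"line_number":0,"changes":[{"start":5,"end":9,"content":"gsub","highlight":"normal"},{"start":23,"end":24,"content":",","highlight":"normal"},{"start":25,"end":26,"content":"x","highlight":"normal"}]},"rhs":{"line_number":0,"changes":[{"start":5,"end":12,"content":"stringr","highlight":"normal"},{"start":12,"end":14,"content":"::","highlight":"keyword"},{"start":14,"end":25,"content":"str_replace","highlight":"normal"},{"start":26,"end":27,"content":"x","highlight":"normal"},{"start":27,"end":28,"content":",","highlight":"normal"}]}}]],"language":"R","path":"content/post/2026-03-26-difftastic/new.R","status":"changed"}
EOF

# Pull out every "content" value: first the old side (lhs), then the new side (rhs).
# Not a real JSON parser, just pattern matching on this specific output.
tokens=$(grep -o '"content":"[^"]*"' "$json_file" | cut -d'"' -f4)
printf '%s\n' "$tokens"
```

This prints the eight highlighted tokens, one per line: gsub, the comma and x for the old file, then stringr, ::, str_replace, x and the comma for the new file.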
Now, none of this is very useful because I would never compare files in this way… I use version control!
difftastic with Git
We can set difftastic as the external diff tool for Git globally or for the current project.
For instance with the gert R package, to set it locally:
gert::git_config_set("diff.external", "difft")
If I want to use the inline display I’d set:
gert::git_config_set("diff.external", "difft --display inline")
Then git diff will by default use difftastic.
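If you’d rather not go through R, the same setting can be made with plain git config. The sketch below uses a throwaway repository (an assumption on my side, so that it changes no real project) and reads the value back to check it.

```shell
# Throwaway repository so the sketch does not change any real project.
repo=$(mktemp -d)
git init --quiet "$repo"

# Project-level setting, stored in the repository's .git/config.
git -C "$repo" config diff.external "difft --display inline"

# Read the value back to confirm what Git will run.
git -C "$repo" config --get diff.external

# For a user-wide setting, the same call takes --global (commented out here):
# git config --global diff.external "difft --display inline"
```

To get back to Git’s built-in diff for a single command, there is git diff --no-ext-diff.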
Most interestingly for me, git show --ext-diff will use difftastic.
I never use git diff directly but I do look at more or less recent commits a lot.
Say I am interested in the commit that removed roxygen2’s dependency on stringi, I’ll run:
git show 7a1dd39866699a2b0a034bb15244c07698a1e2e7 --ext-diff
and get:
This isn’t spectacular because this is a small diff, but I enjoy the highlighting of the parentheses of the removed nested call, and of the logical.
Cool features of difftastic
Building on two examples from the difftastic homepage…
Ignoring formatting changes
Since formatters can so helpfully apply your formatting preferences, reviewing formatting changes in a patch that’s about something else entirely is useless and annoying. Imagine having a function definition that fits on a single line, then adding one argument to it.
Going from
f <- function(myarg1 = foo, myarg2 = bar) {}
to
f <- function(
  myarg1 = foo,
  myarg2 = bar,
  myarg3 = baz
) {}
Because the definition is now longer than 80 characters, your formatter might switch the definition to be on multiple lines. But the actually interesting change is the addition of one argument.
Native Git diff² would show:
Git with difftastic would show:
The matching of delimiters is why I found difftastic’s display of the roxygen2 commit more pleasing.
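To compare both renderings of the argument example on your own machine, something like the following should work. The file names old-args.R and new-args.R are the ones I used; the temporary directory and the two-space indentation are assumptions of this sketch.

```shell
# Work in a throwaway directory (an assumption, to avoid touching real files).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# One-line definition before the change.
printf '%s\n' 'f <- function(myarg1 = foo, myarg2 = bar) {}' > old-args.R

# Multi-line definition after adding one argument (two-space indent assumed).
cat > new-args.R <<'EOF'
f <- function(
  myarg1 = foo,
  myarg2 = bar,
  myarg3 = baz
) {}
EOF

# Line-based Git diff outside of any repository.
# git diff exits with status 1 when the files differ, hence "|| true".
line_diff=$(git diff --no-index --no-ext-diff old-args.R new-args.R || true)
printf '%s\n' "$line_diff"

# Structural diff, only if the difft binary is on the PATH.
if command -v difft >/dev/null 2>&1; then
  difft --display inline old-args.R new-args.R
fi
```

The line-based diff marks the whole old line as removed and all five new lines as added, which is exactly the noise that hides the one added argument.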
Matching delimiters in wrappers
The Git diff can look a bit ugly when you simply move code from one function to another.
Say we go from
f <- function() {
  1 + 1
}
to
f <- function() {
  g()
}
g <- function() {
  1 + 1
}
Git diff would show:
Whereas Git with difftastic would show:
Will I use difftastic?
I really like the concept behind difftastic and the few Git commits I looked at with it rendered nicely. Now, what’s missing for me to use difftastic a lot is its integration with the tools where I actually use Git:
- Positron including the GitLens extension;
- GitHub Pull Request Files tab.
In any case, I’ll continue learning about tools based on treesitter, some of which like Air and Jarl I can already use directly from my IDE. 😸
1. It’s not every day we R developers look at the homepage of a tool and see the R logo among the logos of other languages!
2. To get the diff that Git would show me I ran git diff --no-index old-args.R new-args.R --no-ext-diff, a cool trick I didn’t know about! Very glad I didn’t have to create a fake Git repo just for this. (--no-ext-diff because my diff in this repo would use difftastic by default!)