The title is clickbait, of course. This is the logical sequel to my other post on how to create virtual conference videos.
I’ve read Anna Sophie Kümpel’s Twitter thread on the tips and resources that she finds “particularly helpful during manuscript preparation.”
This blog tends to outlive social media services. To future-proof this post, I need to describe her Twitter thread: she provides some helpful tips on how to prepare an APA-compliant ICA paper. For example, she suggests using an APA Microsoft Word template, using reference management software such as Zotero, checking statistical reporting with statcheck.io, and sharing data and materials. Other tips include using the Academic Phrasebank and having a “proof buddy.”
Don’t get me wrong: her tips are incredibly useful! I wish someone had shared these tips with me while I was in grad school. For example, I made the terrible decision to write my PhD thesis from scratch in a word processor (while my MSc thesis was prepared using \(\LaTeX\)). I will tell you the reason for that later in this article. For now, let’s call this reason X. I still remember the pain. A member of my thesis committee, having seen so many formatting issues in my thesis, questioned why my computation-intensive PhD research was not reported using \(\LaTeX\) or a similar system.
Half a decade after I handed in my formatting-issues-ridden, prepared-with-Microsoft-Word PhD thesis, here is how I prepare my ICA papers. Since last year, I’ve been using this method for papers over which I have more autonomy, i.e. those where I am the first author. With it, I don’t need to care about the format. A big plus is that my papers will surely be computationally reproducible.
I also have experience in “reverse preparation”: creating a computationally reproducible paper from a Microsoft Word document. But that’s a story for another time. Here are my 3 tips.
Writing the paper in RMarkdown looks like a no-brainer, as R is THE computational engine. But it is not as straightforward as one might think.
Let’s study the possible counterfactuals. One counterfactual is to keep using a word processor like Microsoft Word and paste the results from R into the Word document. That’s actually the path I took for my PhD thesis. I see no reason to suffer that pain again.
Another counterfactual is to use \(\LaTeX\). It is a common choice for political scientists, i.e. people around me who are not communication researchers. Online collaboration tools such as Overleaf are available. It is also possible to mix R with \(\LaTeX\) using Sweave. The reason for not making this counterfactual factual is that I find myself fiddling with \(\LaTeX\) formatting more often than creating actual content. Also, \(\LaTeX\)’s markup is very visible. That is related to X.
Other counterfactuals are available. But RMarkdown has the right combination of ease and sophistication. It is now fairly easy to set up an RMarkdown editing environment. One only needs to install R, RStudio, and a few R packages. The required \(\LaTeX\) engine can also be easily installed using tinytex.
As in one of Anna Sophie Kümpel’s tips, you will also need an APA template. My recommendation is papaja. As of writing, it is not yet on CRAN. But it is fairly easy to install from GitHub.
The great thing about papaja is that it has an extremely comprehensive manual.
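The setup described above can be sketched in a few lines of R. This is a minimal sketch, assuming R and RStudio are already installed and that papaja still lives in the crsh/papaja GitHub repository; the choice of the remotes package for the GitHub install is my assumption, not something this post prescribes.

```r
# One-off setup. Assumes R itself is already installed.

# Packages from CRAN: rmarkdown for rendering, tinytex for the LaTeX engine,
# remotes for installing packages from GitHub.
install.packages(c("rmarkdown", "tinytex", "remotes"))

# Install the TinyTeX LaTeX distribution.
# Skip this line if a LaTeX distribution is already present on your machine.
tinytex::install_tinytex()

# Install papaja from GitHub (repository name assumed to be crsh/papaja).
remotes::install_github("crsh/papaja")
```

After this, RStudio should offer the papaja APA template when you create a new RMarkdown file from a template.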
As the document is now RMarkdown, it is pretty easy to mix code with content. Since the R code that generates the results section is included in the RMarkdown file, no transcription is involved. This eliminates one big source of human error, so there is no need for statcheck.io.
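To make the “no transcription” point concrete, here is a small sketch of what a code chunk in the results section might contain. It uses R’s built-in sleep data purely for illustration; the analysis itself is hypothetical and not taken from any of my papers.

```r
library(papaja)

# Built-in example data, used here only for illustration.
test <- t.test(extra ~ group, data = sleep)

# apa_print() converts the test object into APA-formatted text.
res <- apa_print(test)

# A character string with the full APA-style result.
res$full_result
```

In the running text of the RMarkdown file, one would then refer to res$full_result in an inline R expression instead of typing the numbers by hand, so the reported statistics always match the analysis.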
The paper is now fully reproducible and can be exported to .tex, .pdf, or Microsoft Word. However, I usually assume the end product is .tex -> .pdf. So I write equations in \(\LaTeX\) too.
The bibliography is also taken care of by \(\mathrm{B{\scriptstyle{IB}} \! T\!_{\displaystyle E} \! X}\). I maintain a (largish) \(\mathrm{B{\scriptstyle{IB}} \! T\!_{\displaystyle E} \! X}\) library. You can maintain yours using Zotero, the tool that Anna Sophie Kümpel also recommends. You probably don’t need to care about the tools that I use to maintain my \(\mathrm{B{\scriptstyle{IB}} \! T\!_{\displaystyle E} \! X}\) library [1].
During the writing process, I cite from my main large \(\mathrm{B{\scriptstyle{IB}} \! T\!_{\displaystyle E} \! X}\) library. But this arrangement can’t be permanent, because I can’t share my whole library with every paper. When I am ready to share the document, I extract the entries cited in the RMarkdown document using the R package condensebib. This process generates a small \(\mathrm{B{\scriptstyle{IB}} \! T\!_{\displaystyle E} \! X}\) file which I can share together with the RMarkdown file.
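To show what “condensing” means, here is a base-R sketch of the idea. It is not condensebib’s implementation or API; the function names cited_keys() and condense_bib() and the citation keys are hypothetical, and the sketch assumes every entry starts on its own line in the form @type{key,.

```r
# Collect every pandoc-style citation key (e.g. [@smith2020] or @doe2021).
cited_keys <- function(rmd_lines) {
  hits <- regmatches(rmd_lines, gregexpr("@[A-Za-z0-9_:-]+", rmd_lines))
  unique(sub("^@", "", unlist(hits)))
}

# Keep only the BibTeX entries whose key appears in `keys`.
condense_bib <- function(bib_lines, keys) {
  starts <- grepl("^@", bib_lines)
  entry_id <- cumsum(starts)  # which entry each line belongs to
  entry_key <- sub("^@[A-Za-z]+\\{([^,]+),.*$", "\\1", bib_lines[starts])
  keep <- entry_id > 0 & entry_key[pmax(entry_id, 1)] %in% keys
  bib_lines[keep]
}

# Toy manuscript and toy library, both made up for this example.
rmd <- c("Prior work [@smith2020] shows this.", "See also @doe2021.")
bib <- c(
  "@article{smith2020,", "  title = {A},", "}",
  "@book{unused1999,", "  title = {B},", "}",
  "@article{doe2021,", "  title = {C},", "}"
)

small <- condense_bib(bib, cited_keys(rmd))
writeLines(small)
```

The uncited entry is dropped, and the remaining lines are what would go into the small file shared alongside the RMarkdown file. A real tool has to handle messier input (multi-line headers, @string definitions, e-mail addresses in the text), which is a good reason to use the package rather than this sketch.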
So far so good. But let’s talk about X.
The main reason my PhD thesis was in MS Word format was collaboration. I wanted my supervisors to be able to comment on my thesis in a single source of truth, so my thesis lived on Google Docs.
Following only the tips above would certainly shut out my collaborators. More often than not, my collaborators are not as tech-savvy as I am. This is the reason X I was talking about.
There is no service like Overleaf for RMarkdown [2]. The knee-jerk response to the collaboration question is to use git and GitHub. RMarkdown is plain text. It’s ideal for version control, right? But if my collaborators are not as tech-savvy, how can they handle git?
If I were patient enough, I would teach my collaborators to use git. But I am actually not patient. If I needed to do this teaching, I would rather go back to the painful path of not using RMarkdown.
Here, I need to make an assumption: the collaborators who can’t handle git actually care more about creating textual content than running code. I think it’s a reasonable assumption to make. Also, they are comfortable writing in Google Docs, not in editors such as RStudio.
So, a reasonable compromise is to use trackdown. The idea looks stupid but it works incredibly well in practice. The basic idea is to share the main textual content of an RMarkdown file on Google Docs. The code chunks and YAML are hidden. My collaborators can then edit the Google Doc as they would any regular Google Doc. They can also use the regular Google Docs functions such as commenting and suggesting. I can pull their changes into my local RMarkdown file using the function trackdown::download_file(), and push my own local changes to the Google Doc using trackdown::update_file().
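The round trip can be sketched as three small wrappers. This is a sketch based on my reading of the trackdown documentation, in which upload_file() creates the Google Doc, download_file() brings the Google Doc’s content into the local file, and update_file() pushes local changes to the Google Doc. The wrapper names and the file name paper.Rmd are hypothetical, and running any of them requires authorising trackdown with a Google account.

```r
# Hypothetical file name; replace with your own manuscript.
manuscript <- "paper.Rmd"

# First time only: create the Google Doc, hiding code chunks and YAML.
share_draft <- function(file = manuscript) {
  trackdown::upload_file(file = file, hide_code = TRUE)
}

# Pull collaborators' edits from the Google Doc into the local .Rmd.
pull_edits <- function(file = manuscript) {
  trackdown::download_file(file = file)
}

# Push local changes to the existing Google Doc.
push_edits <- function(file = manuscript) {
  trackdown::update_file(file = file, hide_code = TRUE)
}
```

A typical session would be share_draft() once, then pull_edits() whenever collaborators have finished a round of editing, and push_edits() after changing the analysis locally.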
Here is where the lightweight markup of RMarkdown is incredibly useful: it’s virtually transparent and the shared Google Doc looks like a regular one. trackdown also supports Sweave, but I think my collaborators would be more confused by “\section*{Introduction}” than by “# Introduction”.
I always hand in the PDF file generated from the RMarkdown file, never a Word document. A modern enough version of MS Word can actually open and edit the PDF file. To hide your identity for anonymous peer review, simply set “mask” to “yes” in the YAML section of the RMarkdown file.
After handing in, I make sure my RMarkdown file is reproducible from the data. Then I condense the bib file and toss in a Makefile so that the whole pipeline can be reproduced by typing make. After that, everything (the RMarkdown file, the condensed BibTeX file, the Makefile, the data) can go to GitHub or OSF for dissemination.
Now you know how I prepare my ICA papers. But it will neither improve your papers’ chance of being accepted, save you time, nor make you a better collaborator. So follow the above tips with care.
[1] If you insist: I use helm-bibtex. I also inject BibTeX entries into my BibTeX library using biblio.el by simply entering their DOI. The BibTeX entries produced always have a DOI, which almost all communication journals require nowadays. As these tools live inside Emacs, I said you don’t need to bother.
[2] Although collaborative editing tools for regular Markdown files, such as HackMD, exist, they are not very useful for RMarkdown.