How Research is Made Reproducible
The aim of reproducible data analysis is to document and communicate the analysis in such a way that other researchers can easily understand the procedure and reproduce the results. Even small steps help to achieve this. Working reproducibly also saves effort for the person who first carried out the analysis and benefits them directly.
You can make your research reproducible by taking the following steps:
- Prepare a study plan or protocol in advance. Carry out a pre-registration or submit a Registered Report.
- Choose reproducible tools and materials. The use of open source tools can promote reproducibility.
- Set up a reproducible project: Centralise and organise your project management by using an online platform, a central repository or a folder for all research files. For example, you can store all your project files on GitHub. Follow best practices in your central project by keeping your data and your code in separate folders. Make your raw data “read-only” and save it separately from processed data.
- Organise your data, files and folders: Use naming conventions for your files, build folder trees with a consistent, scalable structure, separate raw data from processed data and so on.
- Learn the basics of version control, even if your own research does not require programming knowledge: Being able to restore a specific version of a document that was written over a period of several years can be very valuable.
- Automate recurring tasks: This makes your results more reliable and makes writing scientific articles easier, because you can vary parameters more easily.
- Automate data processing and analysis workflows: Write scripts to process your data and to manage your work steps. For example, avoid using spreadsheets for large datasets.
- Document your code and your data: Even if something is clear while you are working on it, it can be unclear two months later, even if you wrote it yourself. Choose open source solutions to achieve more transparency and guaranteed access.
- Learn about programming: Try using Jupyter Notebooks or other approaches to programming in order to integrate your code with your comments and documentation.
- Share and license your research: Make your data available via a repository. For software, notebooks and containers, license your code so that others know how it can be (re)used.
- Make your research transparent: Describe and publish your methods and procedures clearly, transparently and fully, in order to enable replication.
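Several of the steps above (a consistent folder tree, read-only raw data, scripted processing and easy variation of parameters) can be sketched in a few lines of Python. This is only a minimal illustration using the standard library. The folder names, the file `measurements.csv`, the column `value` and the `threshold` parameter are hypothetical choices made for this example, not a prescribed layout.

```python
import csv
import stat
import tempfile
from pathlib import Path


def create_project(root: Path) -> dict[str, Path]:
    """Create a small folder tree that keeps data and code apart.

    The folder names are an assumption for this sketch.
    """
    names = ("data/raw", "data/processed", "code", "docs")
    folders = {name: root / name for name in names}
    for folder in folders.values():
        folder.mkdir(parents=True, exist_ok=True)
    return folders


def make_read_only(path: Path) -> None:
    """Remove write permission so the raw file is not changed by accident."""
    path.chmod(stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)


def process(raw_file: Path, out_file: Path, threshold: float) -> int:
    """Write rows whose 'value' is at least `threshold`; return how many were kept."""
    kept = 0
    with raw_file.open(newline="") as src, out_file.open("w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            if float(row["value"]) >= threshold:
                writer.writerow(row)
                kept += 1
    return kept


def run_demo() -> dict:
    """Build a throwaway project, protect the raw data and process it twice."""
    with tempfile.TemporaryDirectory() as tmp:
        folders = create_project(Path(tmp) / "my_project")

        # Hypothetical raw data, written once and then protected.
        raw = folders["data/raw"] / "measurements.csv"
        raw.write_text("id,value\n1,0.2\n2,0.7\n3,0.9\n")
        make_read_only(raw)

        # The same script is rerun with different parameters;
        # the raw file is only ever read, never modified.
        kept = {}
        for threshold in (0.5, 0.8):
            out = folders["data/processed"] / f"filtered_{threshold}.csv"
            kept[threshold] = process(raw, out, threshold)

        raw_writable = bool(raw.stat().st_mode & stat.S_IWUSR)

        # Restore write permission so the temporary folder can be
        # cleaned up on every platform.
        raw.chmod(stat.S_IRUSR | stat.S_IWUSR)
        return {"kept": kept, "raw_writable": raw_writable}


if __name__ == "__main__":
    print(run_demo())
```

Because every processed file is derived from the untouched raw file by a script, changing a parameter means rerunning the script rather than editing data by hand.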
You can find additional and more detailed tips on how to make your research reproducible here:
- The Turing Way, Guide for Reproducible Research
- Step-by-step guide to achieving computational reproducibility using R and Markdown
- The paper “A Realistic Guide to Making Data Available Alongside Code to Improve Reproducibility” offers a practical, feasible guide on how to share data alongside research
- “Practical Steps Towards Open and Reproducible Research” from the Helmholtz Open Science Office