Making Open Code Citable
Your code or software should be published in an appropriate form, so that others can cite it. In brief:
- In a code repository such as GitHub or GitLab (without DOI)
- In a journal's subject-specific repository for data and code (with DOI)
- In a journal dedicated to code packages and software (with DOI)
- On platforms such as Zenodo and Figshare (with DOI)
- On the Software Heritage platform (with SWHID)
In all cases, you will need to supply metadata, and the required fields differ from option to option. The second and third options follow the classic research publication process, in which a DOI and a citation are provided as a matter of course.
If you use GitHub or GitLab, each code change or version ("commit") can be referenced or cited via a unique URL. GitLab is also available as a free open-source version for self-hosting. Important versions, such as certain milestones, should be labelled as a release.
You can publish a release on Zenodo or Figshare, where it receives a DOI and a suggested citation based on the metadata you specified. GitHub already has a Zenodo integration, so you can easily publish a release there. How this works is described in GitHub Docs: Referencing and citing content. If you publish another release from the same GitHub repository on Zenodo, a new version of the existing Zenodo entry is created automatically. Every release receives its own DOI, and all releases are grouped under a main DOI.
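As a rough illustration of the commit-level URLs mentioned above, the following Python sketch builds the web address of a single commit. It assumes the public instances github.com and gitlab.com and a full 40-character SHA-1 commit hash; the function name `commit_url`, the project path, and the hash in the usage note are hypothetical placeholders, and a self-hosted GitLab would use its own host name.

```python
def commit_url(host: str, project_path: str, commit_sha: str) -> str:
    """Build the web URL of one commit (hypothetical helper, not an official API).

    host         -- "github" or "gitlab" (public instances assumed)
    project_path -- e.g. "owner/repo" or "group/project"
    commit_sha   -- full 40-character hexadecimal commit hash
    """
    sha = commit_sha.strip().lower()
    # Short hashes can become ambiguous over time, so only full hashes are accepted.
    if len(sha) != 40 or any(c not in "0123456789abcdef" for c in sha):
        raise ValueError("expected a full 40-character hexadecimal commit hash")
    if host == "github":
        return f"https://github.com/{project_path}/commit/{sha}"
    if host == "gitlab":
        # GitLab places project views under the "/-/" path segment.
        return f"https://gitlab.com/{project_path}/-/commit/{sha}"
    raise ValueError(f"unsupported host: {host}")
```

For example, `commit_url("github", "example-owner/example-repo", "0123456789abcdef0123456789abcdef01234567")` yields a URL ending in `/commit/0123456789abcdef0123456789abcdef01234567`. Such a URL identifies the commit precisely, but it remains tied to the hosting platform, which is why a DOI or SWHID is preferable for long-term citation.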
An alternative to DOI-based citation is offered by Software Heritage. By archiving a repository on Software Heritage, you receive a Software Hash Identifier (SWHID), that is, an intrinsic, content-based persistent identifier that is computed from the actual content of the code. Unlike a GitHub or GitLab URL, a SWHID is not tied to a platform and identifies exactly what is being referenced. SWHIDs can point to an entire repository snapshot, a specific directory, a single file, or even a defined range of lines within a file. This makes them particularly useful when precise, granular citation of code is required.
As a rule, citation suggestions are generated from the metadata you provide. You can also put citation information in a README file, but readers and tools may overlook it there. A more reliable way to state how your work should be cited is the Citation File Format (CFF): a CITATION.cff file stores human- and machine-readable citation information for code and software (and also datasets). GitHub and Zenodo take this information into account automatically. The cffinit tool is available to help you create a CITATION.cff file.
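As a sketch of what such a file contains, the following Python snippet renders a minimal CITATION.cff as text. It assumes CFF schema version 1.2.0, whose required keys are `cff-version`, `message`, `title`, and `authors`. The function name `render_citation_cff` and all values in the usage note (title, author) are hypothetical placeholders. For real projects, a dedicated tool such as cffinit is the safer route.

```python
import json


def render_citation_cff(title, authors, version=None, doi=None):
    """Render a minimal CITATION.cff document as a string (sketch).

    authors -- list of (family_name, given_name) tuples
    version, doi -- optional; omitted from the output when None
    """
    # JSON string literals are valid YAML double-quoted scalars,
    # so json.dumps is used here for safe quoting.
    q = json.dumps
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it as below."',
        f"title: {q(title)}",
        "authors:",
    ]
    for family, given in authors:
        lines.append(f"  - family-names: {q(family)}")
        lines.append(f"    given-names: {q(given)}")
    if version is not None:
        lines.append(f"version: {q(version)}")
    if doi is not None:
        lines.append(f"doi: {q(doi)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder metadata for illustration only.
    print(render_citation_cff("Example Tool", [("Doe", "Jane")], version="1.0.0"))
```

Saving the returned text as `CITATION.cff` in the root of the repository is what makes it discoverable. Once a release has been archived and has a DOI, the `doi` argument can be filled in for subsequent versions.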