Connection Metadata: Creating Rich Linkages

The strategies above describe when and how to publish research outputs, but there’s another crucial aspect: connection metadata. This is structured, machine-readable information that formally links your research outputs together through their persistent identifiers.

When you “add the DOI of the publication to the dataset” (see Reserve DOI), you’re creating connection metadata. But where exactly does this information go, and how do you create these links? This section provides practical guidance on using connection metadata to build a connected scholarly record.

For background on persistent identifiers and the infrastructure that makes these connections possible, see Persistent Identifiers.

You might wonder: isn’t it enough to mention the DOI in the paper text or README file? While that’s helpful, formal metadata links provide additional benefits:

When you have multiple related research outputs (a dataset, the code that analyzes it, the paper that describes it), you can formally link them in the PID metadata. In the DataCite metadata schema, for example, this information is captured in the relatedIdentifier field.

Types of Relationships

PID metadata schemas use standardized relationship types to describe how resources relate to each other. Examples from the DataCite schema include:

| Relationship Types | Use Case | Example |
|---|---|---|
| IsSupplementTo / IsSupplementedBy | When one output supplements another | A dataset IsSupplementTo the paper that describes it |
| IsDerivedFrom / IsSourceOf | When one output is derived from another | A cleaned dataset IsDerivedFrom the raw data |
| Cites / IsCitedBy | Formal citation relationships | A paper Cites a methodology it uses |
| IsPartOf / HasPart | When outputs are components of a larger whole | Individual datasets IsPartOf a data collection |
| References / IsReferencedBy | General references | Documentation References a protocol |
| IsVersionOf / HasVersion | Connections between versions | A is a version of B |
| IsNewVersionOf / IsPreviousVersionOf | Specific version succession | Version 2 IsNewVersionOf version 1 |
| Compiles / IsCompiledBy | When code or workflows produce outputs | An analysis script Compiles a figure |

For a complete list, see DataCite’s relatedIdentifier documentation.
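To make the relatedIdentifier field more concrete, here is a minimal sketch of how such entries could be assembled as plain Python dictionaries. The key names follow DataCite's JSON representation as we understand it; the helper `related_identifier` and the `10.5555/...` DOIs are hypothetical placeholders, and the set of relation types is limited to those listed in the table above.

```python
import json

# Relation types from the table above (a subset of DataCite's full list).
RELATION_TYPES = {
    "IsSupplementTo", "IsSupplementedBy",
    "IsDerivedFrom", "IsSourceOf",
    "Cites", "IsCitedBy",
    "IsPartOf", "HasPart",
    "References", "IsReferencedBy",
    "IsVersionOf", "HasVersion",
    "IsNewVersionOf", "IsPreviousVersionOf",
    "Compiles", "IsCompiledBy",
}


def related_identifier(identifier, relation_type, identifier_type="DOI",
                       resource_type_general=None):
    """Build one DataCite-style relatedIdentifier entry (hypothetical helper)."""
    if relation_type not in RELATION_TYPES:
        raise ValueError(f"Unknown relation type: {relation_type!r}")
    entry = {
        "relatedIdentifier": identifier,
        "relatedIdentifierType": identifier_type,
        "relationType": relation_type,
    }
    if resource_type_general is not None:
        entry["resourceTypeGeneral"] = resource_type_general
    return entry


# Placeholder DOIs: a dataset that supplements a paper and derives from raw data.
dataset_links = [
    related_identifier("10.5555/example-paper", "IsSupplementTo",
                       resource_type_general="JournalArticle"),
    related_identifier("10.5555/example-raw-data", "IsDerivedFrom",
                       resource_type_general="Dataset"),
]

print(json.dumps(dataset_links, indent=2))
```

Rejecting unknown relation types early is a design choice for this sketch; a repository form achieves the same thing with its dropdown menu.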

The specific process depends on which repository you’re using, but the general approach is similar:

During initial deposit:

  1. When uploading your research output, look for fields like:

    • “Related identifiers”

    • “Related works”

    • “Linked resources”

    • “Relations”

  2. Add the DOI of the related resource

  3. Select the appropriate relationship type from a dropdown menu

  4. Specify the direction if needed
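The idea of direction in the last step can be sketched in code: each relationship type has an inverse, so a link recorded on one output implies a reciprocal link on the other. The pairs below are the ones named in this section; the function `reciprocal_link`, the `record` key, and the `10.5555/...` DOIs are hypothetical and only illustrate the idea.

```python
# Forward/backward pairs as listed in the relationship table of this section.
RELATION_PAIRS = [
    ("IsSupplementTo", "IsSupplementedBy"),
    ("IsDerivedFrom", "IsSourceOf"),
    ("Cites", "IsCitedBy"),
    ("IsPartOf", "HasPart"),
    ("References", "IsReferencedBy"),
    ("IsVersionOf", "HasVersion"),
    ("IsNewVersionOf", "IsPreviousVersionOf"),
    ("Compiles", "IsCompiledBy"),
]

# Look up the inverse in either direction.
INVERSE = {}
for forward, backward in RELATION_PAIRS:
    INVERSE[forward] = backward
    INVERSE[backward] = forward


def reciprocal_link(source_doi, target_doi, relation_type):
    """Given 'source relation_type target', describe the link to add on the target."""
    return {
        "record": target_doi,  # hypothetical key: the record that receives the link
        "relatedIdentifier": source_doi,
        "relatedIdentifierType": "DOI",
        "relationType": INVERSE[relation_type],
    }


# The dataset says it IsSupplementTo the paper, so the paper's metadata
# can state that it IsSupplementedBy the dataset.
link = reciprocal_link("10.5555/example-dataset", "10.5555/example-paper",
                       "IsSupplementTo")
print(link["relationType"])
```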

After initial deposit:

  1. Contact the repository’s support team or use the metadata update form

  2. Provide the DOI of your resource and the related DOI

  3. Specify the relationship type

  4. The repository will update the metadata

Repository-Specific Examples

Zenodo
Figshare
OSF
Dryad
  • When creating or editing a record, scroll to “Related identifiers/works”

  • Click “Add related identifier”

  • Enter the identifier (DOI, arXiv ID, and so on)

  • Choose relationship type from dropdown

  • Choose resource type of the related item

  • The interface shows both sides (for example, “this dataset is supplement to that publication”)

For institutional repositories:

Best Practices for Linking Outputs

Example: Connecting a Complete Research Project

The Implementing FAIR Workflows project is registered with a project ID (Cousijn & Melloni (2021)), which links to the ORCIDs of the project team members and the ROR IDs of the partner organizations. The outputs of the project, such as registrations, datasets, software, and reports, are openly shared with DOIs. DataCite Commons, the DOI metadata discovery portal, aggregates the metadata of all project-related resources and presents a full overview (Cousijn & Melloni (2021)).

Connecting Research to Funding

Research outputs can be formally linked to the grants that funded them through fundingReference metadata. This creates a traceable connection from funding organizations through researchers to the resulting outputs.

Why Cite Funding?

You might already acknowledge funding in your papers, but structured funding metadata provides additional benefits:

Acknowledgment vs. Citation

There’s an important distinction:

Acknowledgment (text-based):

Citation (structured metadata):

Both are valuable, but citation through metadata enables much more powerful tracking and discovery.

Components of Funding Metadata

Complete funding metadata includes:

Funder Information:

Grant Information:

How to Add Funding Information

During deposit:

  1. Most repositories have a “Funding” section in their upload forms

  2. Start typing the funder name - many repositories auto-complete from the Funder Registry

  3. Add the grant/award number

  4. Add grant title if there’s a field for it

  5. Repeat for additional funders (many projects have multiple funding sources)

After deposit:

  1. Contact the repository to update metadata

  2. Provide complete funding information:

    • Funder name (and Funder ID if you know it)

    • Grant number

    • Grant title

Repository-specific guidance:

Zenodo
Figshare
Dryad
OSF
  • “Funding” section during upload

  • Type-ahead search of Funder Registry

  • Add multiple funders

  • Fields for grant number and details

Best Practices for Funding Citation

Example Funding Metadata

A dataset might include:

Funder: UK Research and Innovation
Funder Identifier: https://ror.org/001aqnf71
Award Number: MR/V012345/1
Award Title: Understanding Climate Change Impacts on Grassland Ecosystems

Funder: Natural Environment Research Council
Funder Identifier: https://ror.org/02b5d8509
Award Number: NE/X067891/1
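As a rough illustration, the two funding entries above could be expressed as structured metadata along these lines. The key names follow DataCite's fundingReferences property as we understand its JSON form; the helper `funding_reference` is hypothetical, and the values are copied from the example above.

```python
import json


def funding_reference(funder_name, award_number, funder_ror=None, award_title=None):
    """Build one DataCite-style fundingReference entry (hypothetical helper)."""
    reference = {"funderName": funder_name, "awardNumber": award_number}
    if funder_ror is not None:
        reference["funderIdentifier"] = funder_ror
        reference["funderIdentifierType"] = "ROR"
    if award_title is not None:
        reference["awardTitle"] = award_title
    return reference


funding_references = [
    funding_reference(
        "UK Research and Innovation",
        "MR/V012345/1",
        funder_ror="https://ror.org/001aqnf71",
        award_title="Understanding Climate Change Impacts on Grassland Ecosystems",
    ),
    funding_reference(
        "Natural Environment Research Council",
        "NE/X067891/1",
        funder_ror="https://ror.org/02b5d8509",
    ),
]

print(json.dumps(funding_references, indent=2))
```

Optional fields are left out rather than set to empty strings, so the second entry simply has no award title.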

People and Organizations in Metadata

Just as research outputs have persistent identifiers, so do people and organizations. Including these identifiers in your research output metadata creates a rich network of connections.

ORCID: Persistent Identifiers for Researchers

ORCID provides unique identifiers for researchers that distinguish them from everyone else, even people with identical names.

For comprehensive guidance on ORCID, see our dedicated ORCID chapter.

Key points for connection metadata:

Format: ORCID iDs are 16-character identifiers formatted as 0000-0001-2345-6789 (the final character is a checksum and can be the letter X)

As URLs: https://orcid.org/0000-0001-2345-6789
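Because the last character of an ORCID iD is a check character (ISO 7064 MOD 11-2), a mistyped iD can often be caught before it ends up in metadata. The following is a minimal sketch of that check; the function names are our own, and it only verifies the format and checksum, not whether the iD is actually registered.

```python
import re

# Accepts a bare iD or the https://orcid.org/ URL form.
ORCID_PATTERN = re.compile(
    r"^(?:https://orcid\.org/)?([0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X])$"
)


def orcid_check_character(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for digit in base_digits:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(value):
    """Return True if the value is well formed and its checksum matches."""
    match = ORCID_PATTERN.match(value.strip())
    if match is None:
        return False
    compact = match.group(1).replace("-", "")
    return orcid_check_character(compact[:15]) == compact[15]


print(is_valid_orcid("0000-0001-2345-6789"))
print(is_valid_orcid("https://orcid.org/0000-0001-2345-6780"))
```

The first call checks the example iD from this section; the second changes only the final character, which the checksum is designed to detect.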

ROR: Persistent Identifiers for Organizations

ROR provides unique identifiers for research institutions.

Uses in metadata:

Format: ROR IDs are URLs like https://ror.org/013meh722 (University of Cambridge)
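A small sketch of normalizing ROR IDs to their URL form before adding them to metadata. The pattern assumes the shape we understand ROR to use (a leading 0, six characters from a base32 alphabet that omits i, l, o, and u, then two digits); treat that as an assumption rather than the official specification. The function `normalize_ror` is hypothetical and does not verify the checksum or whether the ID exists in the registry.

```python
import re

# Assumed ROR ID shape: 0 + six base32 characters (no i, l, o, u) + two digits.
ROR_PATTERN = re.compile(
    r"^(?:https://ror\.org/)?(0[0-9a-hjkmnp-tv-z]{6}[0-9]{2})$"
)


def normalize_ror(value):
    """Return the canonical https://ror.org/ URL, or raise ValueError."""
    match = ROR_PATTERN.match(value.strip().lower())
    if match is None:
        raise ValueError(f"Not a well-formed ROR ID: {value!r}")
    return "https://ror.org/" + match.group(1)


# The University of Cambridge ID from this section, given without the URL prefix.
print(normalize_ror("013meh722"))
```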

How to add:

Benefits of Including People and Organization Identifiers

For researchers:

For institutions:

For the research community:

Building a Fully Connected Scholarly Record

When you combine all these types of connection metadata, you create a rich, queryable network:

A dataset (DataCite DOI) can be connected to:

This interconnected graph enables:

For more on the infrastructure enabling these connections, see Persistent Identifiers.
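To illustrate how these pieces combine, the sketch below puts people, organization, related-output, and funding identifiers into one DataCite-style record and then lists the connections it implies. Everything here is illustrative: the `10.5555/...` DOIs and the creator name are placeholders, the edge labels `hasCreator`, `hasAffiliation`, and `fundedBy` are our own, and the record keys follow DataCite's JSON form as we understand it.

```python
record = {
    "doi": "10.5555/example-dataset",  # placeholder DOI
    "creators": [
        {
            "name": "Example, Researcher",  # placeholder person
            "nameIdentifiers": [
                {
                    "nameIdentifier": "https://orcid.org/0000-0001-2345-6789",
                    "nameIdentifierScheme": "ORCID",
                }
            ],
            "affiliation": [
                {
                    "name": "University of Cambridge",
                    "affiliationIdentifier": "https://ror.org/013meh722",
                    "affiliationIdentifierScheme": "ROR",
                }
            ],
        }
    ],
    "relatedIdentifiers": [
        {
            "relatedIdentifier": "10.5555/example-paper",
            "relatedIdentifierType": "DOI",
            "relationType": "IsSupplementTo",
        }
    ],
    "fundingReferences": [
        {
            "funderName": "UK Research and Innovation",
            "funderIdentifier": "https://ror.org/001aqnf71",
            "funderIdentifierType": "ROR",
            "awardNumber": "MR/V012345/1",
        }
    ],
}


def connections(record):
    """List (source, relation, target) edges implied by one record."""
    source = record["doi"]
    edges = []
    for creator in record.get("creators", []):
        for identifier in creator.get("nameIdentifiers", []):
            edges.append((source, "hasCreator", identifier["nameIdentifier"]))
        for affiliation in creator.get("affiliation", []):
            if "affiliationIdentifier" in affiliation:
                edges.append((source, "hasAffiliation",
                              affiliation["affiliationIdentifier"]))
    for related in record.get("relatedIdentifiers", []):
        edges.append((source, related["relationType"],
                      related["relatedIdentifier"]))
    for funding in record.get("fundingReferences", []):
        target = funding.get("funderIdentifier", funding["funderName"])
        edges.append((source, "fundedBy", target))
    return edges


for edge in connections(record):
    print(edge)
```

Each edge points at a persistent identifier, which is what allows a discovery service to join records from different repositories into one graph.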

References
  1. Cousijn, H., & Melloni, L. (2021). Implementing FAIR Workflows: A Proof of Concept Study in the Field of Consciousness. DataCite. https://doi.org/10.60581/ZAEV-6P15