Connection Metadata: Creating Rich Linkages

The strategies above describe when and how to publish research outputs, but there’s another crucial aspect: connection metadata. This is structured, machine-readable information that formally links your research outputs together through their persistent identifiers.

When you “add the DOI of the publication to the dataset” (see Reserve DOI), you’re creating connection metadata. But where exactly does this information go, and how do you create these links? This section provides practical guidance on using connection metadata to build a connected scholarly record.

For background on persistent identifiers and the infrastructure that makes these connections possible, see Persistent Identifiers.

Why Link Through Metadata?¶

You might wonder: isn’t it enough to mention the DOI in the paper text or README file? While that’s helpful, formal metadata links provide additional benefits:

Machine-readable: Automated systems can discover all outputs related to a paper or dataset
Bidirectional: When the link is recorded on both records, each resource shows the connection, not just one
Typed relationships: The metadata specifies what kind of relationship exists
Aggregatable: Citation databases and metrics systems can trace the full impact
Discoverable: Finding one output helps users discover all related outputs

Linking Research Outputs¶

When you have multiple related research outputs (a dataset, the code that analyzes it, the paper that describes it), you can formally link them in the PID metadata. In the DataCite metadata schema, for example, this information is captured in the relatedIdentifier field.

Types of Relationships¶

PID metadata schemas use standardized relationship types to describe how resources relate to each other. Examples from the DataCite schema include:

Relationship Types	Use Case	Example
IsSupplementTo / IsSupplementedBy	When one output supplements another	A dataset IsSupplementTo the paper that describes it
IsDerivedFrom / IsSourceOf	When one output is derived from another	A cleaned dataset IsDerivedFrom the raw data
Cites / IsCitedBy	Formal citation relationships	A paper Cites a methodology it uses
IsPartOf / HasPart	When outputs are components of a larger whole	Individual datasets IsPartOf a data collection
References / IsReferencedBy	General references	Documentation References a protocol
IsVersionOf / HasVersion	Connections between a specific version and the resource it is a version of	A specific release IsVersionOf the record that represents all versions
IsNewVersionOf / IsPreviousVersionOf	Specific version succession	Version 2 IsNewVersionOf version 1
Compiles / IsCompiledBy	When code or workflows produce outputs	An analysis script Compiles a figure

For a complete list, see DataCite’s relatedIdentifier documentation.
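
As a rough illustration of what such a link looks like in machine-readable form, the sketch below builds entries for a DataCite-style relatedIdentifiers list, using the field names relatedIdentifier, relatedIdentifierType, and relationType from DataCite's JSON metadata. The helper name make_related_identifier and the DOIs are hypothetical placeholders, and the set of relation types is only the subset listed in the table above.

```python
# Relation types from the table above (a subset of the DataCite vocabulary).
RELATION_TYPES = {
    "IsSupplementTo", "IsSupplementedBy",
    "IsDerivedFrom", "IsSourceOf",
    "Cites", "IsCitedBy",
    "IsPartOf", "HasPart",
    "References", "IsReferencedBy",
    "IsVersionOf", "HasVersion",
    "IsNewVersionOf", "IsPreviousVersionOf",
    "Compiles", "IsCompiledBy",
}


def make_related_identifier(identifier, relation_type, identifier_type="DOI"):
    """Return one entry for a DataCite-style relatedIdentifiers list.

    Hypothetical helper for illustration; it is not part of any library.
    """
    if relation_type not in RELATION_TYPES:
        raise ValueError(f"Unknown relation type: {relation_type}")
    return {
        "relatedIdentifier": identifier,
        "relatedIdentifierType": identifier_type,
        "relationType": relation_type,
    }


# Links recorded on a dataset's record. The DOIs are made-up examples.
dataset_links = [
    make_related_identifier("10.5555/example-paper", "IsSupplementTo"),
    make_related_identifier("10.5555/example-raw-data", "IsDerivedFrom"),
]
```

Rejecting unknown relation types early mirrors what a repository dropdown does: only values from the controlled vocabulary are accepted.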

How to Add Related Identifiers¶

The specific process depends on which repository you’re using, but the general approach is similar:

During initial deposit:¶

When uploading your research output, look for fields like:
- “Related identifiers”
- “Related works”
- “Linked resources”
- “Relations”
Add the DOI of the related resource
Select the appropriate relationship type from a dropdown menu
Specify the direction if needed

After initial deposit:¶

Contact the repository’s support team or use the metadata update form
Provide the DOI of your resource and the related DOI
Specify the relationship type
The repository will update the metadata

Repository-Specific Examples¶

Zenodo:

When creating or editing a record, scroll to “Related identifiers/works”
Click “Add related identifier”
Enter the identifier (DOI, arXiv ID, and so on)
Choose relationship type from dropdown
Choose resource type of the related item
The interface shows both sides (for example, “this dataset is supplement to that publication”)

For institutional repositories:

Check your repository’s documentation or contact repository staff
They can add related identifier metadata even if there’s no self-service option

Best Practices for Linking Outputs¶

Link early: Add relationships when you first deposit outputs, or update metadata as soon as related outputs are published
Link bidirectionally: If a dataset supplements a paper, add the link from both sides if possible
Be specific: Use the most precise relationship type (IsSupplementTo is more specific than References)
Link comprehensively: Connect all related outputs (data, code, paper, protocols, presentations)
Update when publishing: If you reserved a DOI for a paper, update data/code metadata once the paper has its final DOI
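
To make the "link bidirectionally" advice concrete, here is a minimal sketch that derives the reciprocal entry to add on the other record. It assumes DataCite-style field names and uses only the relation pairs from the table earlier in this section; the function name reciprocal_link and the DOIs are hypothetical.

```python
# Paired relation types from the table of relationship types.
_PAIRS = [
    ("IsSupplementTo", "IsSupplementedBy"),
    ("IsDerivedFrom", "IsSourceOf"),
    ("Cites", "IsCitedBy"),
    ("IsPartOf", "HasPart"),
    ("References", "IsReferencedBy"),
    ("IsVersionOf", "HasVersion"),
    ("IsNewVersionOf", "IsPreviousVersionOf"),
    ("Compiles", "IsCompiledBy"),
]

# Look-up table that works in both directions.
INVERSE_RELATION = {}
for forward, backward in _PAIRS:
    INVERSE_RELATION[forward] = backward
    INVERSE_RELATION[backward] = forward


def reciprocal_link(source_doi, link):
    """Given a link recorded on source_doi, return the entry for the target's record."""
    return {
        "relatedIdentifier": source_doi,
        "relatedIdentifierType": "DOI",
        "relationType": INVERSE_RELATION[link["relationType"]],
    }


# Hypothetical example: the dataset's record says it supplements the paper.
link_on_dataset = {
    "relatedIdentifier": "10.5555/example-paper",
    "relatedIdentifierType": "DOI",
    "relationType": "IsSupplementTo",
}
link_on_paper = reciprocal_link("10.5555/example-dataset", link_on_dataset)
```

The result is the entry to enter (or to ask the repository or publisher to enter) on the paper's side, so that both records show the connection.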

Example: Connecting a Complete Research Project¶

The Implementing FAIR Workflows project is registered with a project ID (Cousijn & Melloni, 2021), which links to the ORCIDs of the project team members and the ROR IDs of partner organizations. The project's outputs, such as registrations, datasets, software, and reports, are openly shared with DOIs. DataCite Commons, the DOI metadata discovery portal, aggregates the metadata of all project-related resources and presents a full overview (Cousijn & Melloni, 2021).

Connecting Research to Funding¶

Research outputs can be formally linked to the grants that funded them through fundingReference metadata. This creates a traceable connection from funding organizations through researchers to the resulting outputs.

Why Cite Funding?¶

You might already acknowledge funding in your papers, but structured funding metadata provides additional benefits:

Funder reporting: Automated systems can discover all outputs from a grant
Impact tracking: Funders can measure return on investment
Transparency: The public can see how research funds are used
Credit: Proper attribution for funding sources
Discovery: Others researching similar topics can find related funded work

Acknowledgment vs. Citation¶

There’s an important distinction:

Acknowledgment (text-based):¶

Written in the acknowledgments section of a paper
Example: “This work was supported by grant ABC-123 from the Example Foundation”
Not machine-readable
Not standardized
Often inconsistent across outputs from the same grant

Citation (structured metadata):¶

Included in the PID metadata using fundingReference fields
Machine-readable and standardized
Searchable and aggregatable
Links to persistent identifiers for the funder and potentially the grant

Both are valuable, but citation through metadata enables much more powerful tracking and discovery.

Components of Funding Metadata¶

Complete funding metadata includes:

Funder Information:

Funder Name: The organization providing funding
Funder Identifier: A persistent identifier for the funder (usually from the Crossref Funder Registry, or a ROR ID)
- Example: National Science Foundation = https://doi.org/10.13039/100000001
- Example: Wellcome Trust = https://doi.org/10.13039/100004440

Grant Information:

Award Number: The specific grant identifier
- Example: “NE/X012345/1” or “R01-GM12345”
Award Title: The title of the funded project (optional but helpful)
Award URI: A persistent identifier for the grant itself, if available

How to Add Funding Information¶

During deposit:

Most repositories have a “Funding” section in their upload forms
Start typing the funder name - many repositories auto-complete from the Funder Registry
Add the grant/award number
Add grant title if there’s a field for it
Repeat for additional funders (many projects have multiple funding sources)

After deposit:

Contact the repository to update metadata
Provide complete funding information:
- Funder name (and Funder ID if you know it)
- Grant number
- Grant title

Repository-specific guidance:

Zenodo:

“Funding” section during upload
Type-ahead search of Funder Registry
Add multiple funders
Fields for grant number and details

Best Practices for Funding Citation¶

Add funding to all outputs: Not just papers - include in data, code, protocols, and other outputs
Use official grant numbers: Match the funder’s format exactly
Include all funders: If multiple organizations contributed, cite them all
Add early: Include funding information when you first deposit outputs
Be consistent: Use the same grant number format across all outputs from that grant
Check with your funder: Some funders have specific requirements for how grants should be cited

Example Funding Metadata¶

A dataset might include:

Funder: UK Research and Innovation
Funder Identifier: https://ror.org/001aqnf71
Award Number: MR/V012345/1
Award Title: Understanding Climate Change Impacts on Grassland Ecosystems

Funder: Natural Environment Research Council
Funder Identifier: https://ror.org/02b5d8509
Award Number: NE/X067891/1
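
In structured form, the example above might look like the following sketch, which uses the property names of DataCite's fundingReferences (funderName, funderIdentifier, funderIdentifierType, awardNumber, awardTitle). The helper funding_reference is a hypothetical convenience function, and the award details are the illustrative values from the example, not real grants.

```python
import json


def funding_reference(funder_name, funder_ror=None, award_number=None, award_title=None):
    """Build one DataCite-style fundingReferences entry, omitting empty fields."""
    entry = {"funderName": funder_name}
    if funder_ror:
        entry["funderIdentifier"] = funder_ror
        entry["funderIdentifierType"] = "ROR"
    if award_number:
        entry["awardNumber"] = award_number
    if award_title:
        entry["awardTitle"] = award_title
    return entry


funding_references = [
    funding_reference(
        "UK Research and Innovation",
        funder_ror="https://ror.org/001aqnf71",
        award_number="MR/V012345/1",
        award_title="Understanding Climate Change Impacts on Grassland Ecosystems",
    ),
    funding_reference(
        "Natural Environment Research Council",
        funder_ror="https://ror.org/02b5d8509",
        award_number="NE/X067891/1",
    ),
]

# Serialize the way it could appear inside a metadata record.
print(json.dumps({"fundingReferences": funding_references}, indent=2))
```

Each funder gets its own entry, which is what lets automated systems aggregate outputs per funder and per award.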

People and Organizations in Metadata¶

Just as research outputs have persistent identifiers, so do people and organizations. Including these identifiers in your research output metadata creates a rich network of connections.

ORCID: Persistent Identifiers for Researchers¶

ORCID provides unique identifiers for researchers that distinguish them from everyone else, even people with identical names.

For comprehensive guidance on ORCID, see our dedicated ORCID chapter.

Key points for connection metadata:

Include your ORCID when depositing research outputs
Add ORCIDs for all co-authors when possible
The ORCID connects all your outputs across different repositories and publications
Your ORCID profile can automatically collect works whose metadata includes your ORCID

Format: ORCIDs are 16-character identifiers formatted as 0000-0001-2345-6789 (the final character is a checksum and can be X)
As URLs: https://orcid.org/0000-0001-2345-6789
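
Because the last character of an ORCID is a check character (ISO 7064 MOD 11-2), a typo can often be caught before the identifier goes into metadata. The following is a minimal sketch of such a check; the function names are hypothetical, and it only verifies the format and checksum, not whether the ORCID is actually registered.

```python
import re

# Four groups of four characters; the last character may be X.
ORCID_PATTERN = re.compile(r"^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]$")
ORCID_URL_PREFIX = "https://orcid.org/"


def orcid_check_character(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Return True if the value is well formed and its checksum matches."""
    if orcid.startswith(ORCID_URL_PREFIX):
        orcid = orcid[len(ORCID_URL_PREFIX):]
    if not ORCID_PATTERN.match(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_character(digits[:15]) == digits[15]


print(is_valid_orcid("0000-0001-2345-6789"))
print(is_valid_orcid("https://orcid.org/0000-0001-2345-6789"))
print(is_valid_orcid("0000-0001-2345-6788"))
```

The first two calls use the example identifier from this section in both its plain and URL forms; the third changes the final digit to show how a single-character typo is rejected.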

ROR: Persistent Identifiers for Organizations¶

ROR provides unique identifiers for research institutions.

Uses in metadata:

Author/creator affiliations
Contributor affiliations
Institution hosting the research
Partner organizations in collaborations

Format: ROR IDs are URLs like https://ror.org/013meh722 (University of Cambridge)

How to add:

Some repositories support ROR ID fields for affiliations
When creating or editing records, look for organization/affiliation fields
Search the ROR Registry to find your institution’s ID
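
For repositories or APIs that accept structured creator metadata, an ORCID and a ROR ID can travel together in one entry. The sketch below follows the DataCite JSON layout for creators (nameIdentifiers and affiliation); the helper creator_entry and the researcher name are hypothetical, and the regular expression assumes the ROR ID format of a leading 0, six base32 characters, and two check digits. It checks the shape only, not whether the ID exists in the ROR Registry.

```python
import re

# Assumed shape of a ROR ID expressed as a URL.
ROR_URL = re.compile(r"^https://ror\.org/0[a-hj-km-np-tv-z0-9]{6}[0-9]{2}$")


def creator_entry(name, orcid, affiliation_name, ror_url):
    """Build a DataCite-style creator entry with ORCID and ROR identifiers."""
    if not ROR_URL.match(ror_url):
        raise ValueError(f"Not a well-formed ROR ID: {ror_url}")
    return {
        "name": name,
        "nameType": "Personal",
        "nameIdentifiers": [
            {
                "nameIdentifier": f"https://orcid.org/{orcid}",
                "nameIdentifierScheme": "ORCID",
                "schemeUri": "https://orcid.org",
            }
        ],
        "affiliation": [
            {
                "name": affiliation_name,
                "affiliationIdentifier": ror_url,
                "affiliationIdentifierScheme": "ROR",
            }
        ],
    }


# Hypothetical researcher, using the example identifiers from this section.
creator = creator_entry(
    "Doe, Jane",
    "0000-0001-2345-6789",
    "University of Cambridge",
    "https://ror.org/013meh722",
)
```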

Benefits of Including People and Organization Identifiers¶

For researchers:

All your work is connected, regardless of name changes or institutional moves
Easier to prove your contributions for promotion, tenure, or grant applications
Automatic collection of citations and reuse
Proper disambiguation from others with similar names

For institutions:

Track all research outputs from the institution
Demonstrate research impact and productivity
Support reporting requirements
Identify collaboration networks

For the research community:

Discover all work by a researcher or institution
Understand collaboration patterns
Track research mobility
Measure impacts more accurately

Building a Fully Connected Scholarly Record¶

When you combine all these types of connection metadata, you create a rich, queryable network:

A dataset (DataCite DOI) can be connected to:

The researchers who created it (ORCIDs)
Their institutions (ROR IDs)
The grants that funded it (Funder IDs + award numbers)
The paper describing it (Crossref DOI)
The code that analyzes it (DataCite DOI)
The protocol used to collect it (DataCite DOI)
Previous and new versions of itself (version relationships)

This interconnected graph enables:

Comprehensive discovery (find all related materials)
Complete attribution (credit all contributors)
Impact measurement (trace outputs from funding to reuse)
Reproducibility (access all components needed)
Trust (transparent provenance and connections)
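
To show what "queryable" can mean in practice, the sketch below treats identifiers as nodes and metadata links as edges, then collects everything reachable from one starting point with a breadth-first search. All identifiers, the prefix labels (doi:, orcid:, ror:, award:), and the function names are hypothetical; real discovery services work on much larger graphs with typed relationships.

```python
from collections import deque

# Hypothetical links, each taken from the metadata of one record.
links = [
    ("doi:10.5555/dataset", "orcid:0000-0001-2345-6789"),
    ("doi:10.5555/dataset", "ror:013meh722"),
    ("doi:10.5555/dataset", "doi:10.5555/paper"),
    ("doi:10.5555/dataset", "doi:10.5555/code"),
    ("doi:10.5555/paper", "award:NE/X067891/1"),
]


def build_graph(pairs):
    """Turn a list of links into an undirected adjacency map."""
    graph = {}
    for a, b in pairs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def connected_items(graph, start):
    """Return every identifier reachable from start (breadth-first search)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    seen.discard(start)
    return seen


graph = build_graph(links)
# Starting from the code, the search reaches the dataset, paper, people,
# institution, and award through the chain of links.
related_to_code = connected_items(graph, "doi:10.5555/code")
```

Starting from the code alone, the award becomes discoverable only because the dataset links to both the code and the paper, which is the practical payoff of linking comprehensively.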

For more on the infrastructure enabling these connections, see Persistent Identifiers.

References¶

Cousijn, H., & Melloni, L. (2021). Implementing FAIR Workflows: A Proof of Concept Study in the Field of Consciousness. DataCite. 10.60581/ZAEV-6P15
