The strategies above describe when and how to publish research outputs, but there’s another crucial aspect: connection metadata. This is structured, machine-readable information that formally links your research outputs together through their persistent identifiers.
When you “add the DOI of the publication to the dataset” (see Reserve DOI), you’re creating connection metadata. But where exactly does this information go, and how do you create these links? This section provides practical guidance on using connection metadata to build a connected scholarly record.
For background on persistent identifiers and the infrastructure that makes these connections possible, see Persistent Identifiers.
Why Link Through Metadata?¶
You might wonder: isn’t it enough to mention the DOI in the paper text or README file? While that’s helpful, formal metadata links provide additional benefits:
Machine-readable: Automated systems can discover all outputs related to a paper or dataset
Bidirectional: Both resources show the connection, not just one
Typed relationships: The metadata specifies what kind of relationship exists
Aggregatable: Citation databases and metrics systems can trace the full impact
Discoverable: Finding one output helps users discover all related outputs
Linking Research Outputs¶
When you have multiple related research outputs (a dataset, the code that analyzes it, the paper that describes it), you can formally link them in the PID metadata. In the DataCite metadata schema, for example, this information is captured in the relatedIdentifier field.
Types of Relationships¶
PID metadata schemas use standardized relationship types to describe how resources relate to each other. Examples from the DataCite schema include:
| Relationship Types | Use Case | Example |
|---|---|---|
| IsSupplementTo / IsSupplementedBy | When one output supplements another | A dataset IsSupplementTo the paper that describes it |
| IsDerivedFrom / IsSourceOf | When one output is derived from another | A cleaned dataset IsDerivedFrom the raw data |
| Cites / IsCitedBy | Formal citation relationships | A paper Cites a methodology it uses |
| IsPartOf / HasPart | When outputs are components of a larger whole | Individual datasets IsPartOf a data collection |
| References / IsReferencedBy | General references | Documentation References a protocol |
| IsVersionOf / HasVersion | Connections between versions | Resource A IsVersionOf resource B |
| IsNewVersionOf / IsPreviousVersionOf | Specific version succession | Version 2 IsNewVersionOf version 1 |
| Compiles / IsCompiledBy | When code or workflows produce outputs | An analysis script Compiles a figure |
For a complete list, see DataCite’s relatedIdentifier documentation.
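As a minimal sketch of what such a link looks like in practice, the Python snippet below builds one entry for the relatedIdentifiers list in DataCite-style JSON. The helper name `related_identifier` and the DOI are hypothetical placeholders, and the field names reflect the DataCite schema as commonly documented, so check them against your repository's documentation before relying on them.

```python
import json

# Relation types taken from the table above (a subset of the DataCite list).
RELATION_TYPES = {
    "IsSupplementTo", "IsSupplementedBy",
    "IsDerivedFrom", "IsSourceOf",
    "Cites", "IsCitedBy",
    "IsPartOf", "HasPart",
    "References", "IsReferencedBy",
    "IsVersionOf", "HasVersion",
    "IsNewVersionOf", "IsPreviousVersionOf",
    "Compiles", "IsCompiledBy",
}


def related_identifier(doi: str, relation_type: str) -> dict:
    """Build one relatedIdentifiers entry pointing at another DOI."""
    if relation_type not in RELATION_TYPES:
        raise ValueError(f"Unknown relation type: {relation_type}")
    return {
        "relatedIdentifier": doi,
        "relatedIdentifierType": "DOI",
        "relationType": relation_type,
    }


# Hypothetical DOI: a dataset record pointing at the paper it supplements.
entry = related_identifier("10.1234/example-paper", "IsSupplementTo")
print(json.dumps({"relatedIdentifiers": [entry]}, indent=2))
```

Rejecting unknown relation types up front mirrors the dropdown menus most repositories offer: a misspelled type fails immediately instead of ending up in the record.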
How to Add Related Identifiers¶
The specific process depends on which repository you’re using, but the general approach is similar:
During initial deposit:¶
When uploading your research output, look for fields like:
“Related identifiers”
“Related works”
“Linked resources”
“Relations”
Add the DOI of the related resource
Select the appropriate relationship type from a dropdown menu
Specify the direction if needed
After initial deposit:¶
Contact the repository’s support team or use the metadata update form
Provide the DOI of your resource and the related DOI
Specify the relationship type
The repository will update the metadata
Repository-Specific Examples¶
When creating or editing a record, scroll to “Related identifiers/works”
Click “Add related identifier”
Enter the identifier (DOI, arXiv ID, and so on)
Choose relationship type from dropdown
Choose resource type of the related item
The interface shows both sides (for example, “this dataset is supplement to that publication”)
In the item editor, find the “Links” or “Related items” section
Add the DOI of related resources
Select relationship type
Figshare may offer fewer relationship types than the full DataCite schema
In a project or component, use the “Links” widget
Add external links including DOIs
Describe the relationship in the link description
OSF’s native connections between project components are automatic
During submission, there’s a “Related Works” section
Manuscript DOIs are added automatically if you submit through a journal integration
Add additional related resources manually with relationship types
For institutional repositories:
Check your repository’s documentation or contact repository staff
They can add related identifier metadata even if there’s no self-service option
Best Practices for Linking Outputs¶
Link early: Add relationships when you first deposit outputs, or update metadata as soon as related outputs are published
Link bidirectionally: If a dataset supplements a paper, add the link from both sides if possible
Be specific: Use the most precise relationship type (IsSupplementTo is more specific than References)
Link comprehensively: Connect all related outputs (data, code, paper, protocols, presentations)
Update when publishing: If you reserved a DOI for a paper, update data/code metadata once the paper has its final DOI
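Bidirectional linking can be sketched in code. The example below, a rough illustration rather than any repository's API, pairs each relation type from the table with its inverse and derives the link that the other record should carry. The function name `reciprocal_link` and the DOIs are hypothetical.

```python
# Each relation type from the table, paired with its inverse.
PAIRS = [
    ("IsSupplementTo", "IsSupplementedBy"),
    ("IsDerivedFrom", "IsSourceOf"),
    ("Cites", "IsCitedBy"),
    ("IsPartOf", "HasPart"),
    ("References", "IsReferencedBy"),
    ("IsVersionOf", "HasVersion"),
    ("IsNewVersionOf", "IsPreviousVersionOf"),
    ("Compiles", "IsCompiledBy"),
]

INVERSE = {}
for forward, backward in PAIRS:
    INVERSE[forward] = backward
    INVERSE[backward] = forward


def reciprocal_link(source_doi: str, relation_type: str, target_doi: str):
    """Given 'source relation target', return (record, entry) for the target's side."""
    entry = {
        "relatedIdentifier": source_doi,
        "relatedIdentifierType": "DOI",
        "relationType": INVERSE[relation_type],
    }
    return target_doi, entry


# Hypothetical DOIs: the dataset supplements the paper ...
record, entry = reciprocal_link(
    "10.1234/example-dataset", "IsSupplementTo", "10.1234/example-paper"
)
# ... so the paper's record should say it IsSupplementedBy the dataset.
print(record, entry)
```

Deriving the second link from the first keeps the two sides from drifting apart, for example a dataset that says IsSupplementTo while the paper only says References.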
Example: Connecting a Complete Research Project¶
The Implementing FAIR Workflows project is registered with a project ID (Cousijn & Melloni, 2021) that links to the ORCIDs of the project team members and the ROR IDs of partner organizations. The project's outputs, such as registrations, datasets, software, and reports, are openly shared with DOIs. DataCite Commons, the DOI metadata discovery portal, aggregates the metadata of all project-related resources and presents a full overview (Cousijn & Melloni, 2021).
Connecting Research to Funding¶
Research outputs can be formally linked to the grants that funded them through fundingReference metadata. This creates a traceable connection from funding organizations through researchers to the resulting outputs.
Why Cite Funding?¶
You might already acknowledge funding in your papers, but structured funding metadata provides additional benefits:
Funder reporting: Automated systems can discover all outputs from a grant
Impact tracking: Funders can measure return on investment
Transparency: The public can see how research funds are used
Credit: Proper attribution for funding sources
Discovery: Others researching similar topics can find related funded work
Acknowledgment vs. Citation¶
There’s an important distinction:
Acknowledgment (text-based):¶
Written in the acknowledgments section of a paper
Example: “This work was supported by grant ABC-123 from the Example Foundation”
Not machine-readable
Not standardized
Often inconsistent across outputs from the same grant
Citation (structured metadata):¶
Included in the PID metadata using fundingReference fields
Machine-readable and standardized
Searchable and aggregatable
Links to persistent identifiers for the funder and potentially the grant
Both are valuable, but citation through metadata enables much more powerful tracking and discovery.
Components of Funding Metadata¶
Complete funding metadata includes:
Funder Information:
Funder Name: The organization providing funding
Funder Identifier: A persistent identifier for the funder (usually from Crossref Funder Registry)
Example: National Science Foundation = https://doi.org/10.13039/00000001
Example: Wellcome Trust = https://doi.org/10.13039/00000035
Grant Information:
Award Number: The specific grant identifier
Example: “NE/X012345/1” or “R01-GM12345”
Award Title: The title of the funded project (optional but helpful)
Award URI: A persistent identifier for the grant itself, if available
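To make the components above concrete, the following minimal sketch assembles them into a fundingReferences entry in DataCite-style JSON. The helper `funding_reference` is hypothetical, the values are illustrative, and the field names reflect the DataCite schema as commonly documented; your repository may expose them differently.

```python
import json
from typing import Optional


def funding_reference(
    funder_name: str,
    funder_id: Optional[str] = None,
    funder_id_type: Optional[str] = None,
    award_number: Optional[str] = None,
    award_title: Optional[str] = None,
    award_uri: Optional[str] = None,
) -> dict:
    """Assemble one fundingReferences entry, omitting empty optional fields."""
    if not funder_name:
        raise ValueError("funderName is required")
    fields = {
        "funderName": funder_name,
        "funderIdentifier": funder_id,
        "funderIdentifierType": funder_id_type,
        "awardNumber": award_number,
        "awardTitle": award_title,
        "awardUri": award_uri,
    }
    return {key: value for key, value in fields.items() if value}


# Illustrative values only; the award number and title are not a real grant.
ref = funding_reference(
    funder_name="UK Research and Innovation",
    funder_id="https://ror.org/001aqnf71",
    funder_id_type="ROR",
    award_number="MR/V012345/1",
    award_title="Understanding Climate Change Impacts on Grassland Ecosystems",
)
print(json.dumps({"fundingReferences": [ref]}, indent=2))
```

Only the funder name is treated as mandatory here; the award URI is left out of the output because no value was supplied for it.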
How to Add Funding Information¶
During deposit:
Most repositories have a “Funding” section in their upload forms
Start typing the funder name - many repositories auto-complete from the Funder Registry
Add the grant/award number
Add grant title if there’s a field for it
Repeat for additional funders (many projects have multiple funding sources)
After deposit:
Contact the repository to update metadata
Provide complete funding information:
Funder name (and Funder ID if you know it)
Grant number
Grant title
Repository-specific guidance:
“Funding” section during upload
Type-ahead search of Funder Registry
Add multiple funders
Fields for grant number and details
“Funding” field in metadata editor
Free text, but try to match official grant information
“Funding information” during submission
Connected to journal submission data when available
Not currently supported in core OSF metadata
Can include in project description/wiki
Supported in some OSF-integrated services
Best Practices for Funding Citation¶
Add funding to all outputs: Not just papers - include in data, code, protocols, and other outputs
Use official grant numbers: Match the funder’s format exactly
Include all funders: If multiple organizations contributed, cite them all
Add early: Include funding information when you first deposit outputs
Be consistent: Use the same grant number format across all outputs from that grant
Check with your funder: Some funders have specific requirements for how grants should be cited
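The consistency advice can be checked mechanically. The sketch below groups award numbers by a normalized form and reports any grant written in more than one way. The normalization rule (ignore case, spaces, hyphens, and slashes) and the sample records are assumptions made for illustration; adapt them to your funder's actual format.

```python
import re
from collections import defaultdict


def normalize_award(award_number: str) -> str:
    """Collapse case and separators so variant spellings compare equal."""
    return re.sub(r"[\s/\-]", "", award_number).upper()


def inconsistent_awards(records: dict) -> dict:
    """Map each normalized award number to its variants, if there are several."""
    variants = defaultdict(set)
    for award_number in records.values():
        variants[normalize_award(award_number)].add(award_number)
    return {key: sorted(forms) for key, forms in variants.items() if len(forms) > 1}


# Hypothetical outputs from one project, keyed by a short label.
records = {
    "dataset": "NE/X067891/1",
    "code": "NE-X067891-1",
    "paper": "NE/X067891/1",
    "protocol": "MR/V012345/1",
}
print(inconsistent_awards(records))
# → {'NEX0678911': ['NE-X067891-1', 'NE/X067891/1']}
```

A non-empty result points at outputs whose metadata should be updated so that automated systems treat them as belonging to the same grant.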
Example Funding Metadata¶
A dataset might include:
Funder: UK Research and Innovation
Funder Identifier: https://ror.org/001aqnf71
Award Number: MR/V012345/1
Award Title: Understanding Climate Change Impacts on Grassland Ecosystems
Funder: Natural Environment Research Council
Funder Identifier: https://ror.org/02b5d8509
Award Number: NE/X067891/1
People and Organizations in Metadata¶
Just as research outputs have persistent identifiers, so do people and organizations. Including these identifiers in your research output metadata creates a rich network of connections.
ORCID: Persistent Identifiers for Researchers¶
ORCID provides unique identifiers for researchers that distinguish them from everyone else, even people with identical names.
For comprehensive guidance on ORCID, see our dedicated ORCID chapter.
Key points for connection metadata:
Include your ORCID when depositing research outputs
Add ORCIDs for all co-authors when possible
The ORCID connects all your outputs across different repositories and publications
Your ORCID profile can automatically collect works whose metadata includes your ORCID iD
Format: ORCID iDs are 16 characters in four hyphen-separated groups, such as 0000-0001-2345-6789; the final character is a checksum and may be X
As URLs: https://orcid.org/0000-0001-2345-6789
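Because the last character of an ORCID iD is a checksum, an identifier can be sanity-checked before it goes into metadata. The sketch below implements the ISO 7064 MOD 11-2 check that ORCID documents for its identifiers. The function names are hypothetical, and a passing check only shows that the iD is well formed, not that it is registered to anyone.

```python
import re


def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for digit in base_digits:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Accept a bare iD or its URL form and verify format and checksum."""
    prefix = "https://orcid.org/"
    if orcid.startswith(prefix):
        orcid = orcid[len(prefix):]
    if not re.fullmatch(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]", orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_character(digits[:15]) == digits[15]


print(is_valid_orcid("0000-0001-2345-6789"))                    # → True
print(is_valid_orcid("https://orcid.org/0000-0001-2345-6789"))  # → True
print(is_valid_orcid("0000-0001-2345-6780"))                    # → False
```

A check like this catches most typing mistakes, such as a single mistyped digit, before a wrong iD attaches someone else's name to your work.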
ROR: Persistent Identifiers for Organizations¶
ROR provides unique identifiers for research institutions.
Uses in metadata:
Author/creator affiliations
Contributor affiliations
Institution hosting the research
Partner organizations in collaborations
Format: ROR IDs are URLs like https://ror.org/013meh722 (University of Cambridge)
How to add:
Some repositories support ROR ID fields for affiliations
When creating or editing records, look for organization/affiliation fields
Search the ROR Registry to find your institution’s ID
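As a rough sketch of where these identifiers end up, the snippet below builds a creator entry in DataCite-style JSON that carries an ORCID iD and a ROR-identified affiliation. The researcher name and the helper `creator_entry` are hypothetical, the ORCID iD is the format example from this section, and the ROR pattern is the one commonly quoted for ROR IDs, so treat it as an assumption.

```python
import json
import re

# Pattern commonly cited for ROR IDs: a leading 0, six base-32 characters,
# and two digits of checksum (assumption; the checksum is not verified here).
ROR_PATTERN = re.compile(r"https://ror\.org/0[a-hj-km-np-tv-z0-9]{6}[0-9]{2}")


def creator_entry(name: str, orcid_url: str, affiliation: str, ror_url: str) -> dict:
    """Build one creators entry with a person ID and an organization ID."""
    if not ROR_PATTERN.fullmatch(ror_url):
        raise ValueError(f"Not a well-formed ROR ID: {ror_url}")
    return {
        "name": name,
        "nameType": "Personal",
        "nameIdentifiers": [
            {"nameIdentifier": orcid_url, "nameIdentifierScheme": "ORCID"}
        ],
        "affiliation": [
            {
                "name": affiliation,
                "affiliationIdentifier": ror_url,
                "affiliationIdentifierScheme": "ROR",
            }
        ],
    }


creator = creator_entry(
    "Researcher, Example",  # hypothetical person
    "https://orcid.org/0000-0001-2345-6789",
    "University of Cambridge",
    "https://ror.org/013meh722",
)
print(json.dumps(creator, indent=2))
```

Keeping the person and organization identifiers in the same entry is what lets downstream services connect an output to both the researcher and the institution.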
Benefits of Including People and Organization Identifiers¶
For researchers:
All your work is connected, regardless of name changes or institutional moves
Easier to prove your contributions for promotion, tenure, or grant applications
Automatic collection of citations and reuse
Proper disambiguation from others with similar names
For institutions:
Track all research outputs from the institution
Demonstrate research impact and productivity
Support reporting requirements
Identify collaboration networks
For the research community:
Discover all work by a researcher or institution
Understand collaboration patterns
Track research mobility
Measure impacts more accurately
Building a Fully Connected Scholarly Record¶
When you combine all these types of connection metadata, you create a rich, queryable network:
A dataset (DataCite DOI) can be connected to:
The researchers who created it (ORCIDs)
Their institutions (ROR IDs)
The grants that funded it (Funder IDs + award numbers)
The paper describing it (Crossref DOI)
The code that analyzes it (DataCite DOI)
The protocol used to collect it (DataCite DOI)
Previous and new versions of itself (version relationships)
This interconnected graph enables:
Comprehensive discovery (find all related materials)
Complete attribution (credit all contributors)
Impact measurement (trace outputs from funding to reuse)
Reproducibility (access all components needed)
Trust (transparent provenance and connections)
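One way to picture this graph is as a set of typed edges between identifiers. The sketch below stores a few such edges in memory and walks them to collect everything reachable from a dataset. The DOIs are placeholders, the edge labels are informal, and real discovery services work from registered metadata rather than a hand-built list.

```python
from collections import defaultdict, deque

# (source, relation, target) triples; the DOIs are placeholders.
EDGES = [
    ("doi:10.1234/dataset", "createdBy", "orcid:0000-0001-2345-6789"),
    ("orcid:0000-0001-2345-6789", "affiliatedWith", "ror:013meh722"),
    ("doi:10.1234/dataset", "fundedBy", "award:NE/X067891/1"),
    ("doi:10.1234/dataset", "IsSupplementTo", "doi:10.1234/paper"),
    ("doi:10.1234/code", "References", "doi:10.1234/dataset"),
    ("doi:10.9999/unrelated", "Cites", "doi:10.9999/other"),
]


def connected(start: str) -> set:
    """Breadth-first walk over edges in both directions from one identifier."""
    neighbours = defaultdict(set)
    for source, _relation, target in EDGES:
        neighbours[source].add(target)
        neighbours[target].add(source)
    seen, queue = {start}, deque([start])
    while queue:
        # Visit every neighbour that has not been reached yet.
        for nxt in neighbours[queue.popleft()] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return seen - {start}


for identifier in sorted(connected("doi:10.1234/dataset")):
    print(identifier)
```

The walk follows links in both directions, which is why the code that references the dataset is found even though the dataset's own record never mentions it.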
For more on the infrastructure enabling these connections, see Persistent Identifiers.
- Cousijn, H., & Melloni, L. (2021). Implementing FAIR Workflows: A Proof of Concept Study in the Field of Consciousness. DataCite. https://doi.org/10.60581/ZAEV-6P15