Best Practices for Data Sharing and Archiving
Why Should I Share Data and Code?
- It is required by some journal publishers (e.g., Nature) and funding agencies (e.g., the National Institutes of Health and the National Science Foundation).
What do I need to do for compliance and institutional oversight?
NIH Compliance and Monitoring:
Institutional Oversight:
How do I share my data?
Before submitting your data to a repository, you will need to:
1. Bundle data together in logical groups for citation and reuse with assigned persistent identifiers (e.g., dataset DOIs)
2. De-identify your data, if appropriate
3. Convert your data to an open, machine-readable file format, such as .csv, when possible
4. Use data and metadata standards where they exist for your field. FAIRsharing.org is a database of such standards.
5. Document the dataset in a separate readme.txt file, and/or create metadata required by your chosen repository or discipline. Refer to the Data Documentation and Metadata Page for more.
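Steps 2, 3, and 5 above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a complete de-identification procedure: dropping columns removes direct identifiers only, and your IRB or repository may require more. The column names (`name`, `email`, `mrn`), the file names, and the readme fields are all hypothetical placeholders; use whatever your chosen repository or discipline requires.

```python
import csv
from pathlib import Path

# Hypothetical column names treated as direct identifiers; adjust for your data.
IDENTIFIER_COLUMNS = {"name", "email", "mrn"}


def deidentify_rows(rows, identifier_columns=IDENTIFIER_COLUMNS):
    """Return copies of the rows with the identifier columns removed."""
    return [
        {key: value for key, value in row.items() if key not in identifier_columns}
        for row in rows
    ]


def write_csv(rows, path):
    """Write a list of dicts to a UTF-8 CSV file with a header row."""
    fieldnames = list(rows[0].keys()) if rows else []
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def write_readme(path, title, data_file, column_descriptions):
    """Write a minimal plain-text readme describing the data file."""
    lines = [f"Title: {title}", f"Data file: {data_file}", "Columns:"]
    lines += [f"  - {name}: {text}" for name, text in column_descriptions.items()]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")


def prepare_dataset(rows, out_dir, title, column_descriptions):
    """De-identify rows, save them as data.csv, and write readme.txt beside it."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    clean_rows = deidentify_rows(rows)
    write_csv(clean_rows, out_dir / "data.csv")
    write_readme(out_dir / "readme.txt", title, "data.csv", column_descriptions)
    return clean_rows
```

For example, calling `prepare_dataset(rows, "deposit", "Pilot study", {"age": "Age in years"})` would write `deposit/data.csv` and `deposit/readme.txt`, ready to be reviewed before upload.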
When do I share my data?
The rule of thumb is: as soon as possible.
Consider relevant expectations such as data repository policies, record retention requirements, or journal policies.
NIH states that you must share your data when you publish your work or before your performance period ends, whichever comes first.
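The "whichever comes first" rule amounts to taking the earlier of two dates. A small sketch, assuming you track both dates yourself (the function name and the example dates are hypothetical, and an unknown date is represented as `None`):

```python
from datetime import date


def sharing_deadline(publication_date, performance_period_end):
    """Return the earlier of the two dates; None means a date is not yet known."""
    known = [d for d in (publication_date, performance_period_end) if d is not None]
    if not known:
        raise ValueError("at least one date must be known")
    return min(known)


# Publication precedes the end of the performance period, so it sets the deadline.
deadline = sharing_deadline(date(2025, 3, 1), date(2026, 6, 30))
```

If the work has not been published yet, passing `None` for the publication date leaves the end of the performance period as the deadline.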
Where do I share my data?
Share your data through the established repository in which you chose to deposit it, such as:
What tools are available for compliance purposes during my grant award period?
Storage, Backups, Security:
Generalized Storage:
Specialized Storage:
To choose an appropriate repository we recommend the following steps:
This flowchart aims to guide investigators in decisions about their data retention and sharing duties for Cornell University and NIH policy compliance.
NIH has grouped its repositories by funding Institute or Center to help researchers locate the public repositories available under a specific funder. The link below lists each repository with its Institute or Center, Repository Name, Description, Submission Policy, and How to Access the Data. For guidance on the best repository for your data, contact the Wood Library.
NIH-recommended generalist repositories
The NIH has endorsed nine generalist repositories that house data regardless of type, format, content, or subject matter. The list of NIH-recommended generalist repositories is available at https://www.nlm.nih.gov/NIHbmic/generalist_repositories.html.
For guidance on the best repository for your data, contact the Wood Library.
Other data repositories
Other resources to help researchers find the right repositories can be found on the Samuel J. Wood Library Data Preservation, Access and Associated Timeframes site or the Arizona University website under the Tools for Finding Repository section.
Who reviews the budget?
The Center for Scientific Review (CSR) will check DMSPs for completeness and viability. The Peer Review Committee (PRC) will assess the budget and the budget justification for feasibility. The PRC will not see the DMSP, and the DMSP will not affect scoring.
More information on budgeting for data management and sharing can be found here.
Where are the costs represented?
The costs must be included in Section F (Other Direct Costs) of the SF 424 R&R budget form, or in the PHS 398 form for Modular Budgets. There will be a new Budget Line Item labeled “Data Management and Sharing.” The costs must also be included in Section L of the Budget Justification.
What are the unallowable costs?
What are the allowable costs?
Allowable costs include any reasonable, justifiable costs required to comply with the DMSP.
Some examples are: