The WCM Institutional Data Repository for Research (WIDRR) is a tool recently developed to help WCM researchers archive their datasets in compliance with the new Cornell University Data Retention Policy. Under this policy, researchers should formulate a data retention request when they encounter one of three specific events. Researchers can archive their datasets either in an approved public repository or in WIDRR. In either case, they will need to report the location of their datasets in WIDRR.
Submitted by mawood on January 5, 2023 - 4:34pm
Selecting a Data Repository
A data repository is a place to archive research datasets and make them publicly available. To select an appropriate data repository, follow these steps:
Why Should I Share Data and Code?
- It is required by some journal publishers (e.g., Nature) and funding agencies (e.g., the National Institutes of Health and the National Science Foundation).
What do I need to do for compliance and institutional oversight?
NIH Compliance and Monitoring:
- You must document your compliance with your Data Management and Sharing Plan (DMSP) in your annual Research Performance Progress Report (RPPR). Non-compliance may result in NIH enforcement action such as:
  - Addition of special terms or conditions to the award
  - Termination of the award
- Non-compliance may also affect future funding decisions
- If you make changes to your current DMSP, your new plan must be approved by NIH. The approval process varies depending on whether the change is made pre-award or post-award.
Institutional Oversight:
- PIs will ultimately be responsible for ensuring the DMSP is executed
- The IRB will be responsible for ensuring that the sharing of data pertaining to human subjects is consistent between the DMSP and informed consent
- PIs will be responsible for ensuring Data Use Agreements are in place before sharing sensitive data
- Before sharing any data from the data core, data curators will ensure that the data have been de-identified, and will work with the PI and IRB to ensure that proper consents and permissions have been obtained to share the data
How do I share my data?
- Address the NIH’s goal of making data as accessible as possible. The NIH expects all sharable data to be made available, whether associated with a publication or not.
- All data used or generated as part of a grant must be managed, but not all data should be shared. You should not share data if doing so would violate privacy protections or applicable laws. If your data are not shareable, you must justify it when writing your DMSP.
- You may share human subjects-related data as long as your plan addresses how data sharing will be communicated in the consent process, and patients have given informed consent. See NIH sample consent language.
Before submitting your data to a repository, you will need to:
1. Bundle data together in logical groups for citation and reuse with assigned persistent identifiers (e.g., dataset DOIs)
2. De-identify your data, if appropriate
3. Convert your data to an open, machine-readable file format, such as .csv, when possible
4. Use data and metadata standards if appropriate to your field. Fairsharing.org is a database of such standards.
5. Document the dataset in a separate readme.txt file, and/or create metadata required by your chosen repository or discipline. Refer to the Data Documentation and Metadata Page for more.
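Steps 2, 3, and 5 above can be partly scripted. The following is a minimal sketch, not a prescribed workflow: the function name `prepare_deposit`, the output file name `data.csv`, and the `DIRECT_IDENTIFIERS` column names are all hypothetical, and dropping a few named columns is only a first pass at de-identification, not a complete one.

```python
import csv
from pathlib import Path

# Hypothetical column names treated as direct identifiers; adjust for your data.
DIRECT_IDENTIFIERS = {"name", "mrn", "date_of_birth"}


def prepare_deposit(records, out_dir, title, description):
    """Write records to data.csv and describe them in readme.txt.

    records: non-empty list of dicts sharing the same keys (one dict per row).
    Returns the list of column names that were kept.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)

    # Step 2 (partial): drop columns flagged as direct identifiers.
    columns = [c for c in records[0] if c.lower() not in DIRECT_IDENTIFIERS]

    # Step 3: write an open, machine-readable CSV file.
    with open(out_path / "data.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=columns, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(records)

    # Step 5: document the dataset in a separate readme.txt file.
    lines = [
        f"Title: {title}",
        f"Description: {description}",
        f"Rows: {len(records)}",
        "Columns:",
    ]
    lines += [f"  - {c}" for c in columns]
    (out_path / "readme.txt").write_text("\n".join(lines) + "\n", encoding="utf-8")
    return columns
```

A call such as `prepare_deposit(rows, "deposit", "Trial A", "Dose-response measurements")` would leave a `deposit` folder containing both files. Your chosen repository may require additional metadata beyond this readme.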
When do I share my data?
The rule of thumb is: as soon as possible.
Consider relevant expectations such as data repository policies, record retention requirements, or journal policies.
NIH states that you must share your data when you publish your work or before your performance period ends, whichever comes first.
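The "whichever comes first" rule can be written as a one-line date comparison. This is an illustrative sketch only; the function name `sharing_deadline` and its parameters are hypothetical, and it assumes that a project with no publication yet is bound by the end of the performance period.

```python
from datetime import date
from typing import Optional


def sharing_deadline(performance_period_end: date,
                     publication_date: Optional[date] = None) -> date:
    """Return the earlier of the publication date and the performance period end."""
    if publication_date is None:
        # Nothing published yet: the end of the performance period applies.
        return performance_period_end
    return min(publication_date, performance_period_end)
```

For example, with a performance period ending on June 30, 2025 and a paper published on March 1, 2024, the function returns March 1, 2024.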
Where do I share my data?
You share via the same established data repositories in which you chose to deposit your data, such as:
What tools are available for compliance purposes during my grant award period?
Storage, Backups, Security:
Generalized Storage:
Specialized Storage:
To choose an appropriate repository, we recommend the following steps:

This flowchart aims to guide investigators in decisions about their data retention and sharing duties for Cornell University and NIH policy compliance.

NIH has grouped its repositories by funding Institute or Center to help researchers locate the public repositories available under each one. The link below shows lists of repositories that include the Institute or Center, Repository Name, Description, Submission Policy, and How to Access the Data. For guidance on the best repository for your data, contact the Wood Library.
NIH-recommended generalist repositories
The NIH has endorsed nine generalist repositories that house data regardless of type, format, content, or subject matter. The NIH-recommended generalist repositories are available through this link: https://www.nlm.nih.gov/NIHbmic/generalist_repositories.html.
For guidance on the best repository for your data, contact the Wood Library.
Other data repositories
Other resources to help researchers find the right repositories can be found on the Samuel J. Wood Library Data Preservation, Access and Associated Timeframes site or the Arizona University website under the Tools for Finding Repository section.