NIDDK DMS Tools & Resources
NIDDK has created the tools and examples below to assist investigators in developing their Data Management and Sharing (DMS) Plan.
In this section:
- DMS Plan Worksheet
- DMS Plan Examples
- Data and Metadata Standards
- Selecting a Data Repository
- NIDDK DMS Webinar Series
- Frequently Asked Questions (FAQ)
- Glossary of DMS Terms
NIDDK has outlined its expectations for DMS Plans. The DMS Plan Worksheet was developed to assist investigators in drafting a DMS Plan by including NIDDK-specific guidance for the National Institutes of Health (NIH) DMS Plan optional format page (DOC, 35.4 KB).
NIDDK DMS Plan Worksheet (DOC, 39 KB)
The example DMS Plans provided below are consistent with NIDDK’s guidance and with NIH policy (NOT-OD-21-013 and NOT-OD-21-014). They illustrate the required information and level of detail that should be included for common data types and research designs.
Researchers are encouraged to review the NIDDK-specific example plans below and adapt the concepts to their own research, rather than using the plans as a template.
- Genomic Data from Human Research Participants Example (PDF, 141.48 KB)
- Clinical Data from Human Research Participants Example (PDF, 151.85 KB)
- Basic Research from a Non-Human Source Example (PDF, 120.93 KB)
- Secondary Data Analysis Example (PDF, 112.71 KB)
NIDDK held a webinar that provided an overview of successful approaches to Writing a DMS Plan.
After reviewing the above resources, investigators who have additional questions while drafting their DMS Plan should contact an NIDDK Program Officer.
Details about planned data and metadata should be provided in the DMS Plan.
The Data and Metadata Standards Examples for DMS Plans provides examples of data and metadata standards for common data types generated by NIDDK investigators. It also includes additional metadata for select repositories and guidance on how to incorporate data and metadata standards into developing your DMS Plan. These examples serve as a starting point and are not exhaustive.
Data and Metadata Standards Examples for DMS Plans (PDF, 190 KB)
Additional examples for incorporating metadata information in DMS plans for different data types can be found in NIDDK DMS Plan Examples and the Sample Plans provided by NIH.
Information on data and metadata standards was presented during the NIDDK DMS webinar, “Metadata and Data Standards for NIDDK Research Data”. This webinar included:
- an overview of metadata and data standards;
- context for evaluating and finding appropriate and relevant research-specific needs; and
- the role of metadata and data standards in data comparison.
Repositories & Data and Metadata Standards
Repositories may have specific requirements for data or metadata standards. Many repositories have established metadata fields (required and optional) and may have templates for depositing data. Reviewing the requirements when developing the DMS Plan and prior to collecting data is critical to ensure that the required format, fields, and any preferred terminology are used to avoid issues that could prevent subsequent deposit and data sharing.
Using an appropriate data repository generally improves the FAIRness (Findability, Accessibility, Interoperability, and Reusability) of the data. Selection of an appropriate data repository is essential to maximize data sharing. NIDDK affirms the desired repository characteristics established by NIH, and strongly encourages the use of existing repositories to the extent possible for preserving and sharing scientific data.
Investigators need to consider the type of data they will be submitting when selecting a repository. A short justification of the repository selected for each data type must be included.
NIDDK strongly encourages investigators to consider the factors below in order when selecting a repository:
- NOFO-specified repository(ies).
- Organism, domain, or data type-specific repositories.
- Whether controlled access to data is required (e.g., for protection of human subjects’ privacy).
Repository Selection Aids
- The Repository Selection Considerations Tool (NIDDK) (PDF, 293.45 KB) is intended to help investigators align the data types to be generated with appropriate repositories for submission and sharing.
- dkNET lists available data repositories used by NIDDK-supported researchers and provides tools to help comply with data sharing requirements.
- NIH Repositories for Sharing Scientific Data lists NIH-supported repositories and generalist repositories for a wide range of data types and disciplines.
- The NIDDK DMS webinar “Finding a Repository for Your Data” provided additional information about repositories relevant to NIDDK research. The webinar videos linked below provide:
- an overview of the tools and resources available on the NIDDK Information Network (dkNET), in Dr. Grethe’s presentation.
- data eligibility and acceptance criteria as summarized in a presentation by Dr. Rodriguez on the NIDDK Central Repository.
- how generalist repositories can be leveraged when a domain or data-type specific repository is not available, as reviewed in the presentation by Mr. Chandramouliswaran.
NIDDK hosted a webinar series to provide education and outreach to the NIDDK scientific community about data management and sharing.
Writing a DMS Plan
This webinar provided an overview of DMS plan content and highlighted institutional, NIH, and NIDDK resources available to investigators throughout the research study life cycle. The role of research librarians as a resource to help in planning and execution of the DMS plan was emphasized.
Webinar Materials
Presentations
- Data Management and Sharing (DMS) Overview (Slides (PDF, 2.02 MB) )
Jeran Stratford, Ph.D., RTI International
Introduction to the DMS policy, DMS plan content, and NIDDK-specific guidance. Overview of resources available to NIDDK researchers writing a DMS plan.
- Help from a Research Data Management Librarian (Slides (PDF, 737.25 KB) )
Lisa Federer, Ph.D., National Library of Medicine
Overview of research librarian skills, training, and knowledge, and how research librarians are valuable institutional resources for investigators drafting DMS plans.
- The UCSF Data Science Initiative—An Institutional Hub for DMS Support (Slides (PDF, 932.04 KB) )
Ariel Deardorff, M.L.I.S., University of California, San Francisco (UCSF)
A practical example of how data librarians are adapting to support investigators and institutions implementing the new NIH DMS policy. An overview of the resources and services UCSF data librarians are providing to investigators was discussed.
Finding a Repository for Your Data
This webinar provided in-depth information on selecting appropriate repositories for scientific data. A brief overview of NIDDK DMS resources and tools for identifying repositories, as well as resources available through dkNET (NIDDK Information Network) were highlighted. Information about the NIDDK Central Repository and the Generalist Repository Ecosystems Initiative (GREI) rounded out the session.
Webinar Materials
Presentations
- Finding a Repository for NIDDK Study Data Sets (Slides (PDF, 1.18 MB) )
Jeran Stratford, Ph.D., RTI International
NIH criteria for selecting a data repository, different types of data repositories, and NIH and NIDDK resources for selecting an appropriate repository for research data were presented. Repository features to consider when choosing a repository, including access control and use of persistent identifiers, were also discussed.
- dkNET: Connecting Researchers to Resources (Slides (PDF, 1.12 MB) )
Jeffrey Grethe, Ph.D., University of California, San Diego
The dkNET suite of tools and resources available to investigators was discussed, including a searchable list of DK-relevant repositories and a repository wizard in development.
- Submitting Resources to the NIDDK Central Repository (Slides (PDF, 2.25 MB) )
Rebecca Rodriguez, Ph.D., NIDDK
The NIDDK Central Repository (NIDDK-CR) is one potential repository for NIDDK research data. Policies related to eligibility for submitting data to the NIDDK-CR, as well as the types of data accepted, were reviewed. Important features of the NIDDK-CR that make it a FAIR and trustworthy (according to the CoreTrustSeal parameters) resource were also covered.
- NIH Generalist Repository Ecosystem Initiative (GREI) (Slides (PDF, 1.32 MB) )
Ishwar Chandramouliswaran, M.S., M.B.A., NIH Office of Data Science Strategy
GREI comprises several generalist repositories that operate in important niches in the research data ecosystem and fill certain gaps and needs of investigators in making their data public. The presentation reviewed the GREI mission and how it is meeting the needs of data generators.
Metadata and Data Standards for NIDDK Research Data
This webinar reviewed the importance and utility of metadata and data standards in maximizing scientific data value, and their intersections with the NIH DMS policy. The webinar covered the use of metadata and data standards and lessons learned from NIDDK and NIH projects that use standards to enhance data quality and access.
Webinar Materials
- Meeting page
- Meeting summary (PDF, 204.43 KB)
- Webinar recording
Presentations
- Metadata and Data Standards - What and Why (Slides (PDF, 1.27 MB) )
Matthew Schu, Ph.D., RTI International
An introduction to metadata and data standards including examples from the Nutrition for Precision Health and All of Us programs.
- Establishing Data Structure to Increase Accessibility (Slides (PDF, 2.81 MB) )
M. Todd Valerius, Ph.D., Brigham and Women's Hospital, Harvard University
Case study using the ATLAS-D2K program on how complex data is brought together using data harmonization steps including standards and controlled terminologies to enable accessibility.
- Role of Data Standards in Quality and Harmonization (Slides (PDF, 4.26 MB) )
Sanjay Jain, Ph.D., Washington University in St. Louis
An example using the Kidney Precision Medicine Project on how data standards enable integration of assay data across many different study types and institutions, including the role of quality and standardization pipelines. Collaborative tools enabling discovery are mentioned.
- Implementation of Data and Metadata Standards – The Added Value (Slides (PDF, 2.16 MB) )
Kenneth Young, Ph.D., University of South Florida
The Environmental Determinants of Diabetes in the Young (TEDDY) study is a multi-center, multi-national effort that relies on data standards to enable sharing and insight. This presentation uses TEDDY as a case study to show how data standards and data dictionaries facilitate data sharing.
The ‘R’ in FAIR: Data Reuse
This webinar looked at secondary data use, also known as data reuse. Discussion focused on what data contributors can do to increase the reusability of their data and how secondary use of shared data can advance scientific knowledge. Examples of tools researchers can use at the generalist repository Figshare to find data were also presented.
Webinar Materials
- Meeting page
- Meeting summary (PDF, 166.47 KB)
- Webinar recording
Presentations
- Advancing knowledge through secondary data use (Slides (PDF, 4.45 MB) )
Vivian Ota Wang, Ph.D., NIH Office of Data Science Strategy
An overview of the role that open data and data sharing play in promoting innovation and scientific advancement, challenges to overcome in reusing data, and some of the ethical, economic, legal, and social implications of data reuse.
- Best practices for secondary data use (Slides (PDF, 3.34 MB) )
Harold Lehmann, MD, Ph.D., Johns Hopkins University
Reviewed some of the methodological considerations for secondary data use and the potential value of leveraging larger studies and cohorts. Discussion included the role of computational tools and resources in enabling this larger scaled research and in research communication.
- Generalist repositories for sharing and finding data (Slides (PDF, 2 MB) )
Ana Van Gulick, Ph.D., Figshare
Figshare is a generalist repository that enables the sharing of data in a way that supports FAIR principles. A review of how sharing practices help data reusability, including findability, is presented along with resources to aid investigators in developing their data management and sharing plans.
The following FAQs are intended to help clarify certain considerations for the NIDDK-specific implementation of the 2023 NIH DMS Policy.
Managing and Sharing Scientific Data
How does the DMS policy intersect with the Genomic Data Sharing Policy?
Genomic Data Sharing (GDS) requirements (NOT-OD-22-198) may apply simultaneously with the new DMS policy. Investigators should address GDS requirements within the DMS Plan submitted with the application. Additional information about general GDS policy is provided on the NIH Scientific Data Sharing FAQ on GDS.
What are the eligibility criteria for depositing data in repositories?
Eligibility criteria to deposit data vary by repository. When selecting a repository, investigators should contact it to confirm that they are eligible to submit data and that their data type(s) are accepted.
Repository Selection Considerations Tool (NIDDK) (PDF, 293.45 KB)
Does a contract or proof of agreement with the repository need to be included in the DMS Plan?
No.
What do I need to consider if my proposed research project will involve Artificial Intelligence (AI) technologies?
NIH Office of Science Policy has issued a resource to assist the research community in understanding how NIH policies guide artificial intelligence (AI)-related research. The purpose of the resource is to illustrate the applicability of existing policies and guidance to research involving AI technologies. The resource can be accessed at: https://osp.od.nih.gov/policies/artificial-intelligence.
Updates to DMS Plans
Is it possible to update the DMS Plan during the course of the award?
Yes. Investigators who need to change any element(s) of the DMS Plan during the course of the award should work proactively with their Program Officer through their institutional official (or with the Scientific Director’s office for intramural investigators) to obtain review and approval of the modifications. Changes that may require an update to the DMS Plan include, but are not limited to, the following:
- The type(s) of data generated change(s).
- A different data repository(ies) is(are) chosen for submission.
- The sharing timeline changes.
How often should DMS Plan progress be reported?
- Investigators should include an update on progress made towards fulfillment of the DMS activities in the progress report(s), including the final progress report.
- NIH will issue new DMS Research Performance Progress Report questions that align with the NIH Final Policy on Data Management and Sharing, to include updates on the status of data sharing, repositories and unique identifiers for data that have been shared. See NOT-OD-24-123 for up-to-date instructions and forms (when available).
The NIH FAQs provide additional information about the DMS policy agency-wide. Please check back regularly as new content will be added.
While not comprehensive, the glossary provides definitions for selected terms related to the 2023 DMS policy that might be unfamiliar or require content-specific definitions. Definitions for additional common data management terminology are available from the Digital Curation Centre and other academic or institutional resources.
Term | Definition |
---|---|
Code | In the context of data management, this may include computer code or scripts used in the collection, manipulation, processing, analysis, or visualization of data but may also include software developed for other purposes. |
Controlled Access | Data that are made available under stringent, secure conditions. Typically applies to confidential or sensitive data. |
De-identified data | Health information that does not identify an individual and where there is no reasonable basis to establish that the information can be used to identify an individual. De-identification mitigates privacy risks to individuals, supporting the secondary use of data. |
FAIR Principles | Acronym for four key qualities of managing digital assets: Ensuring that they are "Findable, Accessible, Interoperable, and Reusable." Originally published in "The FAIR Guiding Principles For Scientific Data Management and Stewardship" in Scientific Data (2016). |
License | In the context of data management, a legal instrument that governs the terms of use of a data set. |
Machine-readable data | Structured data that can be easily processed by a computer. Making data machine readable often requires cleaning and preprocessing raw research data. |
Metadata | Documentation or information about a data set. It may be embedded in the data itself or exist separately from the data. Metadata may describe the ownership, purpose, methods, organization, and conditions for use of data, technical information about the data, and other information. Many metadata standards exist across a broad range of disciplines and applications. |
Open access | Freely available material that has few or no copyright or licensing restrictions. |
Persistent Unique Identifier (PID) | A string of letters and numbers used to distinguish between and locate different objects, people, or concepts. Persistent identifiers support interoperability across different platforms and provide a reliable way to track citations and reuse. This identifier can be used to link one or more datasets belonging to the same study, which may be stored in multiple locations or repositories. Examples of PIDs include Digital Object Identifiers (DOIs), ORCID IDs, GUIDs, Handles, and Archival Resource Keys (ARKs). |
Repository | A facility that manages the appraisal of, preservation of, and accessibility to materials on a long-term or permanent basis. An Institutional Repository typically contains content produced by the institution that hosts the service. |
The information will be updated as additional policy or guidelines are established and as new resources are released.