Human Participant Data Considerations | Data Management and Sharing | Washington University in St. Louis
Data provided by human participants are subject to a variety of regulations and policies. Several offices at Washington University have oversight in human participant research. Below is a list of considerations for human participant data related to data management and sharing and resources for each consideration. All human participant research is subject to review and approval by the Institutional Review Board (IRB), which works closely with the Human Research Protection Office (HRPO).
On this page:
Informed Consent Forms and DMS Plans
HIPAA
De-Identification
Restricted Access Data Sharing
Data Use Agreements
NIH Guidance
Informed Consent Forms
If a project will collect data from human participants, informed consent must be obtained from each person who volunteers to provide data for the study. The IRB would rarely approve sharing data from human participants who did not provide consent. A written Informed Consent Form (ICF) must be developed and approved by the Washington University IRB or another IRB (e.g., another institution’s IRB for a multi-site study with a single IRB, or a commercial IRB if working with industry collaborators). The ICF should include a description of data management and sharing that is consistent with the Data Management and Sharing Plan (DMSP). At the time of IRB submission, the consent form may be evaluated for consistency with the DMSP, depending on the IRB’s procedures.
Informed Consent Forms and DMS Plans
It is important to ensure the language in the ICF is consistent with the DMSP. The consent form does not need to include every detail of the DMSP; a summary describing how the data will be collected, used and shared is sufficient. If the consent language is not consistent with the DMSP, participants may need to be re-consented with modified language. For example, if the DMSP states that data will be shared publicly in a data repository but the ICF states, “data will be shared with other researchers at Washington University,” the data cannot be shared publicly in a repository unless participants are re-consented. In cases of inconsistency between the DMSP and the consent form, it may be necessary to work with the funding agency to modify the DMSP and obtain approval of the revised DMSP. Otherwise, if participants are not re-consented and data are not shared, this would be considered noncompliance with an approved DMSP and could lead to withheld funds or the loss of future funding opportunities.
Which to write first? ICF or DMS Plan?
In the majority of cases, the DMSP will be created prior to the IRB approval of the ICF. The IRB does not review projects at the time of grant submission. Upon a Just-In-Time notice, an IRB submission with the ICF will be accepted. The DMSP should be included in the IRB submission.
When the DMSP is prepared prior to IRB approval of the ICF and you have questions about consent language, it is strongly recommended that you consult with HRPO/IRB, especially if this is the first time you will submit a DMSP to a funder. The feasibility of the DMSP with respect to IRB approval and consent should be considered. It is recommended that you consult with HRPO/IRB prior to submitting your DMSP if you propose to:
Provide open access to the data.
Share data that were collected without consent.
Share data with identifiers.
An ICF may be prepared first if pilot data are needed for a grant submission. In this scenario, a protocol and ICF would be prepared and submitted to the IRB for approval, followed by enrolling participants and collecting data for use in the grant. The ICF under which the data were collected should be taken into consideration when developing the DMSP.
WashU HRPO and IRB HELP Services
WashU HRPO and the IRB offer HRPO Help Services to provide guidance on language included in the ICF related to human participant data. Please review the WashU IRB policy document and contact them if you have further questions using the links below.
SWAT On-Call Service: 314-747-6800
Virtual Office Hours
IRB Consultation Request Form
Protocol and Consent Form for Data Sharing
The Washington University One Protocol One Consent is a standardized genomic protocol and consent that incorporates best-practices language for data sharing and permits the linkage of a participant’s or patient’s genetic data to the research copy of their electronic health record. This protocol and consent form can be used as an addition to an investigator’s clinical or research projects.
Institutional Certifications
Investigators working with large-scale human genomic data are required to submit an Institutional Certification to NIH. Learn about this important document and how to prepare it.
About Institutional Certifications
HIPAA
Any researcher collecting data from human participants is required to complete HIPAA training outlined by the Washington University HIPAA Privacy Office. Following training, researchers must follow the Policies and Procedures created by the HIPAA Privacy Office in order to maintain research participant privacy, security and rights. Researchers must familiarize themselves with the 18 HIPAA identifiers and how to prevent the disclosure of these identifiers.
De-Identification
Human participant data must be properly de-identified prior to sharing to ensure individual participant privacy. De-identification should be thorough enough that individuals in the dataset are not at risk of being re-identified, to the point that participants cannot identify their own records. It is also important to consider the population under study. For example, the population of individuals with a rare disease could be so small that the data would be considered identifiable regardless of which identifiers or other information are removed. Imaging data may also be challenging to de-identify. In these cases, data sharing may not be possible if data cannot be fully de-identified. Contact the IRB using the resources described above when creating your DMSP if you are unsure whether your data can be fully de-identified and shared.
Removing 18 HIPAA Identifiers
The most basic step for de-identification is removing the 18 HIPAA identifiers from a dataset. Ideally, stripping the dataset of the 18 HIPAA identifiers is straightforward; however, complete removal of certain data can reduce the utility of the dataset. For certain types of data, there are techniques that decrease the risk of identification without completely removing the data. For example, dates (such as clinic or lab visits and follow-up survey completion) and unique identifying numbers (e.g., MRN, record ID, study ID, patient ID) are included in the list of HIPAA identifiers. Date Shifting and Record Hashing are methods that reduce the risk of re-identification from dates and unique identifiers while maintaining the utility of the information.
Date Shifting involves software that uses an algorithm to randomly shift dates by a value between 0 and 364 days. The duration of time between dates within the project is maintained (e.g., the amount of time between a baseline and follow-up visit); however, the actual dates on which the events occurred are removed.
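As a rough illustration of the idea (not the implementation used by REDCap or any other specific tool), the Python sketch below assigns each record a single random offset between 0 and 364 days and subtracts it from every date in that record, so intervals within a record are preserved. The function names, the backward direction of the shift, and the sample visits are all assumptions made for this example.

```python
import secrets
from datetime import date, timedelta


def assign_offsets(record_ids, max_shift=364):
    """Pick one random offset (0..max_shift days) per record.

    Hypothetical helper: the offsets themselves are sensitive and would
    need to be stored securely, apart from the shared data.
    """
    return {rid: secrets.randbelow(max_shift + 1) for rid in record_ids}


def shift_visits(visits, offsets):
    """Shift each (record_id, date) pair back by its record's offset."""
    return [(rid, d - timedelta(days=offsets[rid])) for rid, d in visits]


if __name__ == "__main__":
    # Made-up sample data: two visits for one record, one for another.
    visits = [
        ("A1", date(2023, 1, 10)),  # baseline
        ("A1", date(2023, 4, 10)),  # follow-up
        ("B2", date(2023, 2, 1)),
    ]
    offsets = assign_offsets({rid for rid, _ in visits})
    shifted = shift_visits(visits, offsets)
    # The baseline-to-follow-up gap for A1 is unchanged by the shift.
    print((shifted[1][1] - shifted[0][1]).days)  # prints 90
```

Using one offset per record, rather than one per date, is what keeps the time between a participant's visits intact while hiding the true calendar dates.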
Record Hashing involves software converting unique identifiers, such as a record ID, to an unrecognizable value. This allows datasets to be shared with a record ID that is different from, and cannot be traced back to, the record ID used internally by the study team. It is important that a Record ID “Crosswalk File” is created at the time of record hashing. The Crosswalk File links the original record ID to the newly created hashed value. This file should be stored by the study team in a secure location that is separate from any study data.
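The sketch below shows one way record hashing and a crosswalk file could fit together in Python. It is an illustration under stated assumptions rather than the method used by any particular tool: the keyed hash (HMAC-SHA-256), the 16-character truncation, the column names, and the function names are all choices made for this example.

```python
import csv
import hashlib
import hmac
import io


def hash_record_id(record_id, key):
    """Return a keyed hash of the record ID, shortened for readability.

    Assumption: `key` is a secret held only by the study team.
    """
    digest = hmac.new(key, record_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def build_crosswalk(record_ids, key):
    """Map each original record ID to its hashed replacement."""
    return {rid: hash_record_id(rid, key) for rid in record_ids}


def write_crosswalk(crosswalk, handle):
    """Write the crosswalk as CSV; store it apart from the study data."""
    writer = csv.writer(handle)
    writer.writerow(["record_id", "hashed_id"])
    for rid, hashed in crosswalk.items():
        writer.writerow([rid, hashed])


if __name__ == "__main__":
    key = b"example-key-only"  # hypothetical; never hard-code a real key
    crosswalk = build_crosswalk(["1001", "1002", "1003"], key)
    buffer = io.StringIO()  # stands in for a file in a secure location
    write_crosswalk(crosswalk, buffer)
    print(buffer.getvalue())
```

A keyed hash is used here because short, sequential record IDs can be recovered from an unkeyed hash simply by hashing every plausible ID and comparing the results.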
Depending on the nature of the data, removing the 18 identifiers may be sufficient for de-identification. However, additional steps are often required. In addition to the 18 HIPAA identifiers, datasets can contain indirect or quasi-identifiers. Indirect and quasi-identifiers are pieces of information that can be combined with each other or with external information (e.g., government databases, social media profiles) to re-identify an individual. Examples include location, salary, occupation, race, ethnicity, disease status, veteran status, pregnancy status, or any values that are outliers relative to the rest of the study sample or population (e.g., ages over 89 are considered identifiers and should be reported as a range, such as “90 or older”). As noted above, the size of the population under study and the type of data (such as imaging) also affect whether a dataset can be de-identified.
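To make the idea of quasi-identifiers concrete, the Python sketch below does two things: it reports high ages as a single top category, and it lists combinations of quasi-identifier values shared by only a few records, which are the rows most exposed to re-identification. The threshold of 5, the cutoff label, the column names, and the sample rows are assumptions for illustration only; appropriate thresholds depend on the study and should be confirmed with the IRB.

```python
from collections import Counter


def top_code_age(age, cap=90):
    """Report ages at or above `cap` as a single category, e.g. '90+'."""
    return f"{cap}+" if age >= cap else str(age)


def small_cells(rows, quasi_identifiers, k=5):
    """Return quasi-identifier combinations shared by fewer than k rows.

    `k` is a hypothetical threshold chosen for this example.
    """
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return {combo: n for combo, n in counts.items() if n < k}


if __name__ == "__main__":
    # Made-up rows; a real dataset would have many more records.
    rows = [
        {"age": top_code_age(93), "occupation": "teacher", "zip3": "631"},
        {"age": top_code_age(41), "occupation": "nurse", "zip3": "631"},
        {"age": top_code_age(41), "occupation": "nurse", "zip3": "631"},
    ]
    # With k=2, only the single 90+ teacher is flagged as a small cell.
    print(small_cells(rows, ["age", "occupation", "zip3"], k=2))
```

Combinations flagged this way can then be handled by grouping values into broader categories or suppressing them before the data are shared.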
REDCap De-Identification
REDCap offers built-in features for removing identifiers, date shifting and record hashing. To learn more, open the REDCap De-Identification Tutorial and watch the REDCap in a Flash webinar, Preparing De-Identified Data Exports.
Additional De-Identification Resources
SAS Based Approach to De-Identification, Jack Shostak, Duke Clinical Research Institute (DCRI), Durham, NC (includes useful source code)
Johns Hopkins Resource for De-identification
Data Curation Network Data Primers
Consent Form Primer
Human Participant Data Essentials Primer
National Institute of Standards and Technology Tools for de-identification
Department of Health and Human Services Guidance Regarding Methods for De-identification of Protected Health Information in Accordance with the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule
Restricted Access Data Sharing
Another approach to protecting patient privacy is sharing data in a repository that allows restricted access.
Restricted Access data sharing involves submitting a dataset to a repository without making the dataset available for public use. Potential users are prompted to complete a form with information about their credentials and proposed use of the data before receiving access. Once granted access, the user is subject to a licensing agreement or may be required to enter into a formal data use agreement.
Data Use Agreements
Data Use Agreements are formal contracts between the party that generated the data and the party re-using the data, with strict language regarding how the data can be used, preserved, and destroyed.
Joint Research Office for Contracts (JROC)
Data Use Agreement Intake Form
NIH Guidance
Informed Consent
The NIH has released guidance on Informed Consent for Secondary Research with Data and Biospecimens. In addition, some Funding Opportunity Announcements (FOAs) or other grant awards state that awardees are required to submit data generated from the award to a domain-specific repository. The FOA may list a single domain-specific repository or provide a few repositories from which to choose. For example, the NIH has several domain-specific repositories. Some repositories also provide guidance for developing informed consent language. Below are two examples:
NIMH Data Archive (NDA): Crafting Informed Consent Language
OpenNeuro recommended: Open Brain Consent Ultimate Consent Form
Protecting Privacy when Sharing Human Participant Data
To address the concerns about protecting privacy when sharing human research participant data, NIH released a notice (NOT-OD-22-213).
Supplemental Information to the NIH Policy for Data Management and Sharing: Protecting Privacy When Sharing Human Research Participant Data (NOT-OD-22-213)