REDCap Data Management

Plan your Data

Formally map every piece of data you collect to your analytic plan and/or reporting requirements. Ensure that the data collection will supply all required data.
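The mapping described above can be checked mechanically. The following is a minimal sketch, not a prescribed method: the analysis names, variable names, and the `check_coverage` helper are all hypothetical and only illustrate comparing the analytic plan against the fields you intend to collect.

```python
# Hypothetical analyses and variable names, for illustration only.
analytic_plan = {
    "primary outcome": ["hba1c_baseline", "hba1c_12mo"],
    "covariates": ["age", "sex", "bmi"],
}
collected_fields = ["age", "sex", "hba1c_baseline", "favourite_colour"]


def check_coverage(plan, fields):
    """Return (missing, unused): plan variables that are not collected,
    and collected fields that no analysis uses."""
    required = {name for names in plan.values() for name in names}
    collected = set(fields)
    return sorted(required - collected), sorted(collected - required)


missing, unused = check_coverage(analytic_plan, collected_fields)
print("Required but not collected:", missing)
print("Collected but not in the plan:", unused)
```

Variables in the first list point to gaps in the data collection; fields in the second list are candidates for removal unless a reporting requirement justifies them.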

Consider your data in terms of the number of variables you are trying to capture. Collect only as much data as you need to prove or disprove your hypothesis. Be aware of common data collection mistakes:

  • Collecting more data than necessary makes the data tedious to sift through and may add little meaningful insight.
  • Managing too many data elements makes it easier to overlook critical errors in the data.

Recommendation: Consult with a statistical consultant well before you implement your data collection plan. Statistical consultants can provide a “fresh set of eyes,” identifying problems that the Principal Investigator may have overlooked. Additionally, statistical consultants can propose alternative approaches that can vastly improve the power and quality of your analysis.

Describe the Input Data

Always use the Field Label option to describe the data you intend to capture in a data capture field. Field Labels ensure that end-users understand the data, the units of measurement, and the format of each field. Use Field Notes mainly to supplement the Field Label. For example, a Field Note can tell the end-user the format of a validated data collection field or the units of measurement.

Keep a Code Book

Create a code book that describes each variable by name according to the following criteria:

  • type of data – numeric, date/time, character
  • units of measurement – grams, feet, micro-grams per deciliter
  • purpose of collecting the data and its relationship to other data

Tip: Use the REDCap data dictionary as a starting point for your code book.
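As one way to act on this tip, the sketch below turns a data dictionary CSV into draft code book entries. It is illustrative only: the column headers are assumed to match a typical REDCap data dictionary export and should be checked against your own file, and the sample rows, the `data_type` mapping, and the `build_code_book` helper are hypothetical.

```python
import csv
import io

# A tiny stand-in for a REDCap data dictionary export. The column headers are
# assumed to match a typical export; verify them against your own file.
SAMPLE_DICTIONARY = """\
Variable / Field Name,Field Type,Field Label,Text Validation Type OR Show Slider Number,Field Note
weight,text,Weight,number,In kilograms
dx_date,text,Date of diagnosis,date_ymd,YYYY-MM-DD
comments,notes,Additional comments,,
"""


def data_type(field_type, validation):
    """Map REDCap field settings to the code book's three broad data types."""
    if field_type == "calc" or validation in ("number", "integer"):
        return "numeric"
    if validation.startswith(("date", "time")):
        return "date/time"
    return "character"


def build_code_book(dictionary_file):
    """Return one draft code book entry (a dict) per variable."""
    entries = []
    for row in csv.DictReader(dictionary_file):
        validation = row["Text Validation Type OR Show Slider Number"]
        entries.append({
            "variable": row["Variable / Field Name"],
            "type": data_type(row["Field Type"], validation),
            "label": row["Field Label"],
            # Copied from the Field Note; confirm units and format by hand.
            "units_or_format": row["Field Note"],
            # Not stored in the dictionary; the study team must write this.
            "purpose": "",
        })
    return entries


code_book = build_code_book(io.StringIO(SAMPLE_DICTIONARY))
for entry in code_book:
    print(entry)
```

The purpose of each variable and its relationship to other data are left blank on purpose, because the data dictionary does not record them.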

Use the REDCap Identifiers function

Refer to “Storing Personally Identifying Information in REDCap Projects” in the University of Toronto REDCap Storage Best Practices companion document for guidance on using REDCap’s identifiers to store personally identifying information in the REDCap database.

Avoid Numbering Fields

Do not manually number questions within Field Labels. For example, avoid:
“1. When did you receive your diagnosis?”

Manual numbering conflicts with branching logic: when a numbered field is hidden, the numbers that remain visible skip. Additionally, moving or deleting fields disrupts the numbering sequence, which breaks the format of your survey.
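Existing projects can be screened for manually numbered labels. The following is a rough sketch under stated assumptions: the field names and labels are hypothetical, and the pattern only catches a leading number followed by "." or ")", so other numbering styles would need their own patterns.

```python
import re

# Matches labels that start with a manual number such as "1. ", "2) " or "3a. ".
NUMBERED_LABEL = re.compile(r"^\s*\d+[a-z]?\s*[.)]\s+")


def find_numbered_labels(labels):
    """Return the (field name, label) pairs whose label starts with a manual number."""
    return [(name, label) for name, label in labels.items()
            if NUMBERED_LABEL.match(label)]


# Hypothetical field names and labels.
field_labels = {
    "dx_date": "1. When did you receive your diagnosis?",
    "dx_site": "Where were you diagnosed?",
    "age": "2) Age in years",
    "year": "2019 cohort: year of enrolment",
}

for name, label in find_numbered_labels(field_labels):
    print(f"{name}: {label}")
```

Labels that merely begin with a number, such as a year, are not flagged because the pattern requires "." or ")" and a following space.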

Use Branching Logic to check data

  • Consider using branching logic to reduce the amount of time spent verifying missing data.
  • Only use branching logic if you have a thorough understanding of its functionality.
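To illustrate why branching logic reduces verification work, the sketch below separates blanks in fields that were hidden from blanks in fields that were actually shown. It is a simplified illustration, not a REDCap feature: the field names and records are hypothetical, and each rule is a hand-written Python predicate mirroring one branching logic expression rather than a parser for REDCap syntax.

```python
# Each rule mirrors one branching logic expression by hand. For example, the
# hypothetical REDCap expression  [diabetes] = '1'  on the field dx_date
# becomes the predicate below. This does not parse REDCap syntax.
SHOWN_WHEN = {
    "dx_date": lambda record: record.get("diabetes") == "1",
}


def truly_missing(record, fields):
    """Return the blank fields that were actually shown to the respondent."""
    missing = []
    for field in fields:
        # Fields without a rule are always shown.
        is_shown = SHOWN_WHEN.get(field, lambda _record: True)(record)
        if is_shown and record.get(field, "") == "":
            missing.append(field)
    return missing


fields = ["diabetes", "dx_date", "weight"]
records = [
    # dx_date is hidden here, so its blank value is not a missing value.
    {"record_id": "1", "diabetes": "0", "dx_date": "", "weight": "70"},
    # dx_date is shown here, so both blanks need follow-up.
    {"record_id": "2", "diabetes": "1", "dx_date": "", "weight": ""},
]

for record in records:
    print(record["record_id"], truly_missing(record, fields))
```

Only the second record needs follow-up; the blank date in the first record is expected because the field was never displayed.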

Data Management Plans

A data management plan (DMP) is a formal, research-oriented document that describes how data will be managed before, during, and after a research project.

Increasingly, government funding agencies around the world require a DMP to accompany a researcher’s funding application. A DMP organizes the data collected within your research project and proactively addresses the following questions:

  • What type of data will you collect, create, link to, acquire and/or record?
  • What file formats will your data be collected in? Will these formats allow for data reuse, sharing and long-term access to the data?
  • What conventions and procedures will you use to structure, name and version-control your files to help you and others better understand how your data are organized?
  • What documentation will be needed for the data to be read and interpreted correctly in the future?
  • How will you make sure that documentation is created or captured consistently throughout your project?
  • If you are using a metadata standard and/or tools to document and describe your data, please provide a list.
  • How and where will your data be stored and backed up during your research project?
  • Where will you deposit your data for long-term preservation and access at the end of your research project?
  • What data will you be sharing and in what form (e.g., raw, processed, analyzed, final)?
  • How will responsibilities for managing data activities be handled if substantive changes happen in the personnel overseeing the project’s data, including a change of principal investigator?
  • If your research project includes sensitive data, how will you ensure that it is securely managed and accessible only to approved members of the project?

Portage Data Management Plans

Portage is a national, library-based research data management network that coalesces initiatives in research data management to build capacity and to coordinate activities. The aim of Portage is to coordinate and expand existing expertise, services, and infrastructure so that all academic researchers in Canada have access to the support they need for research data management.

Portage is centered around two major components:

  • Network of Expertise:
    • Research Data Management (RDM) requires specialized knowledge and expertise, which is often missing within institutions
  • Infrastructure Platforms:
    • Portage is working to connect the various infrastructure and service components needed for data management planning and a national preservation and discovery network

Portage includes a data management plan assistant, a bilingual tool for preparing data management plans (DMPs). The tool follows best practices in data stewardship and walks researchers step-by-step through key questions about their data management. You can create your own account on the assistant’s website. A data management plan template from Portage is also available: Data Management Plan Template

University of Toronto Resources for Best Practices in Data Management for Researchers

Refer to the following University of Toronto resources for non-REDCap specific best practices related to data management:

Best Practices:

  • Create documentation and metadata for your datasets.
  • Use file formats that ensure long-term access.
  • Use descriptive file names.
  • Handle sensitive data appropriately.