Dive into data quality with squba!

This suite of R packages lets you investigate multiple facets of data quality and customize analyses to your study-specific needs. Each module supports up to 8 different analyses in either the OMOP or PCORnet CDM, each taking a different view of the data while still addressing the same data quality probe. To learn more about the theory behind squba, see our manuscript (coming soon)!

Installing this package downloads all currently available modules, of which there are 11 (as of 11/2025). The “Modules” dropdown lists each module and links to its module-specific documentation. You can also find more granular descriptions of each module and check type on PEDSpace, our metadata repository.

Installation

Install the development version of the package:

devtools::install_github('ssdqa/squba')

If you would like to install individual modules, navigate to the appropriate repository for installation instructions.
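
If you re-run setup scripts often, the full-suite install above can be wrapped so that it only runs when the package is missing. The following is a minimal sketch, not part of squba: `install_if_missing()` is a hypothetical helper, it uses `remotes::install_github()` in place of `devtools::install_github()`, and it assumes the installed package is named `squba` to match the repository.

```r
# Hypothetical helper (not part of squba): install a GitHub package only
# when it is not already available in the current library.
install_if_missing <- function(pkg, repo) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))  # already installed, nothing to do
  }
  # 'remotes' is a lighter-weight alternative to 'devtools' for GitHub installs
  if (!requireNamespace("remotes", quietly = TRUE)) {
    install.packages("remotes")
  }
  remotes::install_github(repo)
  invisible(TRUE)
}

# Assumes the installed package name is "squba" (same as the repository).
# Uncomment to run; requires network access:
# install_if_missing("squba", "ssdqa/squba")
```

The helper returns `FALSE` invisibly when nothing was installed and `TRUE` after an install, which makes it easy to log whether a fresh install happened.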

Available Modules

  • Cohort Fitness
    • Patient Facts: Assesses the availability of patient clinical data per year of follow-up as a factor of visit type
    • Patient Event Sequencing: Evaluates the plausibility of the temporal relationship between two clinical events
    • Patient Record Consistency: Checks for consistency within a patient’s clinical record to ensure the information is confirmatory and complete
  • Variable Testing
  • Concept-Set Testing
  • Dataset Fitness
  • Cohort Identification
    • Cohort Attrition: Examines each step of a study’s attrition criteria to identify potential irregularities in cohort construction
    • Sensitivity to Selection Criteria: Compares the demographics, utilization patterns, and clinical fact makeup of a base cohort definition with those of alternate cohort definitions