Multi-Site Analysis for Independent Data Sources
The multi-site analyses included in this suite are intended to be executed against data that are all stored in the same place. However, there may be instances where the data for each site are stored in independent locations. This vignette outlines how the multi-site analyses can be executed in those instances.
Multi-Site Exploratory Analysis
First, execute the Single Site, Exploratory analyses, configured appropriately for your study, against each data source.
library(sensitivityselectioncriteria)

my_table <- ssc_process(base_cohort = my_cohort,
                        alt_cohorts = list('My Alternate Cohort' = my_alt_cohort),
                        omop_or_pcornet = 'omop',
                        multi_or_single_site = 'single',
                        anomaly_or_exploratory = 'exploratory',
                        ...)

This function will produce two tables. Select the
first table in the list from each result set, then
combine these results into a single table with the different sites
delineated in the site column. You will also need to edit
the output_function column to reflect that this table
should now be considered a Multi Site, Exploratory output.
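The combination step can be sketched as follows. This is a minimal illustration, not the package's own workflow: it assumes the first table from each site has already been collected into a local data frame, and every column other than site and output_function is a hypothetical placeholder. The vignette does not state the exact multi-site exploratory label, so the sketch uses a placeholder string that you would replace with the value your package version expects.

```r
library(dplyr)

# Hypothetical stand-ins for the first table returned by ssc_process at each
# site (in practice, something like my_table[[1]] from each data source).
site_results <- list(
  'Site A' = tibble(cohort_id = c('base_cohort', 'alt_cohort_1'),
                    n_pts = c(100L, 80L),
                    output_function = 'single_site_placeholder'),
  'Site B' = tibble(cohort_id = c('base_cohort', 'alt_cohort_1'),
                    n_pts = c(250L, 190L),
                    output_function = 'single_site_placeholder')
)

# Label each table with its site name so rows stay attributable after stacking.
labelled <- Map(function(tbl, site_name) mutate(tbl, site = site_name),
                site_results, names(site_results))

# Stack the per-site tables and overwrite output_function.
# ASSUMPTION: the string below is a placeholder, not a real package value.
my_combo_exploratory <- bind_rows(labelled) %>%
  mutate(output_function = 'REPLACE_WITH_MULTI_SITE_EXPLORATORY_LABEL')
```

Setting site explicitly from the list names avoids relying on whatever value each single-site run happened to write into that column.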
Multi-Site Anomaly Detection Analysis
Reproducing the anomaly detection analysis is slightly more complex
in this case because of the level of summarization in the output
of the typical function. Instead of running ssc_process
as you normally would, you will need to use internal functions to
produce the correct results.
The first step is to run the appropriate internal function,
depending on which CDM you are using, against each of the data sources.
It accepts most of the same parameters as the primary
ssc_process function.
## For an OMOP CDM:
my_omop_table <-
  sensitivityselectioncriteria:::compare_cohort_def_omop(
    base_cohort = my_cohort,
    alt_cohorts = list('My Alternate Cohort' = my_alt_cohort),
    multi_or_single_site = 'single',
    ...)

## For a PCORnet CDM:
my_pcornet_table <-
  sensitivityselectioncriteria:::compare_cohort_def_pcnt(
    base_cohort = my_cohort,
    alt_cohorts = list('My Alternate Cohort' = my_alt_cohort),
    multi_or_single_site = 'single',
    ...)

Once this function has been executed against each data source,
combine these results into a single table with the different sites
delineated in the site column.
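Before stacking the per-site outputs, it can help to confirm that every site returned the same set of columns. The sketch below shows one way to do that. The tables and their columns (other than site) are hypothetical stand-ins, since the vignette does not describe the shape of the internal function's output.

```r
library(dplyr)

# Hypothetical per-site outputs, already collected into local data frames.
# Columns other than `site` are placeholders for illustration only.
site_a_table <- tibble(site = 'Site A', person_id = 1:3, cohort_id = 'base_cohort')
site_b_table <- tibble(site = 'Site B', person_id = 4:5, cohort_id = 'base_cohort')
per_site <- list(site_a_table, site_b_table)

# Stop early if any site's table has a different set of column names.
reference_cols <- sort(names(per_site[[1]]))
same_cols <- vapply(per_site,
                    function(tbl) identical(sort(names(tbl)), reference_cols),
                    logical(1))
stopifnot(all(same_cols))

# Stack the validated tables into the combined input for the next step.
my_combo_results <- bind_rows(per_site)
```

Failing on mismatched columns is a deliberate choice here: bind_rows would otherwise fill missing columns with NA silently, which could distort downstream summaries.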
Then, pass this combined table through the anomaly detection function,
which computes standardized mean differences. This function also takes a
vector of demographic labels. If you provided custom demographic mappings,
extract the demographic labels from that file. Otherwise, use the
appropriate package-provided table based on your data model (either
ssc_omop_demographics or ssc_pcornet_demographics).
You will also need to edit the output_function column to
reflect that this table should now be considered a Multi Site, Anomaly
Detection output.
library(dplyr)  # provides %>%, pull(), and mutate()

my_final_results <-
  sensitivityselectioncriteria:::compare_cohort_smd(
    cohort_def_output = my_combo_results,
    demographic_vector = my_demographic_table %>% pull(demographic)) %>%
  dplyr::mutate(output_function = 'ssc_ms_anom_cs')
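For intuition about the standardized mean difference computed in this step, the sketch below applies one common textbook definition: the difference in group means divided by the square root of the average of the two group variances. It is not the implementation of compare_cohort_smd, whose internals are not described in this vignette. The function name smd and the toy values are hypothetical.

```r
# One common definition of the standardized mean difference (SMD) between
# two groups. ASSUMPTION: illustrative only; the package may compute it
# differently (for example, for proportions or against an all-site reference).
smd <- function(x, y) {
  pooled_sd <- sqrt((stats::var(x) + stats::var(y)) / 2)
  (mean(x) - mean(y)) / pooled_sd
}

# Toy values: one site versus a comparison group.
site_values <- c(10, 12, 14)       # mean 12, variance 4
comparison_values <- c(8, 10, 12)  # mean 10, variance 4

smd_value <- smd(site_values, comparison_values)
smd_value  # (12 - 10) / sqrt((4 + 4) / 2) = 1
```

Because the difference is scaled by a pooled standard deviation, the result is unitless, which makes values comparable across variables measured on different scales.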