Using the tabular output generated by csd_process, this function will build a graph to
visualize the results. Each function configuration will output a bespoke ggplot. Theming can
be adjusted by the user after the graph has been output using + theme(). Most graphs can
also be made interactive using make_interactive_squba().
Usage
csd_output(
process_output,
concept_set = NULL,
vocab_tbl = NULL,
num_variables = 10,
num_mappings = 10,
filter_variable = NULL,
filter_concept = NULL,
text_wrapping_char = 80,
output_value = "prop_concept",
large_n = FALSE,
large_n_sites = NULL
)
Arguments
- process_output
tabular input || required
The tabular output produced by csd_process.
- concept_set
tabular input || optional
The concept set originally used in the csd_process function. Recommended if no vocab_tbl is provided but the concept names are available in the concept set.
- vocab_tbl
tabular input || optional
A vocabulary table containing concept names for the provided codes (e.g., the OMOP concept table).
- num_variables
integer || defaults to 10
An integer indicating the top N variables to include for the exploratory analyses. The function will choose the N most commonly occurring variables to include in the plot.
- num_mappings
integer || defaults to 10
An integer indicating the top N concepts to include for the exploratory analyses. The function will choose the N most commonly occurring concepts per variable to include in the plot.
- filter_variable
string or vector || defaults to NULL
The specific variable(s) to display in the output. This parameter is required for the following check types:
  - Single Site, Anomaly Detection, Cross-Sectional
  - Multi Site, Anomaly Detection, Cross-Sectional
  - Single Site, Exploratory, Longitudinal
  - Single Site, Anomaly Detection, Longitudinal
  - Multi Site, Exploratory, Longitudinal
- filter_concept
numeric/string or vector || defaults to NULL
The specific code(s) to display in the output. This parameter is required for the following check types:
  - Single Site, Anomaly Detection, Longitudinal
  - Multi Site, Exploratory, Longitudinal
  - Multi Site, Anomaly Detection, Longitudinal
- text_wrapping_char
integer || defaults to 80
An integer indicating the length limit for text on an axis before wrapping is enforced. This is only used for the Multi Site, Anomaly Detection, Cross-Sectional check type.
- output_value
string || defaults to prop_concept
The name of the numerical column in process_output that should be used in the output. This parameter is required for the following check types:
  - Multi Site, Anomaly Detection, Cross-Sectional
  - Single Site, Exploratory, Longitudinal
  - Multi Site, Exploratory, Longitudinal
- large_n
boolean || defaults to FALSE
For Multi Site analyses, a boolean indicating whether the large N visualization, intended for a high volume of sites, should be used. This visualization will produce high level summaries across all sites, with an option to add specific site comparators via the large_n_sites parameter.
- large_n_sites
vector || defaults to NULL
When large_n = TRUE, a vector of site names that can add site-level information to the plot for comparison across the high level summary information.
Value
This function will produce a graph to visualize the results
from csd_process based on the parameters provided. The default
output is typically a static ggplot or gt object, but interactive
elements can be activated by passing the plot through make_interactive_squba.
For a more detailed description of output specific to each check type,
see the PEDSpace metadata repository.
Examples
#' Source setup file
source(system.file('setup.R', package = 'conceptsetdistribution'))
#' Create in-memory RSQLite database using data in extdata directory
conn <- mk_testdb_omop()
#' Establish connection to database and generate internal configurations
initialize_dq_session(session_name = 'csd_process_test',
                      working_directory = my_directory,
                      db_conn = conn,
                      is_json = FALSE,
                      file_subdirectory = my_file_folder,
                      cdm_schema = NA)
#> Connected to: :memory:@NA
#' Build mock study cohort
cohort <- cdm_tbl('person') %>% dplyr::distinct(person_id) %>%
  dplyr::mutate(start_date = as.Date(-5000),
                #RSQLite does not store date objects,
                #hence the numerics
                end_date = as.Date(15000),
                site = ifelse(person_id %in% c(1:6), 'synth1', 'synth2'))
#' Prepare input tables
csd_domain_tbl <- dplyr::tibble(domain = 'condition_occurrence',
                                concept_field = 'condition_concept_id',
                                date_field = 'condition_start_date',
                                vocabulary_field = NA)
csd_concept_tbl <- read_codeset('dx_hypertension') %>%
  dplyr::mutate(domain = 'condition_occurrence',
                variable = 'hypertension')
#' Execute `csd_process` function
#' This example will use the single site, exploratory, cross sectional
#' configuration
csd_process_example <- csd_process(cohort = cohort,
                                   multi_or_single_site = 'single',
                                   anomaly_or_exploratory = 'exploratory',
                                   time = FALSE,
                                   omop_or_pcornet = 'omop',
                                   domain_tbl = csd_domain_tbl,
                                   concept_set = csd_concept_tbl) %>%
  suppressMessages()
#> ┌ Output Function Details ──────────────────────────────────────┐
#> │ You can optionally use this dataframe in the accompanying     │
#> │ `csd_output` function. Here are the parameters you will need: │
#> │                                                               │
#> │ Always Required: process_output                               │
#> │ Required for Check: num_variables, num_mappings               │
#> │ Optional: concept_set, vocab_tbl                              │
#> │                                                               │
#> │ See ?csd_output for more details.                             │
#> └───────────────────────────────────────────────────────────────┘
csd_process_example
#> # A tibble: 1 × 7
#>   variable     ct_denom concept_id ct_concept prop_concept site  output_function
#>   <chr>           <int> <chr>           <int>        <dbl> <chr> <chr>
#> 1 hypertension        5 320128              5            1 comb… csd_ss_exp_cs
#' Execute `csd_output` function
csd_output_example <- csd_output(process_output = csd_process_example,
                                 concept_set = csd_concept_tbl,
                                 vocab_tbl = NULL) %>%
  suppressMessages()
csd_output_example[[1]]
#' Easily convert the graph into an interactive ggiraph or plotly object with
#' `make_interactive_squba()`
make_interactive_squba(csd_output_example[[1]])
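
The description notes that theming can be adjusted after the graph has been output using + theme(). A minimal sketch of that pattern follows. It assumes only ggplot2 and uses a small stand-in plot built from mock data (the mock_results data frame and its values are hypothetical, not output from csd_process); in practice the same layers would be added to csd_output_example[[1]].

```r
library(ggplot2)

# Hypothetical stand-in for a plot returned by csd_output();
# in a real session this would be csd_output_example[[1]]
mock_results <- data.frame(concept_id = c("320128", "316866"),
                           prop_concept = c(0.8, 0.2))
p <- ggplot(mock_results, aes(x = concept_id, y = prop_concept)) +
  geom_col()

# Theme adjustments are layered onto the returned ggplot object
p_themed <- p +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
        legend.position = "bottom")

p_themed
```

Because the theme layers are added to the static ggplot, they would typically be applied before the plot is passed to make_interactive_squba().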