When stratifying a cohort, it is generally desirable to calculate SMRs separately for each level of a stratification variable (such as a time-dependent exposure).
LTASR provides options to stratify a cohort by a fixed stratification variable defined in the person file, or by a time-dependent exposure variable whose information is stored in a separate history file.
For example, the code below stratifies the example person and history files included in LTASR by a cumulative exposure variable, `exposure_level`:
#Define exposure cutpoints
exp <- exp_strata(var = 'exposure_level',
                  cutpt = c(-Inf, 0, 10000, 20000, Inf),
                  lag = 10)

#Read in and format person file
person <- person_example %>%
  mutate(dob = as.Date(dob, format='%m/%d/%Y'),
         pybegin = as.Date(pybegin, format='%m/%d/%Y'),
         dlo = as.Date(dlo, format='%m/%d/%Y'))

#Read in and format history file
history <- history_example %>%
  mutate(begin_dt = as.Date(begin_dt, format='%m/%d/%Y'),
         end_dt = as.Date(end_dt, format='%m/%d/%Y'))

#Stratify cohort
py_table <- get_table_history(persondf = person,
                              rateobj = us_119ucod_recent,
                              historydf = history,
                              exps = list(exp))
This creates the following table (top 6 rows):
ageCat | CPCat | gender | race | exposure_levelCat | pdays | _o55 | _o52 |
---|---|---|---|---|---|---|---|
[15,20) | [1970,1975) | F | W | (-Inf,0] | 746 | 1 | 0 |
[25,30) | [1970,1975) | M | N | (-Inf,0] | 55 | 0 | 0 |
[25,30) | [1970,1975) | M | W | (-Inf,0] | 1472 | 0 | 0 |
[25,30) | [1975,1980) | M | W | (-Inf,0] | 323 | 0 | 0 |
[30,35) | [1970,1975) | M | N | (-Inf,0] | 1023 | 0 | 0 |
[30,35) | [1975,1980) | M | N | (-Inf,0] | 803 | 0 | 0 |
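
The labels in `exposure_levelCat` use interval notation built from the cutpoints. As a minimal sketch, assuming LTASR labels intervals the same way base R's `cut()` does (the table above is consistent with this, but it is an assumption), the mapping from cumulative exposure to category can be previewed on a few made-up values. The vector `toy_exposure` is hypothetical and not part of the package data.

```r
# Same cutpoints as passed to exp_strata() above
cutpt <- c(-Inf, 0, 10000, 20000, Inf)

# Hypothetical cumulative exposure values (not from the example data)
toy_exposure <- c(0, 5000, 10000, 15000, 25000)

# cut() builds right-closed intervals by default,
# so a value of exactly 10000 falls in the second interval
toy_cat <- cut(toy_exposure, breaks = cutpt)

# Print the exact label text of each level
print(levels(toy_cat))
print(data.frame(toy_exposure, toy_cat))
```

Printing the levels before filtering is useful because a filter on the category must match the label text character for character, including any padding around `Inf`.
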
`smr_minor` and `smr_major` calculate SMRs for the entire cohort that is read in.
To calculate SMRs separately for each level of `exposure_levelCat`, one option is to create a separate person-year table for each level:
#Subset py_table to the highest exposed group
#(the label must match the level text exactly)
py_table_high <- py_table %>%
  filter(exposure_levelCat == '(2e+04, Inf]')

smr_minor_table_high <- smr_minor(py_table_high, us_119ucod_recent)
smr_major_table_high <- smr_major(smr_minor_table_high, us_119ucod_recent)
minor | Description | observed | expected | smr | lower | upper |
---|---|---|---|---|---|---|
52 | Other diseases of the nervous system and sense org | 0 | 0.01 | 0 | 0 | 368.89 |
55 | Ischemic heart disease | 0 | 0.06 | 0 | 0 | 61.48 |

major | Description | observed | expected | smr | lower | upper |
---|---|---|---|---|---|---|
16 | Diseases of the heart (Major) | 0 | 0.06 | 0 | 0 | 61.48 |
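
Each row in these tables reports an SMR, the ratio of observed to expected deaths, together with a confidence interval. The sketch below shows one common way to compute such an interval, using exact Poisson limits derived from the chi-square distribution at the 95% level. That LTASR uses this method is an assumption: the upper limits shown above for zero observed deaths are consistent with it, but the text does not state the method. The function name `smr_ci` is hypothetical.

```r
# Hypothetical helper: SMR with exact Poisson confidence limits
smr_ci <- function(observed, expected, level = 0.95) {
  alpha <- 1 - level
  # The lower limit is set to 0 when no deaths are observed
  lower_count <- if (observed == 0) 0 else qchisq(alpha / 2, 2 * observed) / 2
  upper_count <- qchisq(1 - alpha / 2, 2 * (observed + 1)) / 2
  c(smr   = observed / expected,
    lower = lower_count / expected,
    upper = upper_count / expected)
}

# Zero observed deaths against 0.01 expected, as in the minor 52 row
res <- smr_ci(observed = 0, expected = 0.01)
print(round(res, 2))
```

With zero observed deaths the upper count limit reduces to `-log(0.025)`, about 3.69, so dividing by an expected count of 0.01 gives roughly 368.89. The expected counts in the tables are rounded, so results recomputed from them may differ slightly from what the package reports.
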
These results can be saved through repeated calls to `write_csv()`, but this becomes tedious for a stratification variable with many levels.
Alternatively, the code below loops through each level of a variable (named in `var`) and writes the results to an Excel file (using the `writexl` library) with a separate tab for each level:
#Define the name of the person year table (py_table)
#and the variable to calculate SMRs across
pyt <- py_table
var <- 'exposure_levelCat'

#Loop through levels of the above variable
lvls <- unique(pyt[var][[1]])
smr_minors <-
  map(lvls,
      ~ {
        pyt %>%
          filter(!!sym(var) == .x) %>%
          smr_minor(us_119ucod_recent)
      }) %>%
  setNames(lvls)

smr_majors <-
  map(smr_minors,
      ~ smr_major(., us_119ucod_recent)) %>%
  setNames(names(smr_minors))

#Adjust names of sheets (square brackets are not allowed in Excel sheet names)
names(smr_minors) <- str_replace_all(names(smr_minors), "\\[|\\]", "_")
names(smr_majors) <- str_replace_all(names(smr_majors), "\\[|\\]", "_")

#Save results
library(writexl)
write_xlsx(smr_minors, 'C:/SMR_Minors_by_exp.xlsx')
write_xlsx(smr_majors, 'C:/SMR_Majors_by_exp.xlsx')
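
If a single table is easier to work with than one tab per level, the named list can also be stacked into one data frame. The sketch below uses `dplyr::bind_rows()` with its `.id` argument on a toy list that stands in for `smr_minors`. The object `toy_results` and its numbers are made up for illustration and are not LTASR output.

```r
library(dplyr)

# Toy stand-in for smr_minors: a named list with one data frame per level
toy_results <- list(
  "(-Inf,0]"  = data.frame(minor = c(52, 55), observed = c(0, 1)),
  "(0,1e+04]" = data.frame(minor = c(52, 55), observed = c(1, 0))
)

# .id adds a column holding the list names, so each row keeps its level
combined <- bind_rows(toy_results, .id = "exposure_levelCat")
print(combined)
```

The combined table could then be saved with a single `write_csv()` call instead of one file or tab per level.
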