This example uses the National Study of Long-Term Care Providers (NSLTCP) Residential Care Community (RCC) Services User (SU) 2018 Public Use File (PUF) to replicate the estimates from a report called Residential Care Community Resident Characteristics: United States, 2018. “The survey used a sample of residential care community residents, obtained from a frame that was constructed from lists of licensed residential care communities acquired from the state licensing agencies in each of the 50 states and the District of Columbia.”
The RCC SU 2018 survey comes with the `surveytable` package, for use in examples, in an object called `rccsu2018`.
Begin by loading the `surveytable` package. When you do, it will print a message explaining how to specify the survey that you’d like to analyze.
Now, specify the survey that you’d like to analyze.
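A minimal sketch of these two steps, assuming the `surveytable` package is installed (`rccsu2018` ships with it, as noted above):

```r
library(surveytable)    # prints a message on how to specify a survey
set_survey(rccsu2018)   # select the survey to analyze
```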
Variables | Observations | Design |
---|---|---|
81 | 904 | Stratified Independent Sampling design svydesign(ids = ~1, strata = ~pufstrata2 + su_facid, fpc = ~pufpopfac2, weights = ~suwt, data = d1) |
Check the survey name, survey design variables, and the number of observations to verify that it all looks correct.
For this example, we do want to turn on certain NCHS-specific options, such as identifying low-precision estimates. If you do not care about identifying low-precision estimates, you can skip this command. To turn on the NCHS-specific options:
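The command itself is not shown here. A likely form, assuming the installed version of `surveytable` provides `set_opts()` with a `mode` argument:

```r
library(surveytable)
set_survey(rccsu2018)     # setup repeated so the snippet runs on its own

set_opts(mode = "NCHS")   # turn on the NCHS-specific options
```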
Alternatively, you can combine these two commands into a single command, like so:
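The combined form, matching the `set_survey()` call that appears later in this example:

```r
library(surveytable)

# Specify the survey and turn on the NCHS-specific options in one step
set_survey(rccsu2018, mode = "NCHS")
```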
Variables | Observations | Design |
---|---|---|
81 | 904 | Stratified Independent Sampling design svydesign(ids = ~1, strata = ~pufstrata2 + su_facid, fpc = ~pufpopfac2, weights = ~suwt, data = d1) |
Figure 1 of the report shows the percentage of residents by sex, race / ethnicity, and age group.
Sex.
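The command that produces the table below is not shown. Presumably it is a call to `tab()`; the variable name `sex` is an assumption here, and `var_list()` can be used to confirm it:

```r
library(surveytable)
set_survey(rccsu2018, mode = "NCHS")  # setup repeated so the snippet runs on its own

tab("sex")  # "sex" is an assumed variable name
```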
Level | n | Number (000) | SE (000) | LL (000) | UL (000) | Percent | SE | LL | UL |
---|---|---|---|---|---|---|---|---|---|
Male | 272 | 299 | 24 | 255 | 352 | 32.6 | 2.5 | 27.7 | 37.7 |
Female | 632 | 619 | 26 | 570 | 673 | 67.4 | 2.5 | 62.3 | 72.3 |
N = 904. Checked NCHS presentation standards. Nothing to report. |
Race / ethnicity.
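The two tables below look like the output of `var_list()` followed by `tab()`. A sketch, assuming the search string `"race"` was used to find the variable:

```r
library(surveytable)
set_survey(rccsu2018, mode = "NCHS")  # setup repeated so the snippet runs on its own

var_list("race")   # list variables whose names start with "race" (assumed search string)
tab("raceeth2")    # tabulate the variable found above
```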
Variable | Class | Long name |
---|---|---|
raceeth2 | factor | Resident’s race/ethnicity |
Level | n | Number (000) | SE (000) | LL (000) | UL (000) | Percent | SE | LL | UL | Flags |
---|---|---|---|---|---|---|---|---|---|---|
White | 816 | 821 | 23 | 776 | 868 | 89.3 | 1.8 | 85.4 | 92.6 | |
Black | 40 | 54 | 14 | 31 | 95 | 5.9 | 1.5 | 3.3 | 9.6 | |
Hispanic | 23 | 18 | 5 | 9 | 34 | 1.9 | 0.6 | 1 | 3.4 | |
Other | 25 | 26 | 8 | 12 | 55 | 2.8 | 0.9 | 1.3 | 5.3 | Cx |
N = 904. Checked NCHS presentation standards: Cx: suppress count (and rate). |
In the published figure, the Hispanic and Other categories have been merged into a single category called “Another race or ethnicity”. We can do that using the `var_collapse()` function.
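A sketch of the merge, using the same argument order as the `var_collapse()` calls later in this example (variable, new level, levels to merge):

```r
library(surveytable)
set_survey(rccsu2018, mode = "NCHS")  # setup repeated so the snippet runs on its own

# Merge "Hispanic" and "Other" into one level, then tabulate again
var_collapse("raceeth2"
  , "Another race or ethnicity"
  , c("Hispanic", "Other"))
tab("raceeth2")
```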
Level | n | Number (000) | SE (000) | LL (000) | UL (000) | Percent | SE | LL | UL |
---|---|---|---|---|---|---|---|---|---|
White | 816 | 821 | 23 | 776 | 868 | 89.3 | 1.8 | 85.4 | 92.6 |
Black | 40 | 54 | 14 | 31 | 95 | 5.9 | 1.5 | 3.3 | 9.6 |
Another race or ethnicity | 48 | 44 | 10 | 27 | 70 | 4.8 | 1.1 | 2.9 | 7.3 |
N = 904. Checked NCHS presentation standards. Nothing to report. |
Age group.
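The table below looks like the output of `var_list()`. A sketch, assuming the search string `"age"`:

```r
library(surveytable)
set_survey(rccsu2018, mode = "NCHS")  # setup repeated so the snippet runs on its own

var_list("age")  # list variables whose names start with "age" (assumed search string)
```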
Variable | Class | Long name |
---|---|---|
age2 | numeric | Resident’s age |
`age2` is a numeric variable. We need to create a categorical variable based on this numeric variable. This is done using the `var_cut()` function.
var_cut("Age", "age2"
, c(-Inf, 64, 74, 84, Inf)
, c("Under 65", "65-74", "75-84", "85 and over") )
tab("Age")
Level | n | Number (000) | SE (000) | LL (000) | UL (000) | Percent | SE | LL | UL |
---|---|---|---|---|---|---|---|---|---|
Under 65 | 75 | 69 | 11 | 49 | 96 | 7.5 | 1.2 | 5.2 | 10.3 |
65-74 | 98 | 111 | 17 | 82 | 151 | 12.1 | 1.8 | 8.8 | 16.1 |
75-84 | 221 | 235 | 22 | 195 | 282 | 25.5 | 2.2 | 21.3 | 30.2 |
85 and over | 510 | 504 | 26 | 456 | 557 | 54.9 | 2.6 | 49.7 | 60 |
N = 904. Checked NCHS presentation standards. Nothing to report. |
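To see how the break points map ages to groups, here is a base R sketch on made-up ages. It assumes that `var_cut()` follows the same right-closed interval convention as base R's `cut()`, which is consistent with the labels used above (64 falls in "Under 65", 65 in "65-74"):

```r
# Made-up ages, not survey data
age <- c(40, 64, 65, 74, 75, 84, 85, 101)

# Intervals are (-Inf, 64], (64, 74], (74, 84], (84, Inf]
age_group <- cut(age
  , breaks = c(-Inf, 64, 74, 84, Inf)
  , labels = c("Under 65", "65-74", "75-84", "85 and over"))

table(age_group)
```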
Figure 2 of the report shows the percentage of residents with Medicaid, overall and by age group.
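The command for the overall table is not shown. A sketch, in which the variable name `medicaid2` is an assumption (it does not appear in the text above; `var_list()` can be used to look it up):

```r
library(surveytable)
set_survey(rccsu2018, mode = "NCHS")  # setup repeated so the snippet runs on its own

tab("medicaid2")  # "medicaid2" is an assumed variable name
```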
Level | n | Number (000) | SE (000) | LL (000) | UL (000) | Percent | SE | LL | UL |
---|---|---|---|---|---|---|---|---|---|
FALSE | 674 | 674 | 24 | 628 | 723 | 73.3 | 2.1 | 68.9 | 77.4 |
TRUE | 143 | 160 | 18 | 128 | 201 | 17.5 | 1.9 | 13.9 | 21.5 |
<N/A> | 87 | 85 | 11 | 64 | 111 | 9.2 | 1.2 | 6.9 | 11.9 |
N = 904. Checked NCHS presentation standards. Nothing to report. |
As we can see, for some observations, the value of this variable is unknown (it’s missing or `NA`). The above command calculates percentages based on all observations, including the ones with missing (`NA`) values. However, in the published figure, the percentages are based on the knowns only. To exclude the `NA`s from the calculation, use the `drop_na` argument:
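A sketch of the call, again assuming that the Medicaid variable is named `medicaid2`; the `drop_na` argument is used the same way in the `tab()` calls later in this example:

```r
library(surveytable)
set_survey(rccsu2018, mode = "NCHS")  # setup repeated so the snippet runs on its own

# Base percentages on known values only ("medicaid2" is an assumed name)
tab("medicaid2", drop_na = TRUE)
```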
Level | n | Number (000) | SE (000) | LL (000) | UL (000) | Percent | SE | LL | UL |
---|---|---|---|---|---|---|---|---|---|
FALSE | 674 | 674 | 24 | 628 | 723 | 80.8 | 2.1 | 76.4 | 84.7 |
TRUE | 143 | 160 | 18 | 128 | 201 | 19.2 | 2.1 | 15.3 | 23.6 |
N = 817. Checked NCHS presentation standards. Nothing to report. |
Note that the table title alerts you to the fact that you are using known values only.
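By age group (the four tables below correspond, in order, to the Under 65, 65-74, 75-84, and 85 and over groups):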
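One way to produce tables by subgroup is `tab_subset()`. A sketch, assuming the Medicaid variable is named `medicaid2`; the `Age` variable is re-created here so that the snippet runs on its own:

```r
library(surveytable)
set_survey(rccsu2018, mode = "NCHS")  # setup repeated so the snippet runs on its own

# Re-create the categorical age variable defined earlier
var_cut("Age", "age2"
  , c(-Inf, 64, 74, 84, Inf)
  , c("Under 65", "65-74", "75-84", "85 and over") )

# One table per level of Age, based on known values only
# ("medicaid2" is an assumed variable name)
tab_subset("medicaid2", "Age", drop_na = TRUE)
```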
Level | n | Number (000) | SE (000) | LL (000) | UL (000) | Percent | SE | LL | UL | Flags |
---|---|---|---|---|---|---|---|---|---|---|
FALSE | 31 | 30 | 8 | 17 | 56 | 49.8 | 9.5 | 30.3 | 69.4 | Px |
TRUE | 35 | 31 | 8 | 18 | 52 | 50.2 | 9.5 | 30.6 | 69.7 | Px |
N = 66. Checked NCHS presentation standards: Px: suppress percent. |
Level | n | Number (000) | SE (000) | LL (000) | UL (000) | Percent | SE | LL | UL | Flags |
---|---|---|---|---|---|---|---|---|---|---|
FALSE | 53 | 55 | 11 | 37 | 83 | 62 | 7.6 | 45.5 | 76.7 | Px |
TRUE | 30 | 34 | 8 | 20 | 58 | 38 | 7.6 | 23.3 | 54.5 | Px |
N = 83. Checked NCHS presentation standards: Px: suppress percent. |
Level | n | Number (000) | SE (000) | LL (000) | UL (000) | Percent | SE | LL | UL |
---|---|---|---|---|---|---|---|---|---|
FALSE | 163 | 167 | 18 | 134 | 208 | 79.1 | 4.5 | 68.6 | 87.3 |
TRUE | 33 | 44 | 11 | 26 | 75 | 20.9 | 4.5 | 12.7 | 31.4 |
N = 196. Checked NCHS presentation standards. Nothing to report. |
Level | n | Number (000) | SE (000) | LL (000) | UL (000) | Percent | SE | LL | UL |
---|---|---|---|---|---|---|---|---|---|
FALSE | 427 | 421 | 23 | 378 | 469 | 89.1 | 2.2 | 83.8 | 93.1 |
TRUE | 45 | 52 | 11 | 33 | 81 | 10.9 | 2.2 | 6.9 | 16.2 |
N = 472. Checked NCHS presentation standards. Nothing to report. |
Note that according to the NCHS presentation criteria, some of the percentages should be suppressed.
(Figure 3 is slightly more involved, so we’ll do it next.)
Here’s a table for high blood pressure.
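The command is not shown here; given the variable name `hbp` used in the call further down, it is presumably:

```r
library(surveytable)
set_survey(rccsu2018, mode = "NCHS")  # setup repeated so the snippet runs on its own

tab("hbp")  # high blood pressure, including unknown (NA) values
```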
Level | n | Number (000) | SE (000) | LL (000) | UL (000) | Percent | SE | LL | UL |
---|---|---|---|---|---|---|---|---|---|
FALSE | 397 | 404 | 25 | 357 | 457 | 44 | 2.5 | 38.9 | 49.1 |
TRUE | 481 | 498 | 26 | 449 | 552 | 54.2 | 2.6 | 49 | 59.3 |
<N/A> | 26 | 17 | 4 | 10 | 28 | 1.8 | 0.4 | 1.1 | 2.9 |
N = 904. Checked NCHS presentation standards. Nothing to report. |
Once again, unknown values (`NA`) are present, while the figure is based on knowns only. Therefore, we will again use the `drop_na` argument. The tables below appear in the same order as the variables in the call:
tab("hbp", "alz", "depress", "arth", "diabetes", "heartdise", "osteo"
, "copd", "stroke", "cancer"
, drop_na = TRUE)
Level | n | Number (000) | SE (000) | LL (000) | UL (000) | Percent | SE | LL | UL |
---|---|---|---|---|---|---|---|---|---|
FALSE | 397 | 404 | 25 | 357 | 457 | 44.8 | 2.6 | 39.7 | 50 |
TRUE | 481 | 498 | 26 | 449 | 552 | 55.2 | 2.6 | 50 | 60.3 |
N = 878. Checked NCHS presentation standards. Nothing to report. |
Level | n | Number (000) | SE (000) | LL (000) | UL (000) | Percent | SE | LL | UL |
---|---|---|---|---|---|---|---|---|---|
FALSE | 538 | 598 | 26 | 549 | 651 | 66.3 | 2.1 | 62 | 70.5 |
TRUE | 340 | 304 | 19 | 268 | 344 | 33.7 | 2.1 | 29.5 | 38 |
N = 878. Checked NCHS presentation standards. Nothing to report. |
Level | n | Number (000) | SE (000) | LL (000) | UL (000) | Percent | SE | LL | UL |
---|---|---|---|---|---|---|---|---|---|
FALSE | 629 | 654 | 24 | 609 | 703 | 72.5 | 2.1 | 68.1 | 76.6 |
TRUE | 249 | 248 | 20 | 211 | 292 | 27.5 | 2.1 | 23.4 | 31.9 |
N = 878. Checked NCHS presentation standards. Nothing to report. |
Level | n | Number (000) | SE (000) | LL (000) | UL (000) | Percent | SE | LL | UL |
---|---|---|---|---|---|---|---|---|---|
FALSE | 683 | 717 | 26 | 668 | 770 | 79.5 | 2 | 75.3 | 83.3 |
TRUE | 195 | 185 | 18 | 152 | 224 | 20.5 | 2 | 16.7 | 24.7 |
N = 878. Checked NCHS presentation standards. Nothing to report. |
Level | n | Number (000) | SE (000) | LL (000) | UL (000) | Percent | SE | LL | UL |
---|---|---|---|---|---|---|---|---|---|
FALSE | 719 | 718 | 23 | 675 | 765 | 79.6 | 2.1 | 75.3 | 83.6 |
TRUE | 159 | 184 | 20 | 148 | 227 | 20.4 | 2.1 | 16.4 | 24.7 |
N = 878. Checked NCHS presentation standards. Nothing to report. |
Level | n | Number (000) | SE (000) | LL (000) | UL (000) | Percent | SE | LL | UL |
---|---|---|---|---|---|---|---|---|---|
FALSE | 739 | 746 | 25 | 697 | 798 | 82.7 | 1.8 | 78.7 | 86.2 |
TRUE | 139 | 156 | 17 | 126 | 193 | 17.3 | 1.8 | 13.8 | 21.3 |
N = 878. Checked NCHS presentation standards. Nothing to report. |
Level | n | Number (000) | SE (000) | LL (000) | UL (000) | Percent | SE | LL | UL |
---|---|---|---|---|---|---|---|---|---|
FALSE | 766 | 794 | 24 | 749 | 842 | 88 | 1.4 | 84.9 | 90.7 |
TRUE | 112 | 108 | 13 | 85 | 137 | 12 | 1.4 | 9.3 | 15.1 |
N = 878. Checked NCHS presentation standards. Nothing to report. |
Level | n | Number (000) | SE (000) | LL (000) | UL (000) | Percent | SE | LL | UL |
---|---|---|---|---|---|---|---|---|---|
FALSE | 779 | 806 | 24 | 759 | 856 | 89.4 | 1.6 | 85.9 | 92.3 |
TRUE | 99 | 96 | 14 | 71 | 129 | 10.6 | 1.6 | 7.7 | 14.1 |
N = 878. Checked NCHS presentation standards. Nothing to report. |
Level | n | Number (000) | SE (000) | LL (000) | UL (000) | Percent | SE | LL | UL |
---|---|---|---|---|---|---|---|---|---|
FALSE | 789 | 807 | 23 | 764 | 853 | 89.5 | 1.5 | 86.1 | 92.4 |
TRUE | 89 | 94 | 14 | 70 | 128 | 10.5 | 1.5 | 7.6 | 13.9 |
N = 878. Checked NCHS presentation standards. Nothing to report. |
Level | n | Number (000) | SE (000) | LL (000) | UL (000) | Percent | SE | LL | UL |
---|---|---|---|---|---|---|---|---|---|
FALSE | 806 | 824 | 23 | 780 | 871 | 91.4 | 1.6 | 87.7 | 94.2 |
TRUE | 72 | 78 | 14 | 53 | 114 | 8.6 | 1.6 | 5.8 | 12.3 |
N = 878. Checked NCHS presentation standards. Nothing to report. |
`surveytable` provides a number of functions to create or modify survey variables. We saw a couple of these above: `var_collapse()` and `var_cut()`.
Occasionally, you might need to do advanced variable editing. Here’s how:
1. Every survey object has an element called `variables`. This is a data frame where the survey’s variables are located.
2. Create a new variable in the `variables` data frame (which is part of the survey object).
3. Call `set_survey()` again. Any time you modify the `variables` data frame, call `set_survey()`.

We go through these steps to count how many chronic conditions were present.
rccsu2018$variables$num_cc = 0
for (vr in c("hbp", "alz", "depress", "arth", "diabetes", "heartdise", "osteo"
, "copd", "stroke", "cancer")) {
idx = which(rccsu2018$variables[,vr])
rccsu2018$variables$num_cc[idx] = rccsu2018$variables$num_cc[idx] + 1
}
set_survey(rccsu2018, mode = "NCHS")
## * Mode: NCHS.
Variables | Observations | Design |
---|---|---|
82 | 904 | Stratified Independent Sampling design svydesign(ids = ~1, strata = ~pufstrata2 + su_facid, fpc = ~pufpopfac2, weights = ~suwt, data = d1) |
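Note that the counting loop skips unknown (`NA`) values, because `which()` returns only the positions that are `TRUE`. As a result, every resident gets a count, and an unknown condition is treated as not present. The base R sketch below, on a made-up data frame (not the survey data), illustrates this and shows an equivalent version without an explicit loop using `rowSums()`:

```r
# Toy data: three logical condition indicators, with some unknowns
d <- data.frame(
  hbp  = c(TRUE, FALSE, NA, TRUE)
  , alz  = c(TRUE, FALSE, TRUE, NA)
  , copd = c(FALSE, FALSE, NA, TRUE)
)
conds <- c("hbp", "alz", "copd")

# Loop version, same logic as above
num_cc_loop <- rep(0, nrow(d))
for (vr in conds) {
  idx <- which(d[, vr])                 # positions that are TRUE; NA is skipped
  num_cc_loop[idx] <- num_cc_loop[idx] + 1
}

# Equivalent version: count TRUE values per row, ignoring NA
num_cc_vec <- rowSums(d[, conds], na.rm = TRUE)

stopifnot(all(num_cc_loop == num_cc_vec))
```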
`num_cc` is a numeric variable with the number of chronic conditions. The published figure uses a categorical variable which is based on this numeric variable. Use `var_cut()`, which converts numeric variables to categorical (`factor`) variables.
var_cut("Number of chronic conditions", "num_cc"
, c(-Inf, 0, 1, 3, 10, Inf)
, c("0", "1", "2-3", "4-10", "??"))
tab("Number of chronic conditions")
Level | n | Number (000) | SE (000) | LL (000) | UL (000) | Percent | SE | LL | UL |
---|---|---|---|---|---|---|---|---|---|
0 | 121 | 140 | 19 | 106 | 184 | 15.2 | 2 | 11.5 | 19.6 |
1 | 189 | 180 | 17 | 148 | 218 | 19.5 | 1.9 | 16 | 23.5 |
2-3 | 446 | 444 | 23 | 400 | 492 | 48.3 | 2.4 | 43.5 | 53.1 |
4-10 | 148 | 156 | 17 | 125 | 194 | 16.9 | 1.8 | 13.5 | 20.8 |
N = 904. Checked NCHS presentation standards. Nothing to report. |
Here’s a table for `bathhlp` (help with bathing):
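The command is not shown here; given the variable name, it is presumably:

```r
library(surveytable)
set_survey(rccsu2018, mode = "NCHS")  # setup repeated so the snippet runs on its own

tab("bathhlp")  # help with bathing, all original levels
```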
Level | n | Number (000) | SE (000) | LL (000) | UL (000) | Percent | SE | LL | UL | Flags |
---|---|---|---|---|---|---|---|---|---|---|
MISSING | 22 | 10 | 2 | 6 | 17 | 1.1 | 0.3 | 0.6 | 2.1 | |
NEED HELP OR SUPERVISION FROM ANOTHER PERSON | 551 | 581 | 25 | 534 | 633 | 63.3 | 2.3 | 58.7 | 67.7 | |
USE OF AN ASSISTIVE DEVICE | 11 | 7 | 2 | 3 | 15 | 0.7 | 0.3 | 0.3 | 1.5 | Cx |
BOTH | 127 | 113 | 15 | 87 | 148 | 12.4 | 1.6 | 9.4 | 15.9 | |
NEED NO ASSISTANCE | 193 | 207 | 18 | 173 | 247 | 22.5 | 2 | 18.7 | 26.6 | |
N = 904. Checked NCHS presentation standards: Cx: suppress count (and rate). |
This variable has multiple levels. Among them, `"NEED NO ASSISTANCE"` means that the resident does not need help, and `"MISSING"` means that the value is unknown. The remaining three levels all indicate that the resident needs help.

We want to show residents who need help as a percentage of knowns only (that is, excluding the unknowns). To do this, convert the variable to having 2 levels (needs help / does not need help) plus `NA` (for unknown); then use the `drop_na` argument to base percentages on knowns only.
for (vr in c("bathhlp", "walkhlp", "dreshlp", "transhlp", "toilhlp", "eathlp")) {
var_collapse(vr
, "Needs assistance"
, c("NEED HELP OR SUPERVISION FROM ANOTHER PERSON"
, "USE OF AN ASSISTIVE DEVICE"
, "BOTH"))
var_collapse(vr, NA, "MISSING")
}
tab("bathhlp", "walkhlp", "dreshlp", "transhlp", "toilhlp", "eathlp", drop_na = TRUE)
Level | n | Number (000) | SE (000) | LL (000) | UL (000) | Percent | SE | LL | UL |
---|---|---|---|---|---|---|---|---|---|
Needs assistance | 689 | 702 | 25 | 654 | 752 | 77.2 | 2 | 73.1 | 81.1 |
NEED NO ASSISTANCE | 193 | 207 | 18 | 173 | 247 | 22.8 | 2 | 18.9 | 26.9 |
N = 882. Checked NCHS presentation standards. Nothing to report. |
Level | n | Number (000) | SE (000) | LL (000) | UL (000) | Percent | SE | LL | UL |
---|---|---|---|---|---|---|---|---|---|
Needs assistance | 622 | 625 | 24 | 578 | 675 | 68.9 | 2.3 | 64.2 | 73.4 |
NEED NO ASSISTANCE | 253 | 281 | 22 | 241 | 329 | 31.1 | 2.3 | 26.6 | 35.8 |
N = 875. Checked NCHS presentation standards. Nothing to report. |
Level | n | Number (000) | SE (000) | LL (000) | UL (000) | Percent | SE | LL | UL |
---|---|---|---|---|---|---|---|---|---|
Needs assistance | 527 | 561 | 25 | 513 | 614 | 61.7 | 2.3 | 57.1 | 66.2 |
NEED NO ASSISTANCE | 355 | 348 | 22 | 308 | 393 | 38.3 | 2.3 | 33.8 | 42.9 |
N = 882. Checked NCHS presentation standards. Nothing to report. |
Level | n | Number (000) | SE (000) | LL (000) | UL (000) | Percent | SE | LL | UL |
---|---|---|---|---|---|---|---|---|---|
Needs assistance | 463 | 464 | 24 | 418 | 515 | 51 | 2.4 | 46.1 | 55.8 |
NEED NO ASSISTANCE | 420 | 446 | 24 | 400 | 496 | 49 | 2.4 | 44.2 | 53.9 |
N = 883. Checked NCHS presentation standards. Nothing to report. |
Level | n | Number (000) | SE (000) | LL (000) | UL (000) | Percent | SE | LL | UL |
---|---|---|---|---|---|---|---|---|---|
Needs assistance | 437 | 443 | 24 | 398 | 493 | 48.7 | 2.4 | 43.8 | 53.5 |
NEED NO ASSISTANCE | 447 | 467 | 25 | 421 | 518 | 51.3 | 2.4 | 46.5 | 56.2 |
N = 884. Checked NCHS presentation standards. Nothing to report. |
Level | n | Number (000) | SE (000) | LL (000) | UL (000) | Percent | SE | LL | UL |
---|---|---|---|---|---|---|---|---|---|
Needs assistance | 257 | 240 | 21 | 200 | 286 | 26.3 | 2.3 | 21.9 | 31.1 |
NEED NO ASSISTANCE | 628 | 671 | 26 | 622 | 724 | 73.7 | 2.3 | 68.9 | 78.1 |
N = 885. Checked NCHS presentation standards. Nothing to report. |
Now, go through the “advanced variable editing” steps (very similar to Figure 4) to count the number of activities of daily living (ADLs) with which each resident needs help.
rccsu2018$variables$num_adl = 0
for (vr in c("bathhlp", "walkhlp", "dreshlp", "transhlp", "toilhlp", "eathlp")) {
idx = which(rccsu2018$variables[,vr] %in%
c("NEED HELP OR SUPERVISION FROM ANOTHER PERSON"
, "USE OF AN ASSISTIVE DEVICE"
, "BOTH"))
rccsu2018$variables$num_adl[idx] = rccsu2018$variables$num_adl[idx] + 1
}
set_survey(rccsu2018, mode = "NCHS")
## * Mode: NCHS.
Variables | Observations | Design |
---|---|---|
83 | 904 | Stratified Independent Sampling design svydesign(ids = ~1, strata = ~pufstrata2 + su_facid, fpc = ~pufpopfac2, weights = ~suwt, data = d1) |
For generating the figure, create a categorical variable based on `num_adl`, which is numeric.
var_cut("Number of ADLs", "num_adl"
, c(-Inf, 0, 2, 6, Inf)
, c("0", "1-2", "3-6", "??"))
tab("Number of ADLs")
Level | n | Number (000) | SE (000) | LL (000) | UL (000) | Percent | SE | LL | UL |
---|---|---|---|---|---|---|---|---|---|
0 | 131 | 114 | 12 | 92 | 142 | 12.4 | 1.3 | 9.9 | 15.4 |
1-2 | 218 | 249 | 22 | 209 | 297 | 27.1 | 2.2 | 22.8 | 31.8 |
3-6 | 555 | 555 | 25 | 508 | 606 | 60.4 | 2.4 | 55.6 | 65.1 |
N = 904. Checked NCHS presentation standards. Nothing to report. |