‘exactamente’ is an R package that offers a collection of tools to help researchers and data analysts explore bootstrap methods on data with small sample sizes. Bootstrap methods are widely used to estimate the sampling distribution of an estimator by resampling with replacement from the original sample.
This package is focused particularly on the exact bootstrap as described in Kisielinska (2013). This method is advantageous for bootstrapping small data sets where standard methods might be inadequate.
For a given sample size N, the exact bootstrap generates all N^N resamples, including permutations such as [5, 5, 3], [5, 3, 5], and [3, 5, 5] as distinct resamples.
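To make the N^N count concrete, here is a minimal base R sketch that enumerates every ordered resample of a three-element sample. It is an illustration only, not the package's internal code; the variable names (`x`, `resamples`, `is_112`) are made up for this example.

```r
# Illustrative sketch: enumerate all ordered resamples of a small sample.
x <- c(1, 2, 7)
n <- length(x)

# expand.grid() crosses n copies of the sample, giving one row per
# ordered resample, so the result has n^n rows.
resamples <- unname(as.matrix(expand.grid(rep(list(x), n))))

nrow(resamples)  # n^n = 27 for n = 3

# Orderings of the same values are kept as separate rows:
# (1, 1, 2), (1, 2, 1), and (2, 1, 1) each appear once.
is_112 <- apply(resamples, 1, function(r) identical(sort(r), c(1, 1, 2)))
sum(is_112)      # 3
```

Because the number of rows grows as N^N, this kind of full enumeration is only practical for very small samples.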
Furthermore, ‘exactamente’ provides a standard bootstrap function where the user can specify the desired number of resamples, allowing for direct comparison between the exact method and conventional bootstrap techniques.
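For contrast, a conventional bootstrap draws a fixed number of random resamples instead of enumerating all of them. The sketch below uses only base R and is not the package's implementation; `n_resamples` and `boot_means` are hypothetical names chosen for the illustration.

```r
# Illustrative sketch of a conventional (Monte Carlo) bootstrap of the mean.
set.seed(183)
x <- c(1, 2, 7)
n_resamples <- 10000  # assumed number of resamples for this example

# Each replicate draws length(x) values with replacement and stores the mean.
boot_means <- replicate(
  n_resamples,
  mean(sample(x, size = length(x), replace = TRUE))
)

# The average of the bootstrap means should sit close to the sample mean.
mean(boot_means)
mean(x)
```

Unlike the exact method, the result here depends on the random seed and on the number of resamples drawn.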
Here is the step-by-step process that the ‘exactamente’ package functions use to perform their bootstrap resampling and compute the summary statistics and density estimates:
Compute the Resample Statistics: For each resample, the function computes a statistic of interest, such as the mean or the median. The function uses the user-specified ‘anon’ function for this computation. The ‘anon’ function defaults to the mean if not specified by the user.
Process Bootstrap Statistics: This is a common process for both methods. After generating the resample statistics, each function calls process_bootstrap_stats() to derive the summary statistics and density estimates.
Density Estimation: process_bootstrap_stats() first calculates the kernel density estimate of the bootstrap statistics using the stats::density() function. If the user provides ‘density_args’, they are passed on to the density function. The density estimate gives a smoothed representation of the distribution of the bootstrap statistics.
Summary Statistics: After generating the density estimate, process_bootstrap_stats() computes summary statistics for the bootstrap statistics. As reported by the summary() output shown later, these are the mode, median, mean, standard deviation (sd), and lower and upper confidence limits (lCI and uCI), together with the number of resamples (nres).
By providing these detailed outputs, the ‘exactamente’ package enables users to thoroughly investigate the characteristics of the bootstrap distribution and the behavior of the bootstrap estimator under different resampling methods.
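The steps above can be sketched in base R as follows. This is a rough approximation under stated assumptions, not the package's source: `stat_fun` stands in for the ‘anon’ function, the `density_args` value is a made-up example, the mode is assumed to be the peak of the density estimate, and the limits are assumed to be the 2.5% and 97.5% quantiles.

```r
# Illustrative sketch of the resample -> statistic -> density -> summary flow.
x <- c(1, 2, 7)
stat_fun <- mean                  # stand-in for the 'anon' function
density_args <- list(adjust = 1)  # hypothetical extra arguments for density()

# All ordered resamples (exact bootstrap), one per row.
resamples <- as.matrix(expand.grid(rep(list(x), length(x))))

# Statistic of interest for each resample.
boot_stats <- apply(resamples, 1, stat_fun)

# Kernel density estimate, with any user-supplied arguments passed through.
dens <- do.call(stats::density, c(list(x = boot_stats), density_args))

# Summary statistics of the bootstrap distribution.
summary_stats <- data.frame(
  nres   = length(boot_stats),
  mode   = dens$x[which.max(dens$y)],  # assumed: peak of the density estimate
  median = median(boot_stats),
  mean   = mean(boot_stats),
  sd     = sd(boot_stats),
  lCI    = unname(quantile(boot_stats, 0.025)),  # assumed 2.5% quantile
  uCI    = unname(quantile(boot_stats, 0.975))   # assumed 97.5% quantile
)
summary_stats
```

For the [1, 2, 7] sample, the median, mean, sd, lCI, and uCI from this sketch agree with the exact_bootstrap row of the summary output shown later in this document. The mode depends on the density settings, so it may differ.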
You can install the released version of ‘exactamente’ from CRAN:
install.packages("exactamente")
You can install the development version of ‘exactamente’ from GitHub like so:
# install.packages("devtools")
devtools::install_github("mightymetrika/exactamente")
After installation, you can load the ‘exactamente’ package using the library function.
library(exactamente)
To utilize the fundamental tools provided by ‘exactamente’, begin by creating a numeric vector of data. You can then use the _bootstrap functions to obtain objects holding summary statistics and density estimates of the respective bootstrap distributions. In the following example, we use the [1, 2, 7] sample vector and take the mean as our test statistic.
data <- c(1, 2, 7)

# Run exact bootstrap
e_res <- exact_bootstrap(data)

# Run regular bootstrap
set.seed(183)
r_res <- reg_bootstrap(data)
Both _bootstrap functions come with plot() and summary() methods for further investigation of the bootstrap results.
res <- list(e_res, r_res)
lapply(res, plot)
#> [[1]]
#>
#> [[2]]
lapply(res, summary)
#> [[1]]
#> Method nres mode median mean sd lCI uCI
#> 1 exact_bootstrap 27 3.303687 3.333333 3.333333 1.54422 1.216667 5.916667
#>
#> [[2]]
#> Method nres mode median mean sd lCI uCI
#> 1 reg_bootstrap 10000 3.335788 3.333333 3.3368 1.518024 1 7
‘exactamente’ also includes the e_vs_r() function, which enables direct comparison between the two methods using the same data set.
set.seed(183)
comp_res <- e_vs_r(data)
comp_res$comp_plot
comp_res$summary_table
#>            Method  nres     mode   median     mean       sd      lCI      uCI
#> 1 exact_bootstrap    27 3.303687 3.333333 3.333333 1.544220 1.216667 5.916667
#> 2   reg_bootstrap 10000 3.335788 3.333333 3.336800 1.518024 1.000000 7.000000
One of the exciting features of the ‘exactamente’ package is the inclusion of an interactive Shiny app. The app allows you to visually explore and compare the bootstrap methods. It is designed with a user-friendly interface and offers real-time results visualization.
You can access this app with the following command:
exactamente_app()
Kisielinska, J. (2013). The exact bootstrap method shown on the example of the mean and variance estimation. Computational Statistics, 28, 1061–1077.