exactamente

‘exactamente’ is an R package that offers a collection of tools to help researchers and data analysts explore bootstrap methods on small samples. Bootstrap methods are widely used to estimate the sampling distribution of an estimator by resampling with replacement from the original sample.

This package is focused particularly on the exact bootstrap as described in Kisielinska (2013). This method is advantageous for bootstrapping small data sets where standard methods might be inadequate.

For a sample of size N, the exact bootstrap generates all N^N resamples. Different orderings of the same values, such as [5, 5, 3], [5, 3, 5], and [3, 5, 5], are counted as distinct resamples.
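As a rough illustration of this enumeration, the base R sketch below builds every ordered resample for the [1, 2, 7] sample used later in this document. It is not the package's implementation; the variable names are made up for the example, and only base R functions are used.

```r
# Minimal sketch of exhaustive (exact) resampling; not the package's code.
data <- c(1, 2, 7)
n <- length(data)

# Every ordered choice of n indices drawn with replacement: n^n rows.
idx <- as.matrix(expand.grid(rep(list(seq_len(n)), n)))

# Map the indices back to data values, one resample per row.
resamples <- matrix(data[idx], ncol = n)

# Statistic of interest for each resample (here, the mean).
resample_means <- rowMeans(resamples)

nrow(resamples)       # number of resamples, n^n
mean(resample_means)  # average of the resample means

# The count n^n grows quickly, which is why the method targets small samples.
n_values <- 1:8
resample_counts <- n_values^n_values
resample_counts
```

Because every resample is enumerated, the result has no Monte Carlo error, but the cost grows as N^N.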

Furthermore, ‘exactamente’ provides a standard bootstrap function where the user can specify the desired number of resamples, allowing for direct comparison between the exact method and conventional bootstrap techniques.
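For comparison, a conventional bootstrap draws a fixed number of random resamples instead of enumerating all of them. The sketch below shows the idea in base R; `n_resamples` is a hypothetical variable name for the example, and the code is not taken from the package.

```r
# Minimal sketch of a conventional (Monte Carlo) bootstrap; not the package's code.
set.seed(183)
data <- c(1, 2, 7)
n_resamples <- 10000  # hypothetical name for the user-chosen number of resamples

# Draw each resample with replacement and record its mean.
boot_means <- replicate(n_resamples, mean(sample(data, replace = TRUE)))

length(boot_means)
mean(boot_means)
```

Unlike exhaustive enumeration, the result depends on the random seed and on the number of resamples drawn.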

Here is the step-by-step process that the ‘exactamente’ package functions use to perform bootstrap resampling and compute the summary statistics and density estimates:

  1. Bootstrap Resampling: Each bootstrap method starts by generating a collection of bootstrap samples, also known as resamples.

  2. Compute the Resample Statistics: For each resample, the function computes a statistic of interest, such as the mean or the median. The function uses the user-specified ‘anon’ function for this computation. The ‘anon’ function defaults to the mean if not specified by the user.

  3. Process Bootstrap Statistics: This step is common to both methods. After generating the resample statistics, each function calls process_bootstrap_stats() to derive the summary statistics and density estimates.

  4. Density Estimation: process_bootstrap_stats() first calculates the kernel density estimate of the bootstrap statistics using the stats::density() function. If user-specified ‘density_args’ are provided, they are passed to the density function. The density estimate provides a smoothed representation of the distribution of the bootstrap statistics.

  5. Summary Statistics: After generating the density estimate, process_bootstrap_stats() computes summary statistics for the bootstrap statistics. These appear in the summary output as mode, median, mean, sd, lCI, and uCI.

  6. Return Result: Lastly, process_bootstrap_stats() returns a list containing the density estimate and the summary statistics. The bootstrap function then assigns a specific class to the result (‘extboot’ or ‘regboot’) and returns the result to the user.
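The density and summary steps above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: `summarise_boot_stats()`, its return element names, and the class name are hypothetical, the mode is assumed to be the location of the density peak, and the interval is assumed to be the 2.5% and 97.5% quantiles, which is consistent with the lCI and uCI values in the example output for the [1, 2, 7] sample.

```r
# Hypothetical helper for illustration; not the package's process_bootstrap_stats().
summarise_boot_stats <- function(boot_stats, density_args = list()) {
  # Kernel density estimate; extra arguments are forwarded to stats::density().
  dens <- do.call(stats::density, c(list(x = boot_stats), density_args))

  # Assumed definitions of the summary columns.
  summary_stats <- data.frame(
    nres   = length(boot_stats),
    mode   = dens$x[which.max(dens$y)],
    median = stats::median(boot_stats),
    mean   = mean(boot_stats),
    sd     = stats::sd(boot_stats),
    lCI    = unname(stats::quantile(boot_stats, 0.025)),
    uCI    = unname(stats::quantile(boot_stats, 0.975))
  )

  result <- list(dens = dens, stats = summary_stats)
  class(result) <- "sketch_boot"  # hypothetical class name
  result
}

# The 27 exact-bootstrap means for the sample [1, 2, 7].
data <- c(1, 2, 7)
idx <- as.matrix(expand.grid(rep(list(seq_along(data)), length(data))))
boot_means <- rowMeans(matrix(data[idx], ncol = length(data)))

res <- summarise_boot_stats(boot_means, density_args = list(adjust = 1.5))
res$stats
```

Forwarding `density_args` through do.call() lets the caller tune the smoothing (for example the bandwidth adjustment) without changing the helper.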

By providing these detailed outputs, the ‘exactamente’ package enables users to thoroughly investigate the characteristics of the bootstrap distribution and the behavior of the bootstrap estimator under different resampling methods.

Installation

You can install the released version of ‘exactamente’ from CRAN:

install.packages("exactamente")

You can install the development version of ‘exactamente’ from GitHub like so:

# install.packages("devtools")
devtools::install_github("mightymetrika/exactamente")

Example Usage

After installation, you can load the ‘exactamente’ package using the library function.

library(exactamente)

To use the core tools provided by ‘exactamente’, begin by creating a numeric vector of data. You can then call exact_bootstrap() and reg_bootstrap() to obtain objects holding summary statistics and density estimates of the respective bootstrap distributions. In the following example, we use the [1, 2, 7] sample vector and take the mean as our statistic of interest.

data <- c(1, 2, 7)

# Run exact bootstrap
e_res <- exact_bootstrap(data)

# Run regular bootstrap
set.seed(183)
r_res <- reg_bootstrap(data)

Both _bootstrap functions come with plot() and summary() methods for further investigation of the bootstrap results.

res <- list(e_res, r_res)

lapply(res, plot)
#> [[1]]

#> 
#> [[2]]

lapply(res, summary)
#> [[1]]
#>            Method nres     mode   median     mean      sd      lCI      uCI
#> 1 exact_bootstrap   27 3.303687 3.333333 3.333333 1.54422 1.216667 5.916667
#> 
#> [[2]]
#>          Method  nres     mode   median   mean       sd lCI uCI
#> 1 reg_bootstrap 10000 3.335788 3.333333 3.3368 1.518024   1   7

‘exactamente’ also includes the e_vs_r() function, which enables direct comparison between the two methods using the same data set.

set.seed(183)
comp_res <- e_vs_r(data)
comp_res$comp_plot

comp_res$summary_table
#>            Method  nres     mode   median     mean       sd      lCI      uCI
#> 1 exact_bootstrap    27 3.303687 3.333333 3.333333 1.544220 1.216667 5.916667
#> 2   reg_bootstrap 10000 3.335788 3.333333 3.336800 1.518024 1.000000 7.000000

Interactive Shiny App

The ‘exactamente’ package also includes an interactive Shiny app for visually exploring and comparing the bootstrap methods. Results are displayed as you change the inputs.

You can access this app with the following command:

exactamente_app()

References

Kisielinska, J. (2013). The exact bootstrap method shown on the example of the mean and variance estimation. Computational Statistics, 28, 1061–1077.