TL;DR: This package implements a different algorithm from the one in base R. It reduces the time complexity of computing Kendall's correlation coefficient from O(n^2) to O(n log n), which brings the runtime down from seconds or minutes to milliseconds in the benchmarks below. The package is written in C++ and uses cpp11 to export the functions to R. See the vignette for the mathematical details.
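The O(n log n) idea can be illustrated with a short C++ sketch. This is not the package's source code: it is a minimal illustration that assumes neither vector contains ties, and the function names `count_inversions` and `kendall_tau_no_ties` are hypothetical. It sorts the pairs by x, counts the inversions left in y with a merge sort (each inversion is a discordant pair), and converts that count into the coefficient.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Count inversions in v[lo, hi) with a merge sort; v[lo, hi) is sorted on return.
// (Hypothetical helper for illustration only.)
static long long count_inversions(std::vector<double>& v, std::vector<double>& buf,
                                  std::size_t lo, std::size_t hi) {
  if (hi - lo < 2) return 0;
  const std::size_t mid = lo + (hi - lo) / 2;
  long long inv = count_inversions(v, buf, lo, mid) +
                  count_inversions(v, buf, mid, hi);
  std::size_t i = lo, j = mid, k = lo;
  while (i < mid && j < hi) {
    if (v[j] < v[i]) {
      // v[j] is smaller than every remaining element of the left half.
      inv += static_cast<long long>(mid - i);
      buf[k++] = v[j++];
    } else {
      buf[k++] = v[i++];
    }
  }
  while (i < mid) buf[k++] = v[i++];
  while (j < hi) buf[k++] = v[j++];
  std::copy(buf.begin() + static_cast<std::ptrdiff_t>(lo),
            buf.begin() + static_cast<std::ptrdiff_t>(hi),
            v.begin() + static_cast<std::ptrdiff_t>(lo));
  return inv;
}

// Kendall's tau for inputs WITHOUT ties (assumption of this sketch).
double kendall_tau_no_ties(const std::vector<double>& x,
                           const std::vector<double>& y) {
  assert(x.size() == y.size() && x.size() >= 2);
  const std::size_t n = x.size();

  // Order the observations by x: O(n log n).
  std::vector<std::size_t> order(n);
  std::iota(order.begin(), order.end(), std::size_t{0});
  std::sort(order.begin(), order.end(),
            [&x](std::size_t a, std::size_t b) { return x[a] < x[b]; });

  // Rearrange y in that order; every inversion is a discordant pair.
  std::vector<double> y_sorted(n), buf(n);
  for (std::size_t i = 0; i < n; ++i) y_sorted[i] = y[order[i]];
  const long long discordant = count_inversions(y_sorted, buf, 0, n);

  // With no ties, concordant + discordant = n(n-1)/2,
  // so tau = 1 - 2 * discordant / (n(n-1)/2).
  const double pairs = static_cast<double>(n) * static_cast<double>(n - 1) / 2.0;
  return 1.0 - 2.0 * static_cast<double>(discordant) / pairs;
}
```

Real data usually contain ties, which need the extra corrections described in the vignette; the sketch only shows why sorting plus inversion counting avoids comparing all n(n-1)/2 pairs directly.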
If this software is useful to you, please consider donating on Buy Me A Coffee. All donations will be used to continue improving kendallknight.
You can install the released version of kendallknight from CRAN with:
install.packages("kendallknight")
You can install the development version of kendallknight like so:
remotes::install_github("pachadotdev/kendallknight")
See the documentation: https://pacha.dev/kendallknight/.
We tested the kendallknight package against the base R implementation of the Kendall correlation, using the cor function with method = "kendall", for randomly generated vectors of different lengths. The results are shown in the following tables:
Number of observations | kendallknight median time (s) | base R median time (s) |
---|---|---|
10,000 | 0.003 | 1.251 |
20,000 | 0.010 | 5.313 |
30,000 | 0.011 | 11.002 |
40,000 | 0.014 | 19.578 |
50,000 | 0.017 | 30.509 |
60,000 | 0.021 | 43.670 |
70,000 | 0.024 | 61.310 |
80,000 | 0.029 | 77.993 |
90,000 | 0.031 | 98.614 |
100,000 | 0.035 | 121.552 |
Number of observations | kendallknight memory allocation (MB) | base R memory allocation (MB) |
---|---|---|
10,000 | 1.257 | 0.812 |
20,000 | 2.061 | 1.450 |
30,000 | 3.091 | 2.175 |
40,000 | 4.121 | 2.900 |
50,000 | 5.151 | 3.625 |
60,000 | 6.181 | 4.350 |
70,000 | 7.211 | 5.074 |
80,000 | 8.241 | 5.799 |
90,000 | 9.271 | 6.524 |
100,000 | 10.301 | 7.249 |
To avoid distorted results, we used the bench package to run the benchmarks in a clean R session on the Niagara supercomputer cluster, where, unlike on personal computers, background processes (e.g., automatic updates) do not interfere with the timings.
The package uses testthat for testing [@wickham2011]. The included tests cover the complete code and check for correctness by comparing results with the base R implementation, checking corner cases, and forcing errors by passing unusable input data to the user-visible functions.
Please note that the kendallknight project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.