References

Alonso, J. C. (2022). Empezando a transformar bases de datos con R y dplyr. Universidad Icesi. https://doi.org/10.18046/EUI/bda.h.2

Alonso, J. C. (2024). Introducción al modelo clásico de regresión para científico de datos en R. Universidad Icesi. https://doi.org/XXXX

Alonso, J. C., & Arboleda, A. M. (2024). Introducción al análisis de canastas de compra para analytics translators y científicos de datos (empleando R). Universidad Icesi. https://doi.org/10.18046/EUI/bda.h.1

Alonso, J. C., & Arboleda, A. M. (2025). Introducción al análisis de canastas de compra para analytics translators y científicos de datos (empleando R). Universidad Icesi.

Alonso, J. C., & Largo, M. F. (2023). Empezando a visualizar datos con R y ggplot2 (2. ed.). Universidad Icesi. https://doi.org/10.18046/EUI/bda.h.3.2

Alonso, J. C., & Ocampo, M. P. (2022). Empezando a usaR: Una guía paso a paso. Universidad Icesi. https://doi.org/10.18046/EUI/bda.h.1

Aquino, J. (2023). descr: Descriptive statistics. https://CRAN.R-project.org/package=descr

Batista, G. E., Prati, R. C., & Monard, M. C. (2004). A study of the behavior of several methods for balancing machine learning training data. ACM SIGKDD Explorations Newsletter, 6(1), 20–29.

Breiman, L., Friedman, J., Olshen, R., & Stone, C. (1984). Classification and regression trees. Wadsworth & Brooks/Cole Statistics/Probability Series.

Chawla, N. V., Bowyer, K. W., Hall, L. O., & Kegelmeyer, W. P. (2002). SMOTE: Synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, 16, 321–357.

Cox, D. R., & Snell, E. J. (1989). Analysis of binary data (Vol. 32). CRC Press.

Dasarathy, B. V. (1991). Nearest neighbor (NN) norms: NN pattern classification techniques. IEEE Computer Society Tutorial.

Fernihough, A. (2019). mfx: Marginal effects, odds ratios and incidence rate ratios for GLMs. https://CRAN.R-project.org/package=mfx

Firke, S. (2023). janitor: Simple tools for examining and cleaning dirty data. https://CRAN.R-project.org/package=janitor

García, S., Fernández, A., Luengo, J., & Herrera, F. (2010). Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: Experimental analysis of power. Information Sciences, 180(10), 2044–2064.

He, H., & Garcia, E. A. (2009). Learning from imbalanced data. IEEE Transactions on Knowledge and Data Engineering, 21(9), 1263–1284.

Kaplan, J. (2023). fastDummies: Fast creation of dummy (binary) columns and rows from categorical variables. https://CRAN.R-project.org/package=fastDummies

Khan, M. R. A., & Brandenburger, T. (2020). ROCit: Performance assessment of binary classifier with visualization. https://CRAN.R-project.org/package=ROCit

Kuhn, M. (2008). Building predictive models in R using the caret package. Journal of Statistical Software, 28(5), 1–26. https://doi.org/10.18637/jss.v028.i05

Lall, U., & Sharma, A. (1996). A nearest neighbor bootstrap for resampling hydrologic time series. Water Resources Research, 32(3), 679–693.

Leeper, T. J. (2021). margins: Marginal effects for model objects.

Liaw, A., & Wiener, M. (2002). Classification and regression by randomForest. R News, 2(3), 18–22. https://CRAN.R-project.org/doc/Rnews/

Majka, M. (2019). naivebayes: High performance implementation of the Naive Bayes algorithm in R. https://CRAN.R-project.org/package=naivebayes

McFadden, D. (1972). Conditional logit analysis of qualitative choice behavior.

Meyer, D., Dimitriadou, E., Hornik, K., Weingessel, A., & Leisch, F. (2023). e1071: Misc functions of the Department of Statistics, Probability Theory Group (formerly: E1071), TU Wien. https://CRAN.R-project.org/package=e1071

Milborrow, S. (2022). rpart.plot: Plot ’rpart’ models: An enhanced version of ’plot.rpart’. https://CRAN.R-project.org/package=rpart.plot

Moro, S., Cortez, P., & Rita, P. (2014). A data-driven approach to predict the success of bank telemarketing. Decision Support Systems, 62, 22–31.

Nagelkerke, N. J. D. (1991). A note on a general definition of the coefficient of determination. Biometrika, 78(3), 691–692.

Ng, A. (2014). Machine learning lecture notes. Stanford, CA: Stanford University. Retrieved from http://cs229. stanford ….

R Core Team. (2023). R: A language and environment for statistical computing. R Foundation for Statistical Computing. https://www.R-project.org/

Raza, A. (2023). Superstore marketing campaign dataset. Kaggle. https://www.kaggle.com/datasets/ahsan81/superstore-marketing-campaign-dataset

Therneau, T., & Atkinson, B. (2022). rpart: Recursive partitioning and regression trees. https://CRAN.R-project.org/package=rpart

Venables, W. N., & Ripley, B. D. (2002). Modern applied statistics with S (4th ed.). Springer. https://www.stats.ox.ac.uk/pub/MASS4/

Wang, S., Mathew, A., Chen, Y., Xi, L., Ma, L., & Lee, J. (2009). Empirical analysis of support vector machine ensemble classifiers. Expert Systems with Applications, 36(3), 6466–6476.

Weiss, G. M., & Provost, F. (2003). Learning when training data are costly: The effect of class distribution on tree induction. Journal of Artificial Intelligence Research, 19, 315–354.

Wickham, H. (2016). ggplot2: Elegant graphics for data analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org

Wickham, H., François, R., Henry, L., & Müller, K. (2021). dplyr: A grammar of data manipulation. https://CRAN.R-project.org/package=dplyr

Youden, W. J. (1950). Index for rating diagnostic tests. Cancer, 3(1), 32–35.