The goal of orbital is to enable running predictions of tidymodels workflows inside databases.
To install it, use:
install.packages("orbital")
You can install the development version of orbital from GitHub with:
# install.packages("devtools")
devtools::install_github("tidymodels/orbital")
Given a fitted workflow:
library(tidymodels)
rec_spec <- recipe(mpg ~ ., data = mtcars) |>
  step_normalize(all_numeric_predictors())
lm_spec <- linear_reg()
wf_spec <- workflow(rec_spec, lm_spec)
wf_fit <- fit(wf_spec, mtcars)
You can predict with it as usual:
predict(wf_fit, mtcars)
#> # A tibble: 32 × 1
#> .pred
#> <dbl>
#> 1 22.6
#> 2 22.1
#> 3 26.3
#> 4 21.2
#> 5 17.7
#> 6 20.4
#> 7 14.4
#> 8 22.5
#> 9 24.4
#> 10 18.7
#> # ℹ 22 more rows
We can get the same results by first creating an orbital object
library(orbital)
orbital_obj <- orbital(wf_fit)
orbital_obj
#>
#> ── orbital Object ──────────────────────────────────────────────────────────────
#> • cyl = (cyl - 6.1875) / 1.785922
#> • disp = (disp - 230.7219) / 123.9387
#> • hp = (hp - 146.6875) / 68.56287
#> • drat = (drat - 3.596562) / 0.5346787
#> • wt = (wt - 3.21725) / 0.9784574
#> • qsec = (qsec - 17.84875) / 1.786943
#> • vs = (vs - 0.4375) / 0.5040161
#> • am = (am - 0.40625) / 0.4989909
#> • gear = (gear - 3.6875) / 0.7378041
#> • carb = (carb - 2.8125) / 1.6152
#> • .pred = 20.09062 + (cyl * -0.199024) + (disp * 1.652752) + (hp * -1.472 ...
#> ────────────────────────────────────────────────────────────────────────────────
#> 11 equations in total.
and then “predicting” with it using predict():
predict(orbital_obj, as_tibble(mtcars))
#> # A tibble: 32 × 1
#> .pred
#> <dbl>
#> 1 22.6
#> 2 22.1
#> 3 26.3
#> 4 21.2
#> 5 17.7
#> 6 20.4
#> 7 14.4
#> 8 22.5
#> 9 24.4
#> 10 18.7
#> # ℹ 22 more rows
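The equations printed for the orbital object are plain arithmetic: each predictor is normalized as (x - mean) / sd, and .pred is an intercept plus a weighted sum of the normalized predictors. The following is a minimal base R sketch of that idea, not orbital's implementation. It assumes the default linear_reg() engine is an ordinary least squares fit, and all object names (predictors, means, sds, scaled, manual_pred) are illustrative.

```r
# Base R only; the object names below are illustrative and not part of orbital.
predictors <- setdiff(names(mtcars), "mpg")

# Per-column mean and standard deviation, as in the normalization equations.
means <- sapply(mtcars[predictors], mean)
sds <- sapply(mtcars[predictors], sd)

# Apply (x - mean) / sd to every predictor.
scaled <- as.data.frame(scale(mtcars[predictors], center = means, scale = sds))
scaled$mpg <- mtcars$mpg

# Fit an ordinary linear model on the normalized predictors.
fit <- lm(mpg ~ ., data = scaled)
coefs <- coef(fit)

# Rebuild the prediction as intercept + sum(coefficient * predictor).
manual_pred <- coefs[["(Intercept)"]] +
  as.vector(as.matrix(scaled[predictors]) %*% coefs[predictors])

# Compare the hand-built predictions with predict().
all.equal(manual_pred, unname(predict(fit, scaled)))
```

Because the whole prediction reduces to expressions of this kind, it can be evaluated anywhere that supports basic arithmetic on columns, which is what makes running it inside a database possible.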
You can also predict in most SQL databases:
library(DBI)
library(RSQLite)
con <- dbConnect(SQLite(), path = ":memory:")
db_mtcars <- copy_to(con, mtcars)
predict(orbital_obj, db_mtcars)
#> # Source: SQL [?? x 1]
#> # Database: sqlite 3.47.0 []
#> .pred
#> <dbl>
#> 1 22.6
#> 2 22.1
#> 3 26.3
#> 4 21.2
#> 5 17.7
#> 6 20.4
#> 7 14.4
#> 8 22.5
#> 9 24.4
#> 10 18.7
#> # ℹ more rows
Prediction also works on Spark tables:
library(sparklyr)
#>
#> Attaching package: 'sparklyr'
#> The following object is masked from 'package:purrr':
#>
#> invoke
#> The following object is masked from 'package:stats':
#>
#> filter
sc <- spark_connect(master = "local")
sc_mtcars <- copy_to(sc, mtcars, overwrite = TRUE)
predict(orbital_obj, sc_mtcars)
#> # Source: SQL [?? x 1]
#> # Database: spark_connection
#> .pred
#> <dbl>
#> 1 22.6
#> 2 22.1
#> 3 26.3
#> 4 21.2
#> 5 17.7
#> 6 20.4
#> 7 14.4
#> 8 22.5
#> 9 24.4
#> 10 18.7
#> # ℹ more rows
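To see what is sent to a database backend, it can help to look at the generated SQL expressions rather than the predictions. The sketch below assumes that the installed version of orbital exports orbital_sql() taking an orbital object and a connection, and that dbplyr is available. It refits the workflow from above so that it runs on its own, and sql_exprs is an illustrative name.

```r
library(tidymodels)
library(orbital)

# Refit the same workflow so this sketch is self-contained.
rec_spec <- recipe(mpg ~ ., data = mtcars) |>
  step_normalize(all_numeric_predictors())
wf_fit <- fit(workflow(rec_spec, linear_reg()), mtcars)
orbital_obj <- orbital(wf_fit)

# An in-memory SQLite connection, used only to pick the SQL dialect.
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

# Assumption: orbital_sql(x, con) translates the equations to SQL expressions.
sql_exprs <- orbital_sql(orbital_obj, con)
sql_exprs

DBI::dbDisconnect(con)
```

The exact SQL text depends on the connection's dialect, so the output for SQLite may differ from what Spark or another backend would receive.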
A full list of supported models and recipe steps can be found in vignette("supported-models").
This project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
For questions and discussions about tidymodels packages, modeling, and machine learning, please post on Posit Community.
If you think you have encountered a bug, please submit an issue.
Either way, learn how to create and share a reprex (a minimal, reproducible example), to clearly communicate about your code.
Check out further details on contributing guidelines for tidymodels packages and how to get help.