The nimbleMacros package has two goals: to provide macros that are widely useful in their own right, and to serve as a starting point and example for NIMBLE users who want to write their own macros.
To begin, load the nimbleMacros package, which will also load the NIMBLE package:
library(nimbleMacros)
Macro functionality is currently enabled by default in NIMBLE, and loading nimbleMacros will also activate it automatically. If you need to enable it yourself, you can set the option manually:
nimbleOptions(enableMacros = TRUE)
We’ll begin by showing off the macros available in the package, and then discuss building your own macros at the end.
LM: A macro for complete linear models

The LM macro generates complete NIMBLE model code for a variety of linear, generalized linear, and generalized linear mixed models. It uses syntax very similar to that of R functions like lm, glm, lme4::lmer, and lme4::glmer.
mtcars
Here’s a logistic regression in R using the built-in mtcars dataset. We’ll model transmission type (am, where am = 1 is manual transmission) as a function of standardized miles per gallon.
rmod <- glm(am ~ scale(mpg), data = mtcars, family = binomial)
summary(rmod)
##
## Call:
## glm(formula = am ~ scale(mpg), family = binomial, data = mtcars)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.4351 0.4543 -0.958 0.33816
## scale(mpg) 1.8504 0.6921 2.673 0.00751 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 43.230 on 31 degrees of freedom
## Residual deviance: 29.675 on 30 degrees of freedom
## AIC: 33.675
##
## Number of Fisher Scoring iterations: 5
To create this model with the LM macro, we’ll start by organizing the data and constants into a list.
const <- mtcars[c("am", "mpg")]
Next we’ll specify the prior distributions we want to use for the intercept and the slope coefficient(s). This is done using the setPriors function. We’ll specify a uniform(-10, 10) distribution for the intercept and a normal distribution with mean 0 and precision 0.1 for the slope coefficient. Note that these distribution definitions are wrapped in quote; you can also supply them as strings.
pr <- setPriors(intercept = quote(dunif(-10,10)),
coefficient = quote(dnorm(0, 0.1)))
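As noted above, the distribution definitions can also be supplied as strings; the following call should be equivalent to the one using quote:

pr_str <- setPriors(intercept = "dunif(-10, 10)",
                    coefficient = "dnorm(0, 0.1)")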
Finally, we specify the NIMBLE code, which is a single call to the LM macro that should look similar to our call to glm above. Two things to note:

1. We can keep scale in the formula as with glm(); LM will do the scaling automatically.
2. There is no need to specify dimensions or indices for the variables provided to LM, as this is handled automatically, with LM assuming all data and constants are vectors of the same length. In more complex situations you can use the LINPRED macro; see the next section.

code <- nimbleCode({
LM(am ~ scale(mpg), family = binomial, priors = pr)
})
We create a NIMBLE model with this input code, and show what the final code looks like after macro expansion.
mod <- nimbleModel(code = code, constants = const)
mod$getCode()
## {
## "# LM"
## " ## FORLOOP"
## for (i_1 in 1:32) {
## am[i_1] ~ dbinom(mu_[i_1], size = 1)
## }
## " ## ----"
## " ## LINPRED"
## " ### nimbleMacros::FORLOOP"
## for (i_2 in 1:32) {
## logit(mu_[i_2]) <- beta_Intercept + beta_mpg_scaled *
## mpg_scaled[i_2]
## }
## " ### ----"
## " ### nimbleMacros::LINPRED_PRIORS"
## beta_Intercept ~ dunif(-10, 10)
## beta_mpg_scaled ~ dnorm(0, 0.1)
## " ### ----"
## " ## ----"
## "# ----"
## }
Finally, we’ll fit the model using MCMC, which yields similar results to glm.
samples <- nimbleMCMC(mod, nchain = 1, niter = 3000, nburnin = 2000,
samplesAsCodaMCMC = TRUE)
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
summary(samples)
##
## Iterations = 1:1000
## Thinning interval = 1
## Number of chains = 1
## Sample size per chain = 1000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## beta_Intercept -0.4651 0.4693 0.01484 0.03274
## beta_mpg_scaled 2.0391 0.7063 0.02234 0.04881
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## beta_Intercept -1.4296 -0.7595 -0.4548 -0.134 0.4101
## beta_mpg_scaled 0.7872 1.5560 2.0039 2.464 3.5055
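As a rough check, we can compare the posterior means with the maximum likelihood estimates from glm above; the exact values will differ a little between MCMC runs (output not shown):

# Posterior means from the MCMC samples vs. glm point estimates
colMeans(as.matrix(samples))
coef(rmod)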
ChickWeight
The LM macro can also generate code for random effects models. To illustrate this we will use the built-in ChickWeight dataset, which contains repeated measurements of chick weights over time. First, fit the model in R using the lme4 package:
library(lme4)
rmod <- lmer(weight ~ Time + (1|Chick), data=ChickWeight)
summary(rmod)
## Linear mixed model fit by REML ['lmerMod']
## Formula: weight ~ Time + (1 | Chick)
## Data: ChickWeight
##
## REML criterion at convergence: 5619.4
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.0086 -0.5528 -0.0888 0.4975 3.4588
##
## Random effects:
## Groups Name Variance Std.Dev.
## Chick (Intercept) 717.9 26.79
## Residual 799.4 28.27
## Number of obs: 578, groups: Chick, 50
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 27.8451 4.3877 6.346
## Time 8.7261 0.1755 49.716
##
## Correlation of Fixed Effects:
## (Intr)
## Time -0.422
To fit this model with NIMBLE using LM, we’ll first organize the ChickWeight data into a list of constants and data for NIMBLE.
chick_const <- list(weight = ChickWeight$weight, Time = ChickWeight$Time, Chick = ChickWeight$Chick)
We can create the model with LM using the same formula notation as lme4, with the (1|Chick) part of the formula indicating random intercepts by chick.
code <- nimbleCode({
LM(weight ~ Time + (1|Chick))
})
Next we create the model object and show the expanded code.
mod <- nimbleModel(code = code, constants = chick_const)
mod$getCode()
## {
## "# LM"
## " ## FORLOOP"
## for (i_1 in 1:578) {
## weight[i_1] ~ dnorm(mu_[i_1], sd = sd_residual)
## }
## " ## ----"
## " ## LINPRED"
## " ### nimbleMacros::FORLOOP"
## for (i_2 in 1:578) {
## mu_[i_2] <- beta_Intercept + beta_Time * Time[i_2] +
## beta_Chick[Chick[i_2]]
## }
## " ### ----"
## " ### nimbleMacros::LINPRED_PRIORS"
## beta_Intercept ~ dnorm(0, sd = 1000)
## beta_Time ~ dnorm(0, sd = 1000)
## sd_Chick ~ dunif(0, 100)
## " #### nimbleMacros::FORLOOP"
## for (i_3 in 1:50) {
## beta_Chick[i_3] ~ dnorm(0, sd = sd_Chick)
## }
## " #### ----"
## " ### ----"
## " ## ----"
## sd_residual ~ dunif(0, 100)
## "# ----"
## }
Fit the model:
samples <- nimbleMCMC(mod, nchain = 1, niter = 3000, nburnin = 2000,
samplesAsCodaMCMC = TRUE)
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
summary(samples)
##
## Iterations = 1:1000
## Thinning interval = 1
## Number of chains = 1
## Sample size per chain = 1000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## beta_Intercept 28.294 5.1556 0.163035 0.95444
## beta_Time 8.727 0.1888 0.005971 0.01540
## sd_Chick 27.444 3.1085 0.098299 0.25186
## sd_residual 28.376 0.9509 0.030069 0.06735
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## beta_Intercept 18.807 24.475 28.33 31.810 38.099
## beta_Time 8.375 8.593 8.73 8.858 9.098
## sd_Chick 21.609 25.371 27.27 29.478 34.189
## sd_residual 26.579 27.802 28.31 29.031 30.293
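As before, the posterior means can be compared with the lme4 point estimates above (output not shown; MCMC values will vary slightly between runs):

# Fixed effects: NIMBLE posterior means vs. lme4 REML estimates
colMeans(as.matrix(samples))[c("beta_Intercept", "beta_Time")]
fixef(rmod)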
LINPRED
As the name suggests, the LINPRED macro generates a linear predictor from a provided R formula. It is used internally by the LM macro and can also be used separately. The macro acts something like the model.matrix function in R, although instead of creating a model matrix, it creates the NIMBLE model code for the linear predictor. The macro supports random effects using the same syntax as the lme4 package. To demonstrate the macro, we’ll use the built-in ChickWeight dataset again. In this dataset there are repeated weight measurements (weight) of chicks (Chick) over time (Time).
head(ChickWeight)
## weight Time Chick Diet
## 1 42 0 1 1
## 2 51 2 1 1
## 3 59 4 1 1
## 4 64 6 1 1
## 5 76 8 1 1
## 6 93 10 1 1
To create a linear predictor with time as a covariate in R, we would specify the formula as ~Time.
head(model.matrix(~Time, ChickWeight))
## (Intercept) Time
## 1 1 0
## 2 1 2
## 3 1 4
## 4 1 6
## 5 1 8
## 6 1 10
Based on the model matrix, the formula implies two parameters: the intercept and the coefficient (slope) for time. To generate the linear predictor NIMBLE model code, we start by setting up the constants/data list. We’ll use the chick_const list we used with LM, adding the sample size N so we can use it to define dimensions.
chick_const$N <- nrow(ChickWeight)
In the model code, we specify the parameter name for the linear predictor (mu) and calculate it with the LINPRED macro using the same formula as above. Here we also set the argument priors = NULL; we’ll come back to this later. Notice that unlike with LM, we explicitly define the dimensions of mu and Time.
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time[1:N], priors = NULL)
})
Next create the model object and view the code.
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## "# LINPRED"
## " ## nimbleMacros::FORLOOP"
## for (i_1 in 1:N) {
## mu[i_1] <- beta_Intercept + beta_Time * Time[i_1]
## }
## " ## ----"
## "# ----"
## }
NIMBLE expands the LINPRED macro into a for loop that calculates a value of mu for each sample in the dataset, where mu is a function of time. Notice that two parameters were created: beta_Intercept (the intercept) and beta_Time, the coefficient associated with time. We can get a list of the parameters created by macros in the model using the getMacroParameters() function:
mod$getMacroParameters()
## $LINPRED
## $LINPRED[[1]]
## [1] "beta_Intercept" "beta_Time"
##
##
## $`nimbleMacros::FORLOOP`
## $`nimbleMacros::FORLOOP`[[1]]
## NULL
When all the data and constants used by LINPRED have the same dimensions, you can specify the dimensions for the left-hand side only:
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time, priors = NULL)
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## "# LINPRED"
## " ## nimbleMacros::FORLOOP"
## for (i_1 in 1:N) {
## mu[i_1] <- beta_Intercept + beta_Time * Time[i_1]
## }
## " ## ----"
## "# ----"
## }
In more complicated situations you may have different dimensions for different data/constants. For example, we’ll convert Time into a one-column matrix, but keep mu as a vector.
chick_const2 <- chick_const
chick_const2$Time <- matrix(chick_const2$Time, ncol = 1)
str(chick_const2)
## List of 4
## $ weight: num [1:578] 42 51 59 64 76 93 106 125 149 171 ...
## $ Time : num [1:578, 1] 0 2 4 6 8 10 12 14 16 18 ...
## $ Chick : Ord.factor w/ 50 levels "18"<"16"<"15"<..: 15 15 15 15 15 15 15 15 15 15 ...
## $ N : int 578
To handle this we just need to specify the appropriate dimensions for each in the call to LINPRED.
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time[1:N,1], priors = NULL)
})
mod <- nimbleModel(code, chick_const2)
mod$getCode()
## {
## "# LINPRED"
## " ## nimbleMacros::FORLOOP"
## for (i_1 in 1:N) {
## mu[i_1] <- beta_Intercept + beta_Time * Time[i_1, 1]
## }
## " ## ----"
## "# ----"
## }
The LINPRED macro can automatically handle categorical covariates (what R calls factors). A categorical covariate can be provided in the constants as either an R factor or as a character vector/matrix/array (which will be converted automatically to a factor). For example, suppose we included a covariate for diet type (Diet) in the model. The Diet covariate is already coded as an R factor:
class(ChickWeight$Diet)
## [1] "factor"
levels(ChickWeight$Diet)
## [1] "1" "2" "3" "4"
chick_const$Diet <- ChickWeight$Diet
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time[1:N] + Diet[1:N], priors = NULL)
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## "# LINPRED"
## " ## nimbleMacros::FORLOOP"
## for (i_1 in 1:N) {
## mu[i_1] <- beta_Intercept + beta_Time * Time[i_1] + beta_Diet[Diet[i_1]]
## }
## " ## ----"
## "# ----"
## }
A new vector parameter beta_Diet is created with four elements, one per level of Diet. The Diet covariate itself is then converted to numeric and used as the index into this vector parameter. It isn’t obvious from this code, but the first element of beta_Diet will be fixed at 0, serving as the reference level; we’ll show this below.
We could also add a link function to the left-hand side of the linear predictor calculation. Suppose we wanted to force mu to be positive; we could use a log link function, specified with the link argument.
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time[1:N], priors = NULL, link = log)
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## "# LINPRED"
## " ## nimbleMacros::FORLOOP"
## for (i_1 in 1:N) {
## log(mu[i_1]) <- beta_Intercept + beta_Time * Time[i_1]
## }
## " ## ----"
## "# ----"
## }
The LINPRED macro supports a handful of functions commonly included in formulas: offset(), scale(), log(), and I(). For example, to build a linear predictor with the Time covariate scaled to have mean 0 and SD 1:
code <- nimbleCode({
mu[1:N] <- LINPRED(~scale(Time[1:N]), priors = NULL, link = log)
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## "# LINPRED"
## " ## nimbleMacros::FORLOOP"
## for (i_1 in 1:N) {
## log(mu[i_1]) <- beta_Intercept + beta_Time_scaled * Time_scaled[i_1]
## }
## " ## ----"
## "# ----"
## }
This adds a new constant Time_scaled to the constants:
head(mod$getConstants()$Time_scaled)
## [1] -1.5858774 -1.2899493 -0.9940213 -0.6980932 -0.4021652 -0.1062371
LINPRED uses a modular system to handle formula functions. You can add support for additional functions in LINPRED formulas by writing a function with a specific name, class, and output. See the package source code, specifically the file formulaFunction.R, for examples of how to create these functions.
If we do not set the priors argument manually, LINPRED will automatically generate a default set of priors for the parameters it creates (such as beta_Intercept and beta_Time above). Internally, LINPRED calls the LINPRED_PRIORS macro (described below).
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time[1:N] + Diet[1:N])
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## "# LINPRED"
## " ## nimbleMacros::FORLOOP"
## for (i_1 in 1:N) {
## mu[i_1] <- beta_Intercept + beta_Time * Time[i_1] + beta_Diet[Diet[i_1]]
## }
## " ## ----"
## " ## nimbleMacros::LINPRED_PRIORS"
## beta_Intercept ~ dnorm(0, sd = 1000)
## beta_Time ~ dnorm(0, sd = 1000)
## beta_Diet[1] <- 0
## beta_Diet[2] ~ dnorm(0, sd = 1000)
## beta_Diet[3] ~ dnorm(0, sd = 1000)
## beta_Diet[4] ~ dnorm(0, sd = 1000)
## " ## ----"
## "# ----"
## }
We now have priors for each parameter. Note that, as mentioned previously, the first value of the beta_Diet vector is fixed at 0, serving as the reference level.

Warning: you should check the default priors carefully to make sure they make sense for your problem. In particular, if the values in your data are such that the parameters of the linear predictor could be large in magnitude, the default priors could have a strong influence on your results.
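If the defaults aren’t appropriate, a priors object created with setPriors can be passed to LINPRED via the priors argument, just as we did with LM earlier. A sketch (the prior choices here are arbitrary, for illustration only):

pr_wide <- setPriors(intercept = quote(dnorm(0, sd = 10)),
                     coefficient = quote(dnorm(0, sd = 10)))
code <- nimbleCode({
  mu[1:N] <- LINPRED(~Time[1:N] + Diet[1:N], priors = pr_wide)
})
# mod <- nimbleModel(code, chick_const)  # beta_* priors would now be dnorm(0, sd = 10)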
We can also suppress the added macro comments to make things a little more concise.
nimbleOptions(enableMacroComments = FALSE)
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time[1:N])
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## for (i_1 in 1:N) {
## mu[i_1] <- beta_Intercept + beta_Time * Time[i_1]
## }
## beta_Intercept ~ dnorm(0, sd = 1000)
## beta_Time ~ dnorm(0, sd = 1000)
## }
The LINPRED macro supports random effects using lme4 syntax. For example, a random intercept by Chick:
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time[1:N] + (1|Chick[1:N]))
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## for (i_1 in 1:N) {
## mu[i_1] <- beta_Intercept + beta_Time * Time[i_1] + beta_Chick[Chick[i_1]]
## }
## beta_Intercept ~ dnorm(0, sd = 1000)
## beta_Time ~ dnorm(0, sd = 1000)
## sd_Chick ~ dunif(0, 100)
## for (i_2 in 1:50) {
## beta_Chick[i_2] ~ dnorm(0, sd = sd_Chick)
## }
## }
Note that by default all random effects have mean 0, the same approach used by lme4. We can also get both random intercepts and (time) slopes by Chick; by default these are correlated.
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time[1:N] + (Time[1:N]|Chick[1:N]))
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## for (i_1 in 1:N) {
## mu[i_1] <- beta_Intercept + beta_Time * Time[i_1] + beta_Chick[Chick[i_1]] +
## beta_Time_Chick[Chick[i_1]] * Time[i_1]
## }
## beta_Intercept ~ dnorm(0, sd = 1000)
## beta_Time ~ dnorm(0, sd = 1000)
## sd_Chick ~ dunif(0, 100)
## sd_Time_Chick ~ dunif(0, 100)
## re_sds_Chick[1] <- sd_Chick
## re_sds_Chick[2] <- sd_Time_Chick
## Ustar_Chick[1:2, 1:2] ~ dlkj_corr_cholesky(1, 2)
## U_Chick[1:2, 1:2] <- uppertri_mult_diag(Ustar_Chick[1:2,
## 1:2], re_sds_Chick[1:2])
## re_means_Chick[1] <- 0
## re_means_Chick[2] <- 0
## for (i_2 in 1:50) {
## B_Chick[i_2, 1:2] ~ dmnorm(re_means_Chick[1:2], cholesky = U_Chick[1:2,
## 1:2], prec_param = 0)
## beta_Chick[i_2] <- B_Chick[i_2, 1]
## beta_Time_Chick[i_2] <- B_Chick[i_2, 2]
## }
## }
The correlated random effects model here uses an LKJ prior. As with lme4, we can also specify uncorrelated random slopes and intercepts using || instead of |.
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time[1:N] + (Time[1:N]||Chick[1:N]))
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## for (i_1 in 1:N) {
## mu[i_1] <- beta_Intercept + beta_Time * Time[i_1] + beta_Chick[Chick[i_1]] +
## beta_Time_Chick[Chick[i_1]] * Time[i_1]
## }
## beta_Intercept ~ dnorm(0, sd = 1000)
## beta_Time ~ dnorm(0, sd = 1000)
## sd_Chick ~ dunif(0, 100)
## for (i_2 in 1:50) {
## beta_Chick[i_2] ~ dnorm(0, sd = sd_Chick)
## }
## sd_Time_Chick ~ dunif(0, 100)
## for (i_3 in 1:50) {
## beta_Time_Chick[i_3] ~ dnorm(0, sd = sd_Time_Chick)
## }
## }
LINPRED in a complete model

Up to this point we’ve generated just the code needed to calculate the linear predictor. Here’s a complete linear regression model of chick weight as a function of time making use of LINPRED (similar to the code generated by LM):
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time[1:N])
weight[1:N] ~ FORLOOP(dnorm(mu[1:N], sd = sigma))
sigma ~ dunif(0, 100)
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## for (i_1 in 1:N) {
## mu[i_1] <- beta_Intercept + beta_Time * Time[i_1]
## }
## beta_Intercept ~ dnorm(0, sd = 1000)
## beta_Time ~ dnorm(0, sd = 1000)
## for (i_2 in 1:N) {
## weight[i_2] ~ dnorm(mu[i_2], sd = sigma)
## }
## sigma ~ dunif(0, 100)
## }
LINPRED_PRIORS
This macro generates a set of priors for the parameters of a linear predictor, also based on an R formula. In most cases we won’t use this macro directly; rather, it will be called automatically when we use LINPRED (see the discussion of the priors argument above). However, if for some reason we want to generate code for only the priors, and not the corresponding linear predictor, we can use LINPRED_PRIORS separately as shown below. The formula ~Time implies two parameters as described above, and LINPRED_PRIORS will generate a prior for each depending on the type of the parameter (e.g. intercept vs. slope). It is not necessary to specify the dimensions in the formula when using LINPRED_PRIORS directly.
code <- nimbleCode({
LINPRED_PRIORS(~Time)
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## beta_Intercept ~ dnorm(0, sd = 1000)
## beta_Time ~ dnorm(0, sd = 1000)
## }
We can also manually set the values for different types of priors, such as intercepts and coefficients (slopes). See ?setPriors for more details.
pr <- setPriors(intercept = quote(dnorm(-5, 5)),
coefficient = quote(dnorm(0, sd = 5)))
code <- nimbleCode({
LINPRED_PRIORS(~Time + Diet, priors = pr)
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## beta_Intercept ~ dnorm(-5, 5)
## beta_Time ~ dnorm(0, sd = 5)
## beta_Diet[1] <- 0
## beta_Diet[2] ~ dnorm(0, sd = 5)
## beta_Diet[3] ~ dnorm(0, sd = 5)
## beta_Diet[4] ~ dnorm(0, sd = 5)
## }
Treating beta_Diet as a vector parameter makes sense in the NIMBLE model code, but it does not match the way categorical covariates are handled by model.matrix, i.e., as dummy variables:
colnames(model.matrix(~Time + Diet, ChickWeight))
## [1] "(Intercept)" "Time" "Diet2" "Diet3" "Diet4"
Here model.matrix has created several separate parameters (Diet2, Diet3, and Diet4) rather than a vector. We can replicate this structure in the NIMBLE model code by setting the LINPRED_PRIORS argument modelMatrixNames = TRUE.
code <- nimbleCode({
LINPRED_PRIORS(~Time + Diet, priors = pr, modelMatrixNames = TRUE)
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## beta_Intercept ~ dnorm(-5, 5)
## beta_Time ~ dnorm(0, sd = 5)
## beta_Diet[1] <- 0
## beta_Diet[2] <- beta_Diet2
## beta_Diet2 ~ dnorm(0, sd = 5)
## beta_Diet[3] <- beta_Diet3
## beta_Diet3 ~ dnorm(0, sd = 5)
## beta_Diet[4] <- beta_Diet4
## beta_Diet4 ~ dnorm(0, sd = 5)
## }
When this option is set, a few extra lines of NIMBLE model code are added to yield parameter names that match model.matrix.
FORLOOP
The FORLOOP macro is used by LM, LINPRED, and LINPRED_PRIORS, and is a shorthand way of creating for loops (possibly nested) from a single line of code. For example, suppose you want to calculate a linear predictor mu from N values of a covariate x. The NIMBLE model code would be:
nimbleCode({
  for (i in 1:N){
    mu[i] <- alpha + beta * x[i]
  }
})
The FORLOOP macro condenses this information into a single line, using the index information inside the parameter brackets:
code <- nimbleCode({
mu[1:N] <- FORLOOP(alpha + beta * x[1:N])
})
Here’s the expanded result, after creating some example values for the constants.
constants <- list(x = rnorm(5), N = 5)
mod <- nimbleModel(code, constants)
mod$getCode()
## {
## for (i_1 in 1:N) {
## mu[i_1] <- alpha + beta * x[i_1]
## }
## }
Note that one can use vectorization in NIMBLE model code directly, such as

mu[1:N] <- alpha + beta * x[1:N]

The distinction is that the vectorized version creates a single multivariate node, mu[1:N], while the loop creates N separate scalar nodes. Which is best will depend on the model structure and the algorithm used with the model, as discussed in the NIMBLE user manual.
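To see the difference concretely, you can inspect the nodes created in the FORLOOP model built above, for example with the model’s getNodeNames() method (output not shown):

# The loop version creates separate scalar nodes mu[1], mu[2], ..., mu[5],
# rather than a single multivariate node mu[1:5]
mod$getNodeNames()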
It may seem a bit trivial to make a macro for such a simple chunk of code. However, it is especially convenient when you have nested for loops, which are more complicated to write out.
constants <- list(x = matrix(rnorm(10), 5, 2), N = 5, J = 2)
code <- nimbleCode({
mu[1:N, 1:J] <- FORLOOP(alpha + beta * x[1:N, 1:J])
})
mod <- nimbleModel(code, constants)
mod$getCode()
## {
## for (i_1 in 1:N) {
## for (i_2 in 1:J) {
## mu[i_1, i_2] <- alpha + beta * x[i_1, i_2]
## }
## }
## }
Note that one situation where you can’t use this macro directly is when the same index is used in multiple nested loops. For example, suppose the parameter mu shown above was a square matrix.
constants <- list(x = matrix(rnorm(25), 5, 5), N = 5)
code <- nimbleCode({
mu[1:N, 1:N] <- FORLOOP(alpha + beta * x[1:N, 1:N])
})
mod <- nimbleModel(code, constants)
## Error : Not sure how to expand duplicated index 1:N in code:
## mu[1:N, 1:N]
## Error: Model macro FORLOOP failed.
The structure of a for loop with duplicated indices could be ambiguous in more complex situations, so FORLOOP returns an error. The solution in this case is simply to provide different index variables for each dimension, even if they have the same value.
constants <- list(x = matrix(rnorm(25), 5, 5), N = 5, K = 5)
code <- nimbleCode({
mu[1:N, 1:K] <- FORLOOP(alpha + beta * x[1:N, 1:K])
})
mod <- nimbleModel(code, constants)
mod$getCode()
## {
## for (i_1 in 1:N) {
## for (i_2 in 1:K) {
## mu[i_1, i_2] <- alpha + beta * x[i_1, i_2]
## }
## }
## }
Writing your own macros

Suppose you wanted to create a macro called MYDIST, which generates a normal distribution with mean 0 and a provided standard deviation. The macro would look like this in the NIMBLE model code:

y ~ MYDIST(sdev = 2)

Note that by convention, we define the macro name in all capital letters. All the macros provided in nimbleMacros follow this convention. We want the final, processed output to look like:

y ~ dnorm(0, sd = 2)
A macro is, fundamentally, a specially formatted R function that converts one line of code into another. For example:

fun <- function(input){
  lhs <- input[[2]]        # left-hand side of the assignment (y)
  sdev <- input[[3]]$sdev  # value of the sdev argument in the MYDIST() call
  output <- substitute(LHS ~ dnorm(0, sd = SDEV),
                       list(LHS = lhs, SDEV = sdev))
  output
}
Note that we have to do a bit of clunky code processing to extract the left-hand side of the assignment and the argument value. (Using [[ on a code object extracts pieces of the abstract syntax tree representation of the R expression.)
input <- quote(y ~ MYDIST(sdev = 2))
fun(input)
## y ~ dnorm(0, sd = 2)
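To make the [[ extraction more concrete, here is how the pieces of the quoted expression break down:

input <- quote(y ~ MYDIST(sdev = 2))
input[[1]]       # the operator: `~`
input[[2]]       # the left-hand side: y
input[[3]]       # the right-hand side: MYDIST(sdev = 2)
input[[3]]$sdev  # the value of the sdev argument: 2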
In practice, nimble expects the macro converter function (fun in the example above) to have two additional arguments: modelInfo and .env. When processing a macro, nimble provides a named list to the modelInfo argument containing additional model information. Most importantly, this list includes constants, which can be used and/or changed by the macro. And, as you can probably guess, nimble provides the R environment in which nimbleModel was called to the .env argument, to help with scoping issues.

In addition to these two arguments, the function also needs to return output with a particular structure: a named list with two elements, code, which contains the output code (as in the original function above), and modelInfo, the list of model information, which we can modify or leave as-is.
fun <- function(input_code, modelInfo, .env){
  LHS <- input_code[[2]]
  sdev <- input_code[[3]]$sdev
  output_code <- substitute(LHS ~ dnorm(0, sd = SDEV),
                            list(LHS = LHS, SDEV = sdev))
  # Modify the constants
  modelInfo$constants$new <- 1
  list(code = output_code, modelInfo = modelInfo)
}
input <- quote(y ~ MYDIST(sdev = 2))
fun(input, modelInfo = list(constants = NULL), .env = environment())
## $code
## y ~ dnorm(0, sd = 2)
##
## $modelInfo
## $modelInfo$constants
## $modelInfo$constants$new
## [1] 1
Finally, nimble needs a way to know that this function is intended to be a macro, and that it is called MYDIST. The final step is to create a wrapper list with our function inserted as an element called process, and to assign the class "model_macro". (In the next section we’ll see that the helper function buildMacro will create this wrapper list for us.)
MYDIST <- list(process = fun)
class(MYDIST) <- "model_macro"
Now, we can test the macro in a NIMBLE model:
code <- nimbleCode({
y ~ MYDIST(sdev = 2)
})
mod <- nimbleModel(code = code, data = list(y = 1))
mod$getCode()
## {
## y ~ dnorm(0, sd = 2)
## }
Using buildMacro to create the macro

Manually creating the macro is fine, but it gets complicated if we want to extract a lot of separate pieces of information from the code line, e.g., if our macro has many arguments. As an alternative, we can use the buildMacro function included with nimble, which handles some of this code processing for us. The first argument to buildMacro is a function structured like the one we wrote above (fun). The other two arguments, use3pieces and unpackArgs, determine how the arguments of fun should be structured.
First, suppose both use3pieces and unpackArgs are set to FALSE. In this case, the first argument to fun is the entire original line of code containing the macro, i.e., y ~ MYDIST(sdev = 2). This is identical to how we wrote the function above, so all buildMacro will do is create the required macro list structure and assign the class for us.
MYDIST2 <- buildMacro(fun, use3pieces = FALSE, unpackArgs = FALSE)
code <- nimbleCode({
y ~ MYDIST2(sdev = 2)
})
mod <- nimbleModel(code = code, data = list(y = 1))
mod$getCode()
## {
## y ~ dnorm(0, sd = 2)
## }
Now we will set use3pieces = TRUE. In this case, buildMacro will automatically do some initial processing on the code line for us. Specifically, it will identify whether the assignment is stochastic (stoch, TRUE if ~ and FALSE if <-) and separate out the left-hand side (LHS) and right-hand side (RHS) of the assignment. With these settings, our function’s first three arguments need to be stoch, LHS, and RHS, in that order (plus we keep the always-required modelInfo and .env).
fun3 <- function(stoch, LHS, RHS, modelInfo, .env){
  sdev <- RHS$sdev  # extract the sdev argument from the right-hand side
  output_code <- substitute(LHS ~ dnorm(0, sd = SDEV),
                            list(LHS = LHS, SDEV = sdev))
  # Modify the constants
  modelInfo$constants$new <- 1
  list(code = output_code, modelInfo = modelInfo)
}
We do a little less code processing this time, since buildMacro has already separated the LHS and RHS for us. Note that in this simple example we aren’t using stoch, because we want our output code to always be stochastic.
MYDIST3 <- buildMacro(fun3, use3pieces = TRUE, unpackArgs = FALSE)
code <- nimbleCode({
y ~ MYDIST3(sdev = 2)
})
mod <- nimbleModel(code = code, data = list(y = 1))
mod$getCode()
## {
## y ~ dnorm(0, sd = 2)
## }
Finally, consider the case where we also set unpackArgs = TRUE. Now buildMacro will dig into the right-hand side of the macro call and extract the argument values for us (in this case just sdev). Thus, we need to include each of these arguments explicitly as arguments to our function.
fun4 <- function(stoch, LHS, sdev, modelInfo, .env){
  output_code <- substitute(LHS ~ dnorm(0, sd = SDEV),
                            list(LHS = LHS, SDEV = sdev))
  # Modify the constants
  modelInfo$constants$new <- 1
  list(code = output_code, modelInfo = modelInfo)
}
Even less code processing is required in this case, since buildMacro splits out the sdev argument for us.
MYDIST4 <- buildMacro(fun4, use3pieces = TRUE, unpackArgs = TRUE)
code <- nimbleCode({
y ~ MYDIST4(sdev = 2)
})
mod <- nimbleModel(code = code, data = list(y = 1))
mod$getCode()
## {
## y ~ dnorm(0, sd = 2)
## }
The choice of arguments will depend on the macro: sometimes the shortcuts enabled by buildMacro will be useful; other times you may need more manual control over how the input code line is processed. See ?buildMacro for additional documentation and examples, including an example with use3pieces = FALSE and unpackArgs = TRUE. You may also find the source code for the macros included in nimbleMacros a useful resource for writing your own.