The nimbleMacros package has two goals: to provide macros that are widely useful in their own right, and to serve as a starting point and example for NIMBLE users who want to write their own macros.
To begin, load the nimbleMacros package, which will also load the NIMBLE package:
library(nimbleMacros)
Macro functionality is currently enabled by default in NIMBLE, and loading nimbleMacros will also activate it automatically. If you need to enable it yourself, you can set the option manually:
nimbleOptions(enableMacros = TRUE)
We’ll begin by showing off the macros available in the package, and then discuss building your own macros at the end.
LM: A macro for complete linear models

The LM macro generates complete NIMBLE model code for a variety of linear, generalized linear, and generalized linear mixed models. It uses syntax very similar to that of R functions like lm, glm, lme4::lmer, and lme4::glmer.
mtcars
Here’s a logistic regression in R using the built-in mtcars dataset. We’ll model transmission type (am, where am = 1 is manual transmission) as a function of standardized miles per gallon.
rmod <- glm(am ~ scale(mpg), data = mtcars, family = binomial)
summary(rmod)
##
## Call:
## glm(formula = am ~ scale(mpg), family = binomial, data = mtcars)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.4351 0.4543 -0.958 0.33816
## scale(mpg) 1.8504 0.6921 2.673 0.00751 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 43.230 on 31 degrees of freedom
## Residual deviance: 29.675 on 30 degrees of freedom
## AIC: 33.675
##
## Number of Fisher Scoring iterations: 5
To create this model with the LM macro, we’ll start by organizing the data and constants into a list.
const <- mtcars[c("am", "mpg")]
Next we’ll specify the prior distributions we want to use for the intercept and the slope coefficient(s). This is done using the setPriors function. We’ll specify a uniform(-10, 10) distribution for the intercept and a normal distribution with mean 0 and precision 0.1 for the slope coefficient. Note that these distribution definitions are wrapped in quote; you can also supply them as strings.
pr <- setPriors(intercept = quote(dunif(-10,10)),
coefficient = quote(dnorm(0, 0.1)))
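As noted above, the distribution definitions can also be supplied as strings; the following call should be equivalent to the one using quote:

pr_str <- setPriors(intercept = "dunif(-10, 10)",
                    coefficient = "dnorm(0, 0.1)")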
Finally, we specify the NIMBLE code, which is a single call to the LM macro that should look similar to our call to glm above. Two things to note:

1. We can keep scale in the formula as with glm(); LM will do the scaling automatically.
2. There is no need to specify dimensions or indices for the variables provided to LM, as this is handled automatically, with LM assuming all data and constants are vectors of the same length. In more complex situations you can use the LINPRED macro; see the next section.

code <- nimbleCode({
LM(am ~ scale(mpg), family = binomial, priors = pr)
})
We create a NIMBLE model with this input code, and show what the final code looks like after macro expansion.
mod <- nimbleModel(code = code, constants = const)
mod$getCode()
## {
## "# LM"
## " ## FORLOOP"
## for (i_1 in 1:32) {
## am[i_1] ~ dbinom(mu_[i_1], size = 1)
## }
## " ## ----"
## " ## LINPRED"
## " ### nimbleMacros::FORLOOP"
## for (i_2 in 1:32) {
## logit(mu_[i_2]) <- beta_Intercept + beta_mpg_scaled *
## mpg_scaled[i_2]
## }
## " ### ----"
## " ### nimbleMacros::LINPRED_PRIORS"
## beta_Intercept ~ dunif(-10, 10)
## beta_mpg_scaled ~ dnorm(0, 0.1)
## " ### ----"
## " ## ----"
## "# ----"
## }
Finally, we’ll fit the model using MCMC, which yields similar results to glm.
samples <- nimbleMCMC(mod, nchain = 1, niter = 3000, nburnin = 2000,
samplesAsCodaMCMC = TRUE)
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
summary(samples)
##
## Iterations = 1:1000
## Thinning interval = 1
## Number of chains = 1
## Sample size per chain = 1000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## beta_Intercept -0.4651 0.4693 0.01484 0.03274
## beta_mpg_scaled 2.0391 0.7063 0.02234 0.04881
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## beta_Intercept -1.4296 -0.7595 -0.4548 -0.134 0.4101
## beta_mpg_scaled 0.7872 1.5560 2.0039 2.464 3.5055
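As a rough check, we can compare the posterior means with the maximum likelihood estimates from glm above; the exact values will differ a little between MCMC runs (output not shown):

# Posterior means from the MCMC samples vs. glm point estimates
colMeans(as.matrix(samples))
coef(rmod)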
ChickWeight
The LM macro can also generate code for random effects models. To illustrate this we will use the built-in ChickWeight dataset, which contains repeated measurements of chick weights over time. First, fit the model in R using the lme4 package:
library(lme4)
rmod <- lmer(weight ~ Time + (1|Chick), data=ChickWeight)
summary(rmod)
## Linear mixed model fit by REML ['lmerMod']
## Formula: weight ~ Time + (1 | Chick)
## Data: ChickWeight
##
## REML criterion at convergence: 5619.4
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.0086 -0.5528 -0.0888 0.4975 3.4588
##
## Random effects:
## Groups Name Variance Std.Dev.
## Chick (Intercept) 717.9 26.79
## Residual 799.4 28.27
## Number of obs: 578, groups: Chick, 50
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 27.8451 4.3877 6.346
## Time 8.7261 0.1755 49.716
##
## Correlation of Fixed Effects:
## (Intr)
## Time -0.422
To fit this model with NIMBLE using LM, we’ll first organize the ChickWeight data into a list of constants and data for NIMBLE.
chick_const <- list(weight = ChickWeight$weight, Time = ChickWeight$Time, Chick = ChickWeight$Chick)
We can create the model with LM using the same formula notation as lme4, with the (1|Chick) part of the formula indicating random intercepts by chick.
code <- nimbleCode({
LM(weight ~ Time + (1|Chick))
})
Next we create the model object and show the expanded code.
mod <- nimbleModel(code = code, constants = chick_const)
mod$getCode()
## {
## "# LM"
## " ## FORLOOP"
## for (i_1 in 1:578) {
## weight[i_1] ~ dnorm(mu_[i_1], sd = sd_residual)
## }
## " ## ----"
## " ## LINPRED"
## " ### nimbleMacros::FORLOOP"
## for (i_2 in 1:578) {
## mu_[i_2] <- beta_Intercept + beta_Time * Time[i_2] +
## beta_Chick[Chick[i_2]]
## }
## " ### ----"
## " ### nimbleMacros::LINPRED_PRIORS"
## beta_Intercept ~ dnorm(0, sd = 1000)
## beta_Time ~ dnorm(0, sd = 1000)
## sd_Chick ~ dunif(0, 100)
## " #### nimbleMacros::FORLOOP"
## for (i_3 in 1:50) {
## beta_Chick[i_3] ~ dnorm(0, sd = sd_Chick)
## }
## " #### ----"
## " ### ----"
## " ## ----"
## sd_residual ~ dunif(0, 100)
## "# ----"
## }
Fit the model:
samples <- nimbleMCMC(mod, nchain = 1, niter = 3000, nburnin = 2000,
samplesAsCodaMCMC = TRUE)
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
summary(samples)
##
## Iterations = 1:1000
## Thinning interval = 1
## Number of chains = 1
## Sample size per chain = 1000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## beta_Intercept 28.294 5.1556 0.163035 0.95444
## beta_Time 8.727 0.1888 0.005971 0.01540
## sd_Chick 27.444 3.1085 0.098299 0.25186
## sd_residual 28.376 0.9509 0.030069 0.06735
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## beta_Intercept 18.807 24.475 28.33 31.810 38.099
## beta_Time 8.375 8.593 8.73 8.858 9.098
## sd_Chick 21.609 25.371 27.27 29.478 34.189
## sd_residual 26.579 27.802 28.31 29.031 30.293
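As before, the posterior means can be compared with the lme4 point estimates above (output not shown; MCMC values will vary slightly between runs):

# Fixed effects: NIMBLE posterior means vs. lme4 REML estimates
colMeans(as.matrix(samples))[c("beta_Intercept", "beta_Time")]
fixef(rmod)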
LINPRED
As the name suggests, the LINPRED macro generates a linear predictor from a provided R formula. It is used internally by the LM macro and can also be used separately. The macro acts something like the model.matrix function in R, although instead of creating a model matrix, it creates the NIMBLE model code for the linear predictor. The macro supports random effects using the same syntax as the lme4 package. To demonstrate the macro, we’ll use the built-in ChickWeight dataset again. In this dataset there are repeated weight measurements (weight) of chicks (Chick) over time (Time).
head(ChickWeight)
## weight Time Chick Diet
## 1 42 0 1 1
## 2 51 2 1 1
## 3 59 4 1 1
## 4 64 6 1 1
## 5 76 8 1 1
## 6 93 10 1 1
To create a linear predictor with time as a covariate in R, we would specify the formula as ~Time.
head(model.matrix(~Time, ChickWeight))
## (Intercept) Time
## 1 1 0
## 2 1 2
## 3 1 4
## 4 1 6
## 5 1 8
## 6 1 10
Based on the model matrix, the formula implies two parameters: the intercept and the coefficient (slope) for time. To generate the linear predictor NIMBLE model code, we start by setting up the constants/data list. We’ll use the chick_const list we used with LM, adding the sample size N so we can use it to define dimensions.
chick_const$N <- nrow(ChickWeight)
In the model code, we specify the parameter name for the linear predictor (mu) and calculate it with the LINPRED macro using the same formula as above. Here we also set the argument priors = NULL; we’ll come back to this later. Notice that unlike with LM, we explicitly define the dimensions of mu and Time.
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time[1:N], priors = NULL)
})
Next create the model object and view the code.
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## "# LINPRED"
## " ## nimbleMacros::FORLOOP"
## for (i_1 in 1:N) {
## mu[i_1] <- beta_Intercept + beta_Time * Time[i_1]
## }
## " ## ----"
## "# ----"
## }
NIMBLE expands the LINPRED macro into a for loop that calculates a value of mu for each sample in the dataset, where mu is a function of time. Notice that two parameters were created: beta_Intercept (the intercept) and beta_Time, the coefficient associated with time. We can get a list of the parameters created by macros in the model using the getMacroParameters() function:
mod$getMacroParameters()
## $LINPRED
## $LINPRED[[1]]
## [1] "beta_Intercept" "beta_Time"
##
##
## $`nimbleMacros::FORLOOP`
## $`nimbleMacros::FORLOOP`[[1]]
## NULL
When all the data and constants used by LINPRED have the same dimensions, you can specify the dimensions for the left-hand side only:
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time, priors = NULL)
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## "# LINPRED"
## " ## nimbleMacros::FORLOOP"
## for (i_1 in 1:N) {
## mu[i_1] <- beta_Intercept + beta_Time * Time[i_1]
## }
## " ## ----"
## "# ----"
## }
In more complicated situations you may have different dimensions for different data/constants. For example, we’ll convert Time into a one-column matrix, but keep mu as a vector.
chick_const2 <- chick_const
chick_const2$Time <- matrix(chick_const2$Time, ncol = 1)
str(chick_const2)
## List of 4
## $ weight: num [1:578] 42 51 59 64 76 93 106 125 149 171 ...
## $ Time : num [1:578, 1] 0 2 4 6 8 10 12 14 16 18 ...
## $ Chick : Ord.factor w/ 50 levels "18"<"16"<"15"<..: 15 15 15 15 15 15 15 15 15 15 ...
## $ N : int 578
To handle this we just need to specify the appropriate dimensions for each in the call to LINPRED.
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time[1:N,1], priors = NULL)
})
mod <- nimbleModel(code, chick_const2)
mod$getCode()
## {
## "# LINPRED"
## " ## nimbleMacros::FORLOOP"
## for (i_1 in 1:N) {
## mu[i_1] <- beta_Intercept + beta_Time * Time[i_1, 1]
## }
## " ## ----"
## "# ----"
## }
The LINPRED macro can automatically handle categorical covariates (what R calls factors). A categorical covariate can be provided in the constants as either an R factor or as a character vector/matrix/array (which will be converted automatically to a factor). For example, suppose we included a covariate for diet type (Diet) in the model. The Diet covariate is already coded as an R factor:
class(ChickWeight$Diet)
## [1] "factor"
levels(ChickWeight$Diet)
## [1] "1" "2" "3" "4"
chick_const$Diet <- ChickWeight$Diet
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time[1:N] + Diet[1:N], priors = NULL)
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## "# LINPRED"
## " ## nimbleMacros::FORLOOP"
## for (i_1 in 1:N) {
## mu[i_1] <- beta_Intercept + beta_Time * Time[i_1] + beta_Diet[Diet[i_1]]
## }
## " ## ----"
## "# ----"
## }
A new vector parameter beta_Diet is created with four elements, one per level of Diet. The Diet covariate itself is then converted to numeric and used as the index into this vector parameter. It isn’t obvious from this code, but the first element of beta_Diet will be fixed at 0, serving as the reference level; we’ll show this below.
We could also add a link function to the left-hand side of the linear predictor calculation. Suppose we wanted to force mu to be positive; we could use a log link function, specified with the link argument.
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time[1:N], priors = NULL, link = log)
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## "# LINPRED"
## " ## nimbleMacros::FORLOOP"
## for (i_1 in 1:N) {
## log(mu[i_1]) <- beta_Intercept + beta_Time * Time[i_1]
## }
## " ## ----"
## "# ----"
## }
The LINPRED macro supports a handful of functions commonly included in formulas: offset(), scale(), log(), and I(). For example, to build a linear predictor with the Time covariate scaled to have mean 0 and SD 1:
code <- nimbleCode({
mu[1:N] <- LINPRED(~scale(Time[1:N]), priors = NULL, link = log)
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## "# LINPRED"
## " ## nimbleMacros::FORLOOP"
## for (i_1 in 1:N) {
## log(mu[i_1]) <- beta_Intercept + beta_Time_scaled * Time_scaled[i_1]
## }
## " ## ----"
## "# ----"
## }
This adds a new constant Time_scaled to the constants:
head(mod$getConstants()$Time_scaled)
## [1] -1.5858774 -1.2899493 -0.9940213 -0.6980932 -0.4021652 -0.1062371
LINPRED uses a modular system to handle formula functions. You can add support for additional functions in LINPRED formulas by writing a function with a specific name, class, and output. See the package source code, specifically the file formulaFunction.R, for examples of how to create these functions.
If we do not set the priors argument manually, LINPRED will automatically generate a default set of priors for the parameters it creates (such as beta_Intercept and beta_Time above). Internally, LINPRED calls the LINPRED_PRIORS macro (described below).
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time[1:N] + Diet[1:N])
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## "# LINPRED"
## " ## nimbleMacros::FORLOOP"
## for (i_1 in 1:N) {
## mu[i_1] <- beta_Intercept + beta_Time * Time[i_1] + beta_Diet[Diet[i_1]]
## }
## " ## ----"
## " ## nimbleMacros::LINPRED_PRIORS"
## beta_Intercept ~ dnorm(0, sd = 1000)
## beta_Time ~ dnorm(0, sd = 1000)
## beta_Diet[1] <- 0
## beta_Diet[2] ~ dnorm(0, sd = 1000)
## beta_Diet[3] ~ dnorm(0, sd = 1000)
## beta_Diet[4] ~ dnorm(0, sd = 1000)
## " ## ----"
## "# ----"
## }
We now have priors for each parameter. Note that, as mentioned previously, the first value of the beta_Diet vector is fixed at 0, serving as the reference level.

Warning: you should check the default priors carefully to make sure they make sense for your problem. In particular, if the values in your data are such that the parameters of the linear predictor could be large in magnitude, the default priors could have a strong influence on your results.
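If the defaults aren’t appropriate, a priors object created with setPriors can be passed to LINPRED via the priors argument, just as we did with LM earlier. A sketch (the prior choices here are arbitrary, for illustration only):

pr_wide <- setPriors(intercept = quote(dnorm(0, sd = 10)),
                     coefficient = quote(dnorm(0, sd = 10)))
code <- nimbleCode({
  mu[1:N] <- LINPRED(~Time[1:N] + Diet[1:N], priors = pr_wide)
})
# mod <- nimbleModel(code, chick_const)  # beta_* priors would now be dnorm(0, sd = 10)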
We can also suppress the added macro comments to make things a little more concise.
nimbleOptions(enableMacroComments = FALSE)
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time[1:N])
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## for (i_1 in 1:N) {
## mu[i_1] <- beta_Intercept + beta_Time * Time[i_1]
## }
## beta_Intercept ~ dnorm(0, sd = 1000)
## beta_Time ~ dnorm(0, sd = 1000)
## }
The LINPRED macro supports random effects using lme4 syntax. For example, a random intercept by Chick:
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time[1:N] + (1|Chick[1:N]))
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## for (i_1 in 1:N) {
## mu[i_1] <- beta_Intercept + beta_Time * Time[i_1] + beta_Chick[Chick[i_1]]
## }
## beta_Intercept ~ dnorm(0, sd = 1000)
## beta_Time ~ dnorm(0, sd = 1000)
## sd_Chick ~ dunif(0, 100)
## for (i_2 in 1:50) {
## beta_Chick[i_2] ~ dnorm(0, sd = sd_Chick)
## }
## }
Note that by default all random effects have mean 0, the same approach used by lme4. We can also get both random intercepts and (time) slopes by Chick; by default these are correlated.
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time[1:N] + (Time[1:N]|Chick[1:N]))
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## for (i_1 in 1:N) {
## mu[i_1] <- beta_Intercept + beta_Time * Time[i_1] + beta_Chick[Chick[i_1]] +
## beta_Time_Chick[Chick[i_1]] * Time[i_1]
## }
## beta_Intercept ~ dnorm(0, sd = 1000)
## beta_Time ~ dnorm(0, sd = 1000)
## sd_Chick ~ dunif(0, 100)
## sd_Time_Chick ~ dunif(0, 100)
## re_sds_Chick[1] <- sd_Chick
## re_sds_Chick[2] <- sd_Time_Chick
## Ustar_Chick[1:2, 1:2] ~ dlkj_corr_cholesky(1, 2)
## U_Chick[1:2, 1:2] <- uppertri_mult_diag(Ustar_Chick[1:2,
## 1:2], re_sds_Chick[1:2])
## re_means_Chick[1] <- 0
## re_means_Chick[2] <- 0
## for (i_2 in 1:50) {
## B_Chick[i_2, 1:2] ~ dmnorm(re_means_Chick[1:2], cholesky = U_Chick[1:2,
## 1:2], prec_param = 0)
## beta_Chick[i_2] <- B_Chick[i_2, 1]
## beta_Time_Chick[i_2] <- B_Chick[i_2, 2]
## }
## }
The correlated random effects model here uses an LKJ prior. As with lme4, we can also specify uncorrelated random slopes and intercepts using || instead of |.
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time[1:N] + (Time[1:N]||Chick[1:N]))
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## for (i_1 in 1:N) {
## mu[i_1] <- beta_Intercept + beta_Time * Time[i_1] + beta_Chick[Chick[i_1]] +
## beta_Time_Chick[Chick[i_1]] * Time[i_1]
## }
## beta_Intercept ~ dnorm(0, sd = 1000)
## beta_Time ~ dnorm(0, sd = 1000)
## sd_Chick ~ dunif(0, 100)
## for (i_2 in 1:50) {
## beta_Chick[i_2] ~ dnorm(0, sd = sd_Chick)
## }
## sd_Time_Chick ~ dunif(0, 100)
## for (i_3 in 1:50) {
## beta_Time_Chick[i_3] ~ dnorm(0, sd = sd_Time_Chick)
## }
## }
LINPRED in a complete model

Up to this point we’ve generated just the code needed to calculate the linear predictor. Here’s a complete linear regression model of chick weight as a function of time making use of LINPRED (similar to the code generated by LM):
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time[1:N])
weight[1:N] ~ FORLOOP(dnorm(mu[1:N], sd = sigma))
sigma ~ dunif(0, 100)
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## for (i_1 in 1:N) {
## mu[i_1] <- beta_Intercept + beta_Time * Time[i_1]
## }
## beta_Intercept ~ dnorm(0, sd = 1000)
## beta_Time ~ dnorm(0, sd = 1000)
## for (i_2 in 1:N) {
## weight[i_2] ~ dnorm(mu[i_2], sd = sigma)
## }
## sigma ~ dunif(0, 100)
## }
LINPRED_PRIORS
This macro generates a set of priors for the parameters of a linear predictor, also based on an R formula. In most cases we won’t use this macro directly; rather, it will be called automatically when we use LINPRED (see the discussion of the priors argument above). However, if for some reason we want to generate code for only the priors, and not the corresponding linear predictor, we can use LINPRED_PRIORS separately as shown below. The formula ~Time implies two parameters as described above, and LINPRED_PRIORS will generate a prior for each depending on the type of the parameter (e.g. intercept vs. slope). It is not necessary to specify the dimensions in the formula when using LINPRED_PRIORS directly.
code <- nimbleCode({
LINPRED_PRIORS(~Time)
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## beta_Intercept ~ dnorm(0, sd = 1000)
## beta_Time ~ dnorm(0, sd = 1000)
## }
We can also manually set the values for different types of priors, such as intercepts and coefficients (slopes). See ?setPriors for more details.
pr <- setPriors(intercept = quote(dnorm(-5, 5)),
coefficient = quote(dnorm(0, sd = 5)))
code <- nimbleCode({
LINPRED_PRIORS(~Time + Diet, priors = pr)
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## beta_Intercept ~ dnorm(-5, 5)
## beta_Time ~ dnorm(0, sd = 5)
## beta_Diet[1] <- 0
## beta_Diet[2] ~ dnorm(0, sd = 5)
## beta_Diet[3] ~ dnorm(0, sd = 5)
## beta_Diet[4] ~ dnorm(0, sd = 5)
## }
Treating beta_Diet as a vector parameter makes sense in the NIMBLE model code, but it does not match the way categorical covariates are handled by model.matrix, i.e., as dummy variables:
colnames(model.matrix(~Time + Diet, ChickWeight))
## [1] "(Intercept)" "Time" "Diet2" "Diet3" "Diet4"
Here model.matrix has created several separate parameters (Diet2, Diet3, and Diet4) rather than a vector. We can replicate this structure in the NIMBLE model code by setting the LINPRED_PRIORS argument modelMatrixNames = TRUE.
code <- nimbleCode({
LINPRED_PRIORS(~Time + Diet, priors = pr, modelMatrixNames = TRUE)
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## beta_Intercept ~ dnorm(-5, 5)
## beta_Time ~ dnorm(0, sd = 5)
## beta_Diet[1] <- 0
## beta_Diet[2] <- beta_Diet2
## beta_Diet2 ~ dnorm(0, sd = 5)
## beta_Diet[3] <- beta_Diet3
## beta_Diet3 ~ dnorm(0, sd = 5)
## beta_Diet[4] <- beta_Diet4
## beta_Diet4 ~ dnorm(0, sd = 5)
## }
When this option is set, a few extra lines of NIMBLE model code are added to yield parameter names that match model.matrix.
FORLOOP
The FORLOOP macro is used by LM, LINPRED, and LINPRED_PRIORS, and is a shorthand way of creating for loops (possibly nested) from a single line of code. For example, suppose you want to calculate a linear predictor mu from N values of a covariate x. The NIMBLE model code would be:
nimbleCode({
  for (i in 1:N){
    mu[i] <- alpha + beta * x[i]
  }
})
The FORLOOP macro condenses this information into a single line, using the index information inside the parameter brackets:
code <- nimbleCode({
mu[1:N] <- FORLOOP(alpha + beta * x[1:N])
})
Here’s the expanded result, after creating some example values for the constants.
constants <- list(x = rnorm(5), N = 5)
mod <- nimbleModel(code, constants)
mod$getCode()
## {
## for (i_1 in 1:N) {
## mu[i_1] <- alpha + beta * x[i_1]
## }
## }
Note that one can use vectorization in NIMBLE model code directly, such as

mu[1:N] <- alpha + beta * x[1:N]

The distinction is that the vectorized version creates a single multivariate node, mu[1:N], while the loop creates N separate scalar nodes. Which is best will depend on the model structure and the algorithm used with the model, as discussed in the NIMBLE user manual.
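To see the difference concretely, you can inspect the nodes created in the FORLOOP model built above, for example with the model’s getNodeNames() method (output not shown):

# The loop version creates separate scalar nodes mu[1], mu[2], ..., mu[5],
# rather than a single multivariate node mu[1:5]
mod$getNodeNames()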
It may seem a bit trivial to make a macro for such a simple chunk of code. However, it is especially convenient when you have nested for loops, which are more complicated to write out.
constants <- list(x = matrix(rnorm(10), 5, 2), N = 5, J = 2)
code <- nimbleCode({
mu[1:N, 1:J] <- FORLOOP(alpha + beta * x[1:N, 1:J])
})
mod <- nimbleModel(code, constants)
mod$getCode()
## {
## for (i_1 in 1:N) {
## for (i_2 in 1:J) {
## mu[i_1, i_2] <- alpha + beta * x[i_1, i_2]
## }
## }
## }
Note that one situation where you can’t use this macro directly is when the same index is used in multiple nested loops. For example, suppose the parameter mu shown above was a square matrix.
constants <- list(x = matrix(rnorm(25), 5, 5), N = 5)
code <- nimbleCode({
mu[1:N, 1:N] <- FORLOOP(alpha + beta * x[1:N, 1:N])
})
mod <- nimbleModel(code, constants)
## Error : Not sure how to expand duplicated index 1:N in code:
## mu[1:N, 1:N]
## Error: Model macro FORLOOP failed.
The structure of a for loop with duplicated indices could be ambiguous in more complex situations, so FORLOOP returns an error. The solution in this case is simply to provide different index variables for each dimension, even if they have the same value.
constants <- list(x = matrix(rnorm(25), 5, 5), N = 5, K = 5)
code <- nimbleCode({
mu[1:N, 1:K] <- FORLOOP(alpha + beta * x[1:N, 1:K])
})
mod <- nimbleModel(code, constants)
mod$getCode()
## {
## for (i_1 in 1:N) {
## for (i_2 in 1:K) {
## mu[i_1, i_2] <- alpha + beta * x[i_1, i_2]
## }
## }
## }
Writing your own macros

Suppose you wanted to create a macro called MYDIST, which generates a normal distribution with mean 0 and a provided standard deviation. The macro would look like this in the NIMBLE model code:

y ~ MYDIST(sdev = 2)

Note that by convention, we define the macro name in all capital letters. All the macros provided in nimbleMacros follow this convention. We want the final, processed output to look like:

y ~ dnorm(0, sd = 2)
A macro is, fundamentally, a specially formatted R function that converts one line of code into another. For example:

fun <- function(input){
  lhs <- input[[2]]        # left-hand side of the assignment (y)
  sdev <- input[[3]]$sdev  # value of the sdev argument in the MYDIST() call
  output <- substitute(LHS ~ dnorm(0, sd = SDEV),
                       list(LHS = lhs, SDEV = sdev))
  output
}
Note that we have to do a bit of clunky code processing to extract the left-hand side of the assignment and the argument value. (Using [[ on a code object extracts pieces of the abstract syntax tree representation of the R expression.)
input <- quote(y ~ MYDIST(sdev = 2))
fun(input)
## y ~ dnorm(0, sd = 2)
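To make the [[ extraction more concrete, here is how the pieces of the quoted expression break down:

input <- quote(y ~ MYDIST(sdev = 2))
input[[1]]       # the operator: `~`
input[[2]]       # the left-hand side: y
input[[3]]       # the right-hand side: MYDIST(sdev = 2)
input[[3]]$sdev  # the value of the sdev argument: 2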
In practice, nimble expects the macro converter function (fun in the example above) to have two additional arguments: modelInfo and .env. When processing a macro, nimble provides a named list to the modelInfo argument containing additional model information. Most importantly, this list includes constants, which can be used and/or changed by the macro. And, as you can probably guess, nimble provides the R environment in which nimbleModel was called to the .env argument, to help with scoping issues.

In addition to these two arguments, the function also needs to return output with a particular structure: a named list with two elements, code, which contains the output code (as in the original function above), and modelInfo, the list of model information, which we can modify or leave as-is.
fun <- function(input_code, modelInfo, .env){
  LHS <- input_code[[2]]
  sdev <- input_code[[3]]$sdev
  output_code <- substitute(LHS ~ dnorm(0, sd = SDEV),
                            list(LHS = LHS, SDEV = sdev))
  # Modify the constants
  modelInfo$constants$new <- 1
  list(code = output_code, modelInfo = modelInfo)
}
input <- quote(y ~ MYDIST(sdev = 2))
fun(input, modelInfo = list(constants = NULL), .env = environment())
## $code
## y ~ dnorm(0, sd = 2)
##
## $modelInfo
## $modelInfo$constants
## $modelInfo$constants$new
## [1] 1
Finally, nimble needs a way to know that this function is intended to be a macro, and that it is called MYDIST. The final step is to create a wrapper list with our function inserted as an element called process, and to assign the class "model_macro". (In the next section we’ll see that the helper function buildMacro will create this wrapper list for us.)
MYDIST <- list(process = fun)
class(MYDIST) <- "model_macro"
Now, we can test the macro in a NIMBLE model:
code <- nimbleCode({
y ~ MYDIST(sdev = 2)
})
mod <- nimbleModel(code = code, data = list(y = 1))
mod$getCode()
## {
## y ~ dnorm(0, sd = 2)
## }
Using buildMacro to create the macro

Manually creating the macro is fine, but it gets complicated if we want to extract a lot of separate pieces of information from the code line, e.g., if our macro has many arguments. As an alternative, we can use the buildMacro function included with nimble, which handles some of this code processing for us. The first argument to buildMacro is a function structured like the one we wrote above (fun). The other two arguments, use3pieces and unpackArgs, determine how the arguments of fun should be structured.
First, suppose both use3pieces and unpackArgs are set to FALSE. In this case, the first argument to fun is the entire original line of code containing the macro, i.e., y ~ MYDIST(sdev = 2). This is identical to how we wrote the function above, so all buildMacro will do is create the required macro list structure and assign the class for us.
MYDIST2 <- buildMacro(fun, use3pieces = FALSE, unpackArgs = FALSE)
code <- nimbleCode({
y ~ MYDIST2(sdev = 2)
})
mod <- nimbleModel(code = code, data = list(y = 1))
mod$getCode()
## {
## y ~ dnorm(0, sd = 2)
## }
Now we will set use3pieces = TRUE. In this case, buildMacro will automatically do some initial processing on the code line for us. Specifically, it will identify whether the assignment is stochastic (stoch, TRUE if ~ and FALSE if <-) and separate out the left-hand side (LHS) and right-hand side (RHS) of the assignment. With these settings, our function’s first three arguments need to be stoch, LHS, and RHS, in that order (plus we keep the always-required modelInfo and .env).
fun3 <- function(stoch, LHS, RHS, modelInfo, .env){
  sdev <- RHS$sdev  # extract the sdev argument from the right-hand side
  output_code <- substitute(LHS ~ dnorm(0, sd = SDEV),
                            list(LHS = LHS, SDEV = sdev))
  # Modify the constants
  modelInfo$constants$new <- 1
  list(code = output_code, modelInfo = modelInfo)
}
We do a little less code processing this time, since buildMacro has already separated the LHS and RHS for us. Note that in this simple example we aren’t using stoch, because we want our output code to always be stochastic.
MYDIST3 <- buildMacro(fun3, use3pieces = TRUE, unpackArgs = FALSE)
code <- nimbleCode({
y ~ MYDIST3(sdev = 2)
})
mod <- nimbleModel(code = code, data = list(y = 1))
mod$getCode()
## {
## y ~ dnorm(0, sd = 2)
## }
Finally, consider the case where we also set unpackArgs = TRUE. Now buildMacro will dig into the right-hand side of the macro call and extract the argument values for us (in this case just sdev). Thus, we need to include each of these arguments explicitly as arguments to our function.
fun4 <- function(stoch, LHS, sdev, modelInfo, .env){
  output_code <- substitute(LHS ~ dnorm(0, sd = SDEV),
                            list(LHS = LHS, SDEV = sdev))
  # Modify the constants
  modelInfo$constants$new <- 1
  list(code = output_code, modelInfo = modelInfo)
}
Even less code processing is required in this case, since buildMacro splits out the sdev argument for us.
MYDIST4 <- buildMacro(fun4, use3pieces = TRUE, unpackArgs = TRUE)
code <- nimbleCode({
y ~ MYDIST4(sdev = 2)
})
mod <- nimbleModel(code = code, data = list(y = 1))
mod$getCode()
## {
## y ~ dnorm(0, sd = 2)
## }
The choice of arguments will depend on the macro: sometimes the shortcuts enabled by buildMacro will be useful; other times you may need more manual control over how the input code line is processed. See ?buildMacro for additional documentation and examples, including an example with use3pieces = FALSE and unpackArgs = TRUE. You may also find the source code for the macros included in nimbleMacros a useful resource for writing your own.