An introduction to the obfuscatoR package

Erlend Dancke Sandorf [1]

Caspar Chorus [2]

Sander van Cranenburgh [3]

Introduction

Chorus et al. (2021) put forward the idea that, when people make choices, they sometimes wish to hide their true motivation from a potential onlooker. The obfuscatoR package allows researchers to easily create and customize “obfuscation” games. These games are specifically designed to test the obfuscation hypothesis, i.e. whether people are able to obfuscate when properly incentivized.

Let us consider a decision maker who has to decide on a course of action while being observed by an onlooker [4]. The decision maker seeks to choose an action that is in line with his underlying motivation or preferences [5], but at the same time he does not want the onlooker to know his true motivation for the course of action he chose. Instead, he seeks to take a course of action that is in line with his motivation but leaves the onlooker as clueless as possible as to what that motivation might be: he obfuscates. For a full discussion of the technical details of the model and more detailed examples, we refer to Chorus et al. (2021).

Let us assume that the set of rules (the motivations above), \(R\), and the set of possible actions, \(A\), are known to both the decision maker and the observer. We define \(r_k\) as the \(k^{\mathrm{th}}\) element in \(R\) and \(a_j\) as the \(j^{\mathrm{th}}\) element in \(A\). Using the notions of information entropy (Shannon 1948) and Bayesian updating, we can formulate the observer’s best guess as to what motivates the decision maker as the posterior probability given below. This is the probability of a rule conditional on having observed an action.

\[\begin{equation}\label{eqn-posterior} \Pr(r_k|a_j) = \frac{\Pr(a_j|r_k)\Pr(r_k)}{\sum_{k=1}^{K}\left[\Pr(a_j|r_k)\Pr(r_k)\right]} \end{equation}\]

where the prior probabilities \(\Pr(r_k)\) are assumed flat and equal to \(1/K\), with \(K\) being the number of rules. If the observer can observe multiple actions by the same individual, these prior probabilities are no longer equal. For example, if she observes two actions by the same decision maker, then the posterior after the first action becomes the prior when calculating the entropy of the second action. \(\Pr(a_j|r_k)\) is calculated differently depending on whether an action is obligated under a given rule or simply permitted. These cases are sometimes referred to as strong and weak rules, and they are calculated using the first and second expressions below, respectively.

\[\begin{equation}\label{eqn-strong-rule} \Pr(a_j|r_k) = \left\{\begin{array}{cl} 1 & \text{if } a_j \text{ is obliged under } r_k\\ 0 & \text{otherwise} \\ \end{array}\right. \end{equation}\]

\[\begin{equation}\label{eqn-weak-rule} \Pr(a_j|r_k) = \left\{\begin{array}{cl} 1/L & \text{if } a_j \text{ is permitted under } r_k\\ 0 & \text{otherwise} \\ \end{array}\right. \end{equation}\]

where \(L\) is the size of the subset of permitted actions under \(r_k\). As such, according to the posterior probability above, the observer updates her beliefs about which rule governs a decision maker’s actions each time she observes an action. An obfuscating decision maker seeks to take an action, consistent with his rule, that leaves the observer as clueless as possible as to which rule governs his actions. This is quantified in terms of Shannon’s entropy. Specifically, the decision maker seeks to maximize the entropy \(\mathrm{H}_j\) of his chosen action:

\[\begin{equation}\label{eqn-entropy} \mathrm{H}_j = -\sum_{k=1}^{K}\left[\Pr(r_k|a_j)\log(\Pr(r_k|a_j))\right] \end{equation}\]

The rest of this vignette sets out to describe how to use the package to generate simple and more complex versions of the obfuscation game. Once we have grasped the simple mechanics of the package, we will show you how to introduce additional restrictions that are useful to raise or lower the difficulty of the game for both decision makers and observers.

Simple obfuscation designs

Generating a design

First, let us create a very simple design. At a minimum, we need to specify the number of possible rules and actions. We specify this in a list of design options design_opt_input as follows (the full list of options is given in the section “List of design options”):

design_opt_input <- list(rules = 4,
                         actions = 5)

Above, we specified that our design consists of 4 possible rules governing a decision maker’s actions, and that there are 5 possible actions that he can take. To create a design, we pass the list of design options to the generate_designs() function.

design <- generate_designs(design_opt_input)

Our design is a matrix with rows equal to the number of rules and columns equal to the number of possible actions. Throughout, the design will also be referred to as a rules-action matrix. We can print the generated rules-action matrix using the print_design() function.

print_design(design)
#> The rules-action matrix 
#> 
#> Rows: Rules 
#> Columns: Actions 
#> 
#>    A1 A2 A3 A4 A5
#> R1  0  0 -1  0 -1
#> R2 -1 -1  0  0  0
#> R3 -1  0 -1  0  0
#> R4  0  0 -1 -1 -1
#> 
#> The considered rule is 1.

The design is generated conditional on a given rule referred to as the considered rule. The considered rule is selected as part of the design generation process, and cannot be set by the analyst. It is possible to print additional information about the design generation process by setting print_all = TRUE. This will provide information on the number of iterations and whether all design conditions were met.

print_design(design, print_all = TRUE)
#> The rules-action matrix 
#> 
#> Rows: Rules 
#> Columns: Actions 
#> 
#>    A1 A2 A3 A4 A5
#> R1  0  0 -1  0 -1
#> R2 -1 -1  0  0  0
#> R3 -1  0 -1  0  0
#> R4  0  0 -1 -1 -1
#> 
#> The considered rule is 1.
#> 
#> The design was found in 3 iterations. 
#> 
#> All the design conditions were met: TRUE

Note: It is possible to extract the vector of design conditions with extract_attr(design, "design_conditions").

Controlling replicability

The default behavior of the obfuscatoR package is to set a random seed each time you generate a design. This is to minimize the possibility of always generating the same designs. However, sometimes it may be required to generate the same design, e.g. to ensure replicability or for teaching purposes. It is possible to set the seed for the random number generator using the seed option. For example: design_opt_input <- list(seed = 10). Here, we have set the initial seed to 10. If we are generating multiple designs, then the seed will increment by 1 for each design. That is, if the initial seed is 10 and we generate two designs, then the first design will be generated with seed set to 10 and the second with seed set to 11.
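As a minimal sketch of the seed option described above, the following builds an options list with a fixed seed and works out which seeds the stated increment rule implies for two designs. The variable names seeds_used, first_run and second_run are ours, and the package call is guarded so that the snippet also runs where obfuscatoR is not installed.

```r
# Design options with a fixed initial seed (the `seed` option described above).
design_opt_input <- list(rules = 4,
                         actions = 5,
                         designs = 2,
                         seed = 10)

# Seeds implied by the increment rule in the text:
# design i is generated with seed + (i - 1).
seeds_used <- design_opt_input$seed + seq_len(design_opt_input$designs) - 1
print(seeds_used)

# Only call the package if it is available on this machine.
if (requireNamespace("obfuscatoR", quietly = TRUE)) {
  first_run  <- obfuscatoR::generate_designs(design_opt_input)
  second_run <- obfuscatoR::generate_designs(design_opt_input)

  # With the same seed, the two runs are expected to show the same designs.
  obfuscatoR::print_design(first_run)
  obfuscatoR::print_design(second_run)
}
```

Comparing the two printed runs is a simple way to confirm replicability before sharing a design script.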

Calculate the entropy of each action

The obfuscatoR package also includes a set of functions to evaluate the designs and calculate the entropy of a design. To identify the action that would leave the observer as clueless as possible as to which rule governs the decision maker’s choice, we need to calculate the entropy of each action. We can do this using the calculate_entropy() function.

entropy <- calculate_entropy(design)

An obfuscating decision maker will choose an action, conditional on his rule, that will leave the observer as clueless as possible, i.e. the action with the highest entropy. We can print the results of the entropy calculation using the print_entropy() function.

print_entropy(entropy)
#> Shannon's entropy -- Design  1 
#> 
#>    A1    A2    A3    A4    A5 
#> 0.292 0.469 0.000 0.477 0.301

To calculate the entropy of the action we also need to calculate the probability of an action conditional on a rule and the probability of a rule conditional on an action. We can print the results of these calculations by setting print_all = TRUE.

print_entropy(entropy, print_all = TRUE)
#> Shannon's entropy -- Design  1 
#> 
#>    A1    A2    A3    A4    A5 
#> 0.292 0.469 0.000 0.477 0.301 
#> 
#> 
#> The rules-action matrix 
#> 
#> Rows: Rules 
#> Columns: Actions 
#> 
#>    A1 A2 A3 A4 A5
#> R1  0  0 -1  0 -1
#> R2 -1 -1  0  0  0
#> R3 -1  0 -1  0  0
#> R4  0  0 -1 -1 -1
#> 
#> The considered rule is 1.
#> 
#> The vector of prior probabilities 
#> 
#>   R1   R2   R3   R4 
#> 0.25 0.25 0.25 0.25 
#> 
#> The probability of an action conditional on a rule 
#> 
#>       A1    A2    A3    A4    A5
#> R1 0.333 0.333 0.000 0.333 0.000
#> R2 0.000 0.000 0.333 0.333 0.333
#> R3 0.000 0.333 0.000 0.333 0.333
#> R4 0.500 0.500 0.000 0.000 0.000
#> 
#> The probability of a rule conditional on observing an action, i.e. the posterior 
#> 
#>     A1    A2 A3    A4  A5
#> R1 0.4 0.286  0 0.333 0.0
#> R2 0.0 0.000  1 0.333 0.5
#> R3 0.0 0.286  0 0.333 0.5
#> R4 0.6 0.429  0 0.000 0.0
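The numbers printed above can be reproduced by hand. The following is a minimal base R sketch, not the package's implementation, that applies the weak-rule probabilities, Bayes' rule and Shannon's entropy to the printed rules-action matrix. It assumes that 0 codes a permitted action and -1 a forbidden one, and that the logarithm is base 10, which is what the printed values imply (for example, three equally likely rules give log10(3) = 0.477).

```r
# Rules-action matrix from the printed design (0 = permitted, -1 = forbidden).
design <- matrix(c( 0,  0, -1,  0, -1,
                   -1, -1,  0,  0,  0,
                   -1,  0, -1,  0,  0,
                    0,  0, -1, -1, -1),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("R", 1:4), paste0("A", 1:5)))

# Pr(a_j | r_k) under weak rules: 1 / L for each of the L permitted actions.
permitted <- design == 0
pr_action_given_rule <- permitted / rowSums(permitted)

# Flat priors, 1 / K.
priors <- rep(1 / nrow(design), nrow(design))

# Bayes' rule: weight each row by its prior, then normalise each column.
joint <- pr_action_given_rule * priors
posterior <- sweep(joint, 2, colSums(joint), "/")

# Shannon's entropy per action (base 10), treating 0 * log(0) as 0.
terms <- ifelse(posterior > 0, posterior * log10(posterior), 0)
entropy <- -colSums(terms)

print(round(posterior, 3))
print(round(entropy, 3))
```

Action A4 has the highest entropy (0.477) and is permitted under the considered rule R1, so it is the action an obfuscating decision maker would choose.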

When priors are not flat

It is possible to supply a vector of prior probabilities when calculating the entropy measure. If no vector of priors is supplied, we assume flat priors, i.e. \(1/K\), where \(K\) is the number of rules.

prior_probs <- c(0.2, 0.3, 0.15, 0.35)
entropy <- calculate_entropy(design, priors = prior_probs)
print_entropy(entropy, print_all = TRUE)
#> Shannon's entropy -- Design  1 
#> 
#>    A1    A2    A3    A4    A5 
#> 0.256 0.411 0.000 0.459 0.276 
#> 
#> 
#> The rules-action matrix 
#> 
#> Rows: Rules 
#> Columns: Actions 
#> 
#>    A1 A2 A3 A4 A5
#> R1  0  0 -1  0 -1
#> R2 -1 -1  0  0  0
#> R3 -1  0 -1  0  0
#> R4  0  0 -1 -1 -1
#> 
#> The considered rule is 1.
#> 
#> The vector of prior probabilities 
#> 
#>   R1   R2   R3   R4 
#> 0.20 0.30 0.15 0.35 
#> 
#> The probability of an action conditional on a rule 
#> 
#>       A1    A2    A3    A4    A5
#> R1 0.333 0.333 0.000 0.333 0.000
#> R2 0.000 0.000 0.333 0.333 0.333
#> R3 0.000 0.333 0.000 0.333 0.333
#> R4 0.500 0.500 0.000 0.000 0.000
#> 
#> The probability of a rule conditional on observing an action, i.e. the posterior 
#> 
#>       A1    A2 A3    A4    A5
#> R1 0.276 0.229  0 0.308 0.000
#> R2 0.000 0.000  1 0.462 0.667
#> R3 0.000 0.171  0 0.231 0.333
#> R4 0.724 0.600  0 0.000 0.000
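Non-flat priors also arise when the observer sees several actions by the same decision maker, because the posterior after the first action becomes the prior for the second. The base R sketch below illustrates both cases with a hypothetical helper, update_beliefs(), which is our own function and not part of the package. It assumes weak rules only and base 10 logarithms.

```r
# Rules-action matrix from the printed design (0 = permitted, -1 = forbidden).
design <- matrix(c( 0,  0, -1,  0, -1,
                   -1, -1,  0,  0,  0,
                   -1,  0, -1,  0,  0,
                    0,  0, -1, -1, -1),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("R", 1:4), paste0("A", 1:5)))

# Hypothetical helper: posterior and entropy for a given vector of priors.
update_beliefs <- function(design, priors) {
  permitted <- design == 0
  pr_action_given_rule <- permitted / rowSums(permitted)
  joint <- pr_action_given_rule * priors
  posterior <- sweep(joint, 2, colSums(joint), "/")
  terms <- ifelse(posterior > 0, posterior * log10(posterior), 0)
  list(posterior = posterior, entropy = -colSums(terms))
}

# Case 1: the user-supplied priors from the example above.
supplied <- update_beliefs(design, c(0.2, 0.3, 0.15, 0.35))
print(round(supplied$entropy, 3))

# Case 2: sequential updating. Start flat, observe A2, and reuse the
# posterior for A2 as the prior before the second action is observed.
flat <- update_beliefs(design, rep(0.25, 4))
prior_after_a2 <- flat$posterior[, "A2"]
second <- update_beliefs(design, prior_after_a2)

# A3 has zero probability under the updated beliefs, so its column is NaN;
# we only look at A4 here.
print(round(second$posterior[, "A4"], 3))
print(round(second$entropy[["A4"]], 3))
```

After observing A2 and then A4, only R1 and R3 remain possible and are equally likely, so the entropy of A4 falls from 0.477 to 0.301.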

Restricted designs

To make what we consider valid designs, we have implemented a set of restrictions, some of which can be changed by the user. This list is ordered to match the output vector from: extract_attr(design, "design_conditions").

  1. Each design is generated based on a considered rule. The considered rule cannot contain an obligated action. This is because it would force the decision maker to choose the obligated action and the observer would be able to guess the rule with a high degree of accuracy, i.e. the entropy of the action is very low.
  2. No action included in the design can be forbidden by every rule. Said another way, each action has to be permitted by at least one rule.
  3. Actions that are permitted under the considered rule have to fit a minimum number of rules. The default is 0, i.e. unless it is set by the user, this restriction is not binding.
  4. A design cannot contain duplicate actions.
  5. The action that maximizes entropy has to be permitted by the considered rule.
  6. The action that maximizes entropy has to have the lowest posterior probability.
  7. To make the game easier for both decision makers and observers, the analyst can specify a spread for the entropy of actions. The larger the spread, the easier it should be to identify the entropy maximizing action.

In addition to the 7 conditions above, there is an 8th condition not returned by extract_attr(design, "design_conditions").

  8. There can only be one entropy maximizing action. That is, we cannot have a tie between two different actions.
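To make the structural conditions concrete, the following base R sketch checks conditions 1 to 4 by hand on the first design printed in this vignette. It reflects our reading of the conditions, not the package's internal checks, and it assumes the coding 0 = permitted, -1 = forbidden and 1 = obligatory. All variable names are ours.

```r
# First design from this vignette; the considered rule is 1.
design <- matrix(c( 0,  0, -1,  0, -1,
                   -1, -1,  0,  0,  0,
                   -1,  0, -1,  0,  0,
                    0,  0, -1, -1, -1),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("R", 1:4), paste0("A", 1:5)))
considered_rule <- 1

# Condition 1: the considered rule contains no obligatory action (coded 1).
no_obligatory <- !any(design[considered_rule, ] == 1)

# Condition 2: every action is allowed (coded 0 or 1) by at least one rule.
rules_per_action <- colSums(design >= 0)
all_actions_allowed <- all(rules_per_action >= 1)

# Condition 3: number of rules that each action permitted under the
# considered rule fits, counting the considered rule itself.
fit_counts <- rules_per_action[design[considered_rule, ] >= 0]

# Condition 4: no two actions (columns) are identical.
no_duplicate_actions <- !any(duplicated(t(design)))

print(c(no_obligatory, all_actions_allowed, no_duplicate_actions))
print(fit_counts)
```

Here every permitted action fits at least two rules, so this particular design would also satisfy min_fit = 2.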

Complex obfuscation designs

The designs outlined above are fairly simple. We have placed no restrictions on the design with respect to the number of actions that are allowed under each rule, nor have we included rules with obligatory actions. The obfuscatoR package includes several options that allow us to create designs that vary with respect to the maximum and minimum allowable actions per rule and the number of rules with obligatory actions, and that can even ensure a given spread of the entropy among the actions available to decision makers. Let us take a closer look at the various options that are available to us.

Obligatory actions

We can specify the number of rules with an obligatory action using the option obligatory. Let us continue to work with the same design options as above, but this time we will specify that one of the rules has an obligatory action.

design_opt_input <- list(rules = 4,
                         actions = 5,
                         obligatory = 1)

design <- generate_designs(design_opt_input)

print_design(design, FALSE)
#> The rules-action matrix 
#> 
#> Rows: Rules 
#> Columns: Actions 
#> 
#>    A1 A2 A3 A4 A5
#> R1  0 -1 -1  0 -1
#> R2 -1  1 -1 -1 -1
#> R3  0 -1 -1 -1  0
#> R4  0  0  0  0 -1
#> 
#> The considered rule is 4.

The rule with an obligatory action is the row with only -1 and 1 in the matrix above.
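The difference between strong and weak rules shows up in the probability of an action conditional on a rule. The base R sketch below computes these probabilities for the design printed above. The helper cond_prob_row() is hypothetical and ours, and it assumes that a row containing a 1 is a strong rule and that every other row is a weak rule.

```r
# Design with one obligatory action (1 = obligatory, 0 = permitted,
# -1 = forbidden), copied from the printed output above.
design <- matrix(c( 0, -1, -1,  0, -1,
                   -1,  1, -1, -1, -1,
                    0, -1, -1, -1,  0,
                    0,  0,  0,  0, -1),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("R", 1:4), paste0("A", 1:5)))

# Hypothetical helper: Pr(a_j | r_k) for a single rule (one row).
cond_prob_row <- function(rule_row) {
  if (any(rule_row == 1)) {
    # Strong rule: probability 1 for the obligatory action, 0 otherwise.
    as.numeric(rule_row == 1)
  } else {
    # Weak rule: 1 / L for each of the L permitted actions, 0 otherwise.
    (rule_row == 0) / sum(rule_row == 0)
  }
}

# apply() returns one column per rule, so transpose back to rules x actions.
pr_action_given_rule <- t(apply(design, 1, cond_prob_row))
colnames(pr_action_given_rule) <- colnames(design)

print(pr_action_given_rule)
```

Rule R2 puts all its probability on A2, whereas the considered rule R4 spreads it evenly over its four permitted actions.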

Minimum and maximum number of available actions for the considered rule

As our designs become larger, we might want to specify a minimum and maximum number of allowed actions under the considered rule in order to keep the complexity of the choice at a reasonable level. We can easily do this through the options min and max.

design_opt_input <- list(rules = 4,
                         actions = 5,
                         min = 2, 
                         max = 3,
                         obligatory = 1)

design <- generate_designs(design_opt_input)

print_design(design, FALSE)
#> The rules-action matrix 
#> 
#> Rows: Rules 
#> Columns: Actions 
#> 
#>    A1 A2 A3 A4 A5
#> R1  0  0  0 -1 -1
#> R2  0  0 -1  0 -1
#> R3 -1  0 -1  0 -1
#> R4 -1 -1 -1 -1  1
#> 
#> The considered rule is 1.

Minimum number of rules fitting each permitted action conditional on the rule

To vary the difficulty of the game for the observer, we can specify the minimum number of rules that each permitted action fits, conditional on the considered rule. We specify the minimum number of rules using the min_fit option. For example, if we are considering a game with 4 rules and 5 actions and the considered rule permits the decision maker to choose one of two actions, then setting min_fit = 2 means that each of the two actions fits at least two rules, including the considered rule.

design_opt_input <- list(rules = 4,
                         actions = 5,
                         min_fit = 2,
                         obligatory = 1)

design <- generate_designs(design_opt_input)

print_design(design, FALSE)
#> The rules-action matrix 
#> 
#> Rows: Rules 
#> Columns: Actions 
#> 
#>    A1 A2 A3 A4 A5
#> R1  0  0  0 -1 -1
#> R2  0 -1 -1 -1  0
#> R3 -1 -1 -1  1 -1
#> R4 -1 -1  0 -1  0
#> 
#> The considered rule is 4.

Spread of entropy

Given the trial-and-error nature of searching for obfuscation designs, we may by chance end up in a situation where the difference between the entropy maximizing action and the second best is very small. This will make it very difficult for both decision makers and observers to identify the entropy maximizing action. The generate_designs() function includes an option, sd_entropy, that allows us to specify the “spread” of entropy, expressed as the standard deviation of the entropy values.

design_opt_input <- list(rules = 4,
                         actions = 5,
                         sd_entropy = 0.15)

design <- generate_designs(design_opt_input)

print_design(design, FALSE)
#> The rules-action matrix 
#> 
#> Rows: Rules 
#> Columns: Actions 
#> 
#>    A1 A2 A3 A4 A5
#> R1  0  0  0  0 -1
#> R2  0  0 -1  0  0
#> R3 -1  0 -1 -1 -1
#> R4  0 -1  0 -1 -1
#> 
#> The considered rule is 2.

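Calculating the entropy shows that the gap between the entropy maximizing action and the second best is considerably larger than in the first design above, where the two best actions had entropies of 0.477 and 0.469.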

entropy <- calculate_entropy(design)

print_entropy(entropy)
#> Shannon's entropy -- Design  1 
#> 
#>    A1    A2    A3    A4    A5 
#> 0.452 0.377 0.276 0.301 0.000
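The effect of the spread can be quantified directly from the printed entropy values. The base R sketch below compares the gap between the best and second-best action in the first design of this vignette and in the design generated with sd_entropy = 0.15. The helper gap_to_second_best() is hypothetical and ours. Whether the package uses the sample standard deviation, as sd() does, and treats sd_entropy as a lower bound is an assumption on our part.

```r
# Entropy values copied from the printed output in this vignette.
entropy_first  <- c(A1 = 0.292, A2 = 0.469, A3 = 0.000, A4 = 0.477, A5 = 0.301)
entropy_spread <- c(A1 = 0.452, A2 = 0.377, A3 = 0.276, A4 = 0.301, A5 = 0.000)

# Hypothetical helper: difference between the highest and second-highest entropy.
gap_to_second_best <- function(entropy) {
  sorted <- sort(entropy, decreasing = TRUE)
  unname(sorted[1] - sorted[2])
}

print(gap_to_second_best(entropy_first))   # gap in the first design
print(gap_to_second_best(entropy_spread))  # gap in the sd_entropy design

# Sample standard deviation of the entropy values in the sd_entropy design
# (assumption: this is the quantity that sd_entropy refers to).
print(sd(entropy_spread))
```

The gap grows from about 0.008 to about 0.075, which should make the entropy maximizing action easier to identify.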

Multiple designs

Above, we have focused on one-shot games, i.e. we have generated a single design. Often, researchers may want to play repeated games. The obfuscatoR package makes it easy to create multiple designs. Let us create a series of 2 designs by setting designs = 2. We are only using 2 designs here to save space when printing the output.

design_opt_input <- list(rules = 4,
                         actions = 5,
                         designs = 2)

design <- generate_designs(design_opt_input)

print_design(design, TRUE)
#> The rules-action matrix 
#> 
#> Rows: Rules 
#> Columns: Actions 
#> 
#>    A1 A2 A3 A4 A5
#> R1  0  0 -1 -1 -1
#> R2 -1 -1 -1  0 -1
#> R3  0 -1 -1 -1  0
#> R4  0  0  0  0 -1
#> 
#> The considered rule is 3.
#> 
#> The design was found in 3 iterations. 
#> 
#> All the design conditions were met: TRUE
#> 
#>    A1 A2 A3 A4 A5
#> R1  0 -1 -1 -1  0
#> R2 -1  0 -1 -1  0
#> R3 -1  0 -1  0  0
#> R4 -1 -1  0  0 -1
#> 
#> The considered rule is 2.
#> 
#> The design was found in 4 iterations. 
#> 
#> All the design conditions were met: TRUE

To calculate the entropy associated with each action in the designs, we can run the calculate_entropy() function as before, but this time we supply a list of designs.

entropy <- calculate_entropy(design)

print_entropy(entropy)
#> Shannon's entropy -- Design  1 
#> 
#>    A1    A2    A3    A4    A5 
#> 0.458 0.276 0.000 0.217 0.000 
#> 
#> 
#> Shannon's entropy -- Design  2 
#> 
#>    A1    A2    A3    A4    A5 
#> 0.000 0.292 0.000 0.292 0.470

If you wish to print all the information about all designs, as above, you can set print_all = TRUE as shown below. We leave the user to run that code on their own machine given the rather large output.

print_entropy(entropy, print_all = TRUE)

List of design options

We provide the full list of design options and defaults in the table below.

Option | Description | Default | Must be specified
rules | The number of rules | NULL | Yes
actions | The number of actions | NULL | Yes
min | Minimum number of actions available for the considered rule | NA | No
max | Maximum number of actions available for the considered rule | NA | No
min_fit | Minimum number of rules fitting each permitted action conditional on the rule | 0 | No
obligatory | Number of rules with obligatory actions | 0 | No
sd_entropy | Specifies the standard deviation of the entropy values | NA | No
designs | Number of designs to generate | 1 | No
max_iter | Maximum number of iterations before stopping the search for designs | 1e5 | No
seed | A seed for the random number generator | NA | No

A cautionary note

The search for new designs is one of trial and error. If you set too many or too tight restrictions on your designs, you may not be able to find valid designs in a reasonable time, and the search algorithm will continue until it reaches the maximum number of iterations (max_iter) or is stopped.

It is good practice to manually inspect all designs prior to use to ensure that they are indeed of the form you want.

Saving the designs

Once you have created your designs, you might want to save them. We can do this using the save_design() function. The function stores the designs in a .csv file or multiple .csv files if you have generated multiple designs. The function will automatically generate a name of the form “FILE-NAME-rule-X-design-I.csv”, where X is the considered rule used to generate the design, and I is the design number.

save_design(design, "my_designs")

Calculating the payouts

The obfuscatoR package also contains a set of functions that can aid the analyst in determining the expected payout to observers and decision makers from any given design. The participants are incentivized such that it is always in the best interest of the decision maker to choose the entropy maximizing action and, if he succeeds in doing so, the observer should refrain from guessing the rule.

Payout to the observer

The expected payout to the observer consists of two parts: i) the expected payout from guessing, and ii) the expected payout from not guessing. The first expectation depends on the posterior probabilities \(\Pr(r_k|a_j)\) defined in the introduction, and is calculated as:

\[\begin{equation}\label{eqn-payout-obs} \mathrm{E}\left[P\right] = \max_{k}\left\{\Pr(r_k|a_j)\right\}\pi^\mathrm{G} \end{equation}\]

where \(\max_{k}\left\{\Pr(r_k|a_j)\right\}\) is the maximum posterior probability that a specific rule underlies a decision maker’s action, and \(\pi^\mathrm{G}\) is the payout from guessing correctly. If the observer guesses incorrectly, she receives nothing. If the observer refrains from guessing, she will receive \(\pi^\mathrm{NG}\) with certainty.

Payout to the decision maker

The expected payout to the decision maker depends on whether or not the observer chooses to guess. If the decision maker is successful in keeping the observer as clueless as possible as to which rule governs his actions, i.e. the observer refrains from guessing, then the decision maker receives \(\phi\). If the observer decides to guess, the decision maker receives nothing.

The probability that an observer tries to guess is a function of the difference between the expected payout from guessing, \(\mathrm{E}\left[P\right]\), and the payout if she refrains from guessing, \(\pi^\mathrm{NG}\):

\[\begin{equation}\label{eqn-probabilistic} \Pr(\mathrm{G}) = \frac{1}{1 + \exp(-(\mathrm{E}\left[P\right] - \pi^\mathrm{NG}))} \end{equation}\]

Alternatively, we can treat the decision to guess as deterministic and use an indicator function:

\[\begin{equation}\label{eqn-deterministic} \Pr(\mathrm{G}) = \left\{\begin{array}{cl} 1 & \text{if } \mathrm{E}\left[P\right] > \pi^\mathrm{NG} \\ 0 & \text{if } \mathrm{E}\left[P\right] \leq \pi^\mathrm{NG}\\ \end{array} \right. \end{equation}\]

Using either the probabilistic or the deterministic expression for \(\Pr(\mathrm{G})\), we can calculate the expected payout to the decision maker as:

\[\begin{equation}\label{eqn-payout-dm} \mathrm{E}\left[P\right] = (1 - \Pr(\mathrm{G}))\,\phi \end{equation}\]

An example

Let us create a simple design with 4 rules and 5 possible actions, and calculate the entropy of each action. Now, we can use the calculate_payouts() function to calculate the expected payout to both the observer and the decision maker. Please note that the only consequential payouts to the decision maker are for those actions permitted by the considered rule. Below, we have set the option deterministic = FALSE, which means that the probabilities of the observer guessing are calculated using the logistic expression above. If deterministic = TRUE, the probabilities are calculated using the indicator function instead.

design_opt_input <- list(rules = 4,
                         actions = 5)

design <- generate_designs(design_opt_input)
entropy <- calculate_entropy(design)

payout <- calculate_payouts(entropy,
                            pay_obs = 10,
                            pay_no_guess = 5,
                            pay_dm = 5,
                            deterministic = FALSE)

The expected payout to the observer from guessing is calculated for every action based on the highest conditional probability for that action, i.e. what is the expected payout from guessing if she observes any given action. For the decision maker it is the expected payout from choosing any given action. Notice that the expected payout from choosing an action that is prohibited by the considered rule is zero. We can print the calculated probabilities of guessing by setting print_all = TRUE.

print_payout(payout, print_all = TRUE)
#> Payout to the observer -- Design  1 
#> 
#> E[Pay|1] E[Pay|2] E[Pay|3] E[Pay|4] E[Pay|5] 
#>    10.00     6.00     4.29     3.75    10.00 
#> 
#> 
#> Payout to the decision maker -- Design  1 
#> 
#> E[Pay|1] E[Pay|2] E[Pay|3] E[Pay|4] E[Pay|5] 
#>     0.00     1.34     0.00     3.89     0.00 
#> 
#> 
#> Probabilities of guessing -- Design  1 
#> 
#> Pr[G|1] Pr[G|2] Pr[G|3] Pr[G|4] Pr[G|5] 
#>   0.993   0.731   0.329   0.223   0.993
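The printed payouts can be checked by hand with the payout formulas above. In the base R sketch below, the design behind this example is not shown, so the maximum posterior probabilities are back-calculated from the printed observer payouts, and the set of actions permitted by the considered rule (actions 2 and 4) is inferred from the non-zero decision maker payouts. Both are assumptions on our part, and all variable names are ours.

```r
pay_obs      <- 10  # payout to the observer from guessing correctly
pay_no_guess <- 5   # payout to the observer from not guessing
pay_dm       <- 5   # payout to the decision maker if the observer does not guess

# Assumption: maximum posterior per action, back-calculated from the output.
max_posterior <- c(1, 0.6, 3 / 7, 0.375, 1)

# Assumption: only actions 2 and 4 are permitted by the considered rule.
permitted <- c(FALSE, TRUE, FALSE, TRUE, FALSE)

# Expected payout to the observer from guessing after each action.
expected_obs <- max_posterior * pay_obs

# Probability of guessing: logistic (deterministic = FALSE) and
# indicator (deterministic = TRUE) versions.
pr_guess     <- 1 / (1 + exp(-(expected_obs - pay_no_guess)))
pr_guess_det <- as.numeric(expected_obs > pay_no_guess)

# Expected payout to the decision maker; zero for prohibited actions.
expected_dm <- ifelse(permitted, (1 - pr_guess) * pay_dm, 0)

print(round(expected_obs, 2))
print(round(pr_guess, 3))
print(round(expected_dm, 2))
```

Under the deterministic rule, the observer would guess after actions 1, 2 and 5 and refrain after actions 3 and 4.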

References

Chorus, Caspar, Sander van Cranenburgh, Aemiro Melkamu Daniel, Erlend Dancke Sandorf, Anae Sobhani, and Teodóra Szép. 2021. “Obfuscation Maximization-Based Decision-Making: Theory, Methodology and First Empirical Evidence.” Mathematical Social Sciences 109: 28–44. https://doi.org/10.1016/j.mathsocsci.2020.10.002.
Shannon, C. E. 1948. “A Mathematical Theory of Communication.” Bell System Technical Journal 27 (3): 379–423.

  1. University of Stirling, Stirling Management School, Economics Division

  2. Delft University of Technology, Department of Engineering Systems and Services

  3. Delft University of Technology, Department of Engineering Systems and Services

  4. As in Chorus et al. (2021), we refer to the decision maker as he and the observer/onlooker as she.

  5. Note that when we discuss the game, we refer to these motivations and preferences as rules.