Introduction to segtest

library(segtest)

The segtest package contains a suite of functions to test for and evaluate segregation distortion in F1 populations of tetraploids. We allow for various types of polyploids (auto, allo, and segmental) without requiring the user to specify the type of polyploid they are studying. We also account for genotype uncertainty through the use of genotype likelihoods, which can be obtained from many genotyping programs (such as updog, fitpoly, and polyRAD). Details of these methods may be found in Gerard et al. (2024).

Here, we will demonstrate some of our functions: offspring_gf_2() and offspring_gf_3() for offspring genotype frequencies, simf1g() and simf1gl() for simulating genotype counts and genotype likelihoods, and lrt_men_g4() and lrt_men_gl4() for likelihood ratio tests using genotype counts and genotype likelihoods, respectively.

Offspring genotype frequencies

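We can obtain offspring genotype frequencies via offspring_gf_2() and offspring_gf_3(). These are two different parameterizations of the same model for meiosis. offspring_gf_3() takes the meiosis parameters tau, beta, gamma1, and gamma2, along with the parental genotypes p1 and p2; see the function's help page for their definitions.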

Let’s generate some example genotype frequencies. You can play around with the parameter values yourself.

gf <- offspring_gf_3(
  tau = 1, 
  beta = 1/6, 
  gamma1 = 1/3,
  gamma2 = 1/3, 
  p1 = 1,
  p2 = 2)
plot(
  x = 0:4,
  y = gf,
  type = "h",
  xlab = "Genotype", 
  ylab = "Frequency",
  ylim = c(0, 1))

Figure: A probability mass function of genotypes of a tetraploid F1 population.

The offspring_gf_3() function is safer to use. In offspring_gf_2(), the preferential pairing parameter and the double reduction rate depend on each other, which bounds the values they can jointly take, so in the two-parameter model you might accidentally choose a combination of values that is impossible. We did not set up checks for these values because the bounds depend on the maximum rate of double reduction, which can vary significantly. Please see Gerard et al. (2024) for details.
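As a point of reference, the simplest special case can be worked out in base R. The sketch below assumes no double reduction and no preferential pairing (random bivalent pairing), in which case each parent's gamete dosage is hypergeometric and the offspring genotype is the sum of two independent gamete dosages. The helpers gamete_freq() and offspring_freq_simple() are hypothetical names used only for illustration. They are not part of segtest, and they do not reproduce the more general model behind offspring_gf_3().

```r
# Gamete dosage probabilities (0, 1, or 2 copies) for a tetraploid parent
# with dosage g. ASSUMPTION: random bivalent pairing, no double reduction.
gamete_freq <- function(g) {
  stopifnot(g %in% 0:4)
  # Draw 2 of the parent's 4 chromosome copies, g of which carry the allele.
  stats::dhyper(x = 0:2, m = g, n = 4 - g, k = 2)
}

# Offspring genotype frequencies for dosages 0..4: the distribution of the
# sum of two independent gamete dosages, one from each parent.
offspring_freq_simple <- function(p1, p2) {
  q1 <- gamete_freq(p1)
  q2 <- gamete_freq(p2)
  gf <- numeric(5)
  for (i in 0:2) {
    for (j in 0:2) {
      gf[i + j + 1] <- gf[i + j + 1] + q1[i + 1] * q2[j + 1]
    }
  }
  gf
}

gf_simple <- offspring_freq_simple(p1 = 1, p2 = 2)
gf_simple
```

For parents with dosages 1 and 2, this gives frequencies 1/12, 5/12, 5/12, 1/12, and 0 for offspring dosages 0 through 4. Comparing these with the output of offspring_gf_3() under other parameter values shows how double reduction and preferential pairing shift the frequencies away from this baseline.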

When the null is true

We’ll first simulate some data where the null of no segregation distortion is true.

set.seed(1)
g1 <- 1
g2 <- 2
alpha <- 1/6
xi1 <- 1/3
xi2 <- 1/3
n <- 20
rd <- 10
x <- simf1g(
  n = n, 
  g1 = g1, 
  g2 = g2, 
  alpha = alpha, 
  xi1 = xi1, 
  xi2 = xi2)
gl <- simf1gl(
  n = n, 
  rd = rd, 
  g1 = g1,
  g2 = g2, 
  alpha = alpha, 
  xi1 = xi1,
  xi2 = xi2)

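The likelihood ratio tests (LRTs) give large \(p\)-values, both when using genotype counts (lrt_men_g4()) and when using genotype likelihoods (lrt_men_gl4()), which is appropriate since there is no segregation distortion.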

lout <- lrt_men_g4(x = x, g1 = g1, g2 = g2)
lout$p_value
#> [1] 0.5698342
lout_gl <- lrt_men_gl4(gl = gl, g1 = g1, g2 = g2)
lout_gl$p_value
#> [1] 0.6369666

When the alternative is true

When we instead simulate genotype counts where the alternative is true, here by drawing them uniformly over the five possible genotypes, we get a very small \(p\)-value.

x <- c(stats::rmultinom(n = 1, size = 20, prob = rep(1/5, 5)))
lout <- lrt_men_g4(x = x, g1 = g1, g2 = g2)
lout$p_value
#> [1] 3.5306e-07
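To make the idea behind a likelihood ratio test concrete, here is a minimal base R sketch of a much simpler version: a test of genotype counts against a single, fully specified vector of null frequencies. This is not what lrt_men_g4() does. The segtest tests do not require the user to fix the double reduction and preferential pairing parameters, whereas this sketch assumes the null frequencies are known exactly. The function name lrt_fixed_null() and the vector q0 are hypothetical and used only for illustration.

```r
# Likelihood ratio (G) test of genotype counts x against fixed frequencies q.
# ASSUMPTION: q is fully known; no nuisance parameters are estimated.
lrt_fixed_null <- function(x, q) {
  stopifnot(length(x) == length(q), all(x >= 0), all(q >= 0))
  df <- sum(q > 0) - 1
  # An observed genotype that is impossible under the null has likelihood 0.
  if (any(x > 0 & q == 0)) {
    return(list(statistic = Inf, df = df, p_value = 0))
  }
  keep <- x > 0  # cells with zero counts contribute nothing to the statistic
  expected <- sum(x) * q
  stat <- 2 * sum(x[keep] * log(x[keep] / expected[keep]))
  list(
    statistic = stat,
    df = df,
    p_value = stats::pchisq(stat, df = df, lower.tail = FALSE)
  )
}

# Null frequencies for a dosage 1 x dosage 2 cross, ASSUMING random bivalent
# pairing and no double reduction (dosages 0 through 4).
q0 <- c(1, 5, 5, 1, 0) / 12

set.seed(1)
x_null <- c(stats::rmultinom(n = 1, size = 20, prob = q0))
x_alt <- c(stats::rmultinom(n = 1, size = 20, prob = rep(1 / 5, 5)))

lrt_fixed_null(x_null, q0)$p_value
lrt_fixed_null(x_alt, q0)$p_value
```

Because dosage 4 has probability zero under these null frequencies, any dataset containing a dosage 4 offspring is rejected outright with a \(p\)-value of 0. With only 20 offspring the chi-squared approximation is rough, so this sketch is meant to convey the logic of the test rather than to serve as a substitute for the package's functions.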

References

Gerard D, Thakkar M, & Ferrão LFV (2024). “Tests for segregation distortion in tetraploid F1 populations.” bioRxiv. doi:10.1101/2024.02.07.579361.