The “rNeighborQTL” package includes core functions to perform QTL mapping of neighbor effects. Taking conditional genotype probabilities from the “R/qtl” package (Broman et al. 2003), the “scan_neighbor()” calculates neighbor genotypic identity and performs interval mapping of neighbor QTL effects. The neighbor QTL requires spatial information, namely individual positions along the x- and y-axes, in addition to the genotype and phenotype data. See Sato, Takeda & Nagano (2021) for the theoretical background.
First, let us prepare input data using the “R/qtl” package (Broman et al. 2003). Here is an example to import .csv files into a ‘cross’ object and calculate conditional self-genotype probabilities. In this example, we import insect herbivory data on Col x Kas recombinant inbred lines (RILs) of Arabidopsis thaliana (Wilson et al. 2001; Sato, Takeda & Nagano 2021).
<- qtl::read.cross(format="csvs",dir="../inst",
colkas genfile="ColKas_geno.csv",
phefile = "ColKas_pheno.csv",
na.strings = c("_"), estimate.map=TRUE, crosstype = "riself"
)#> --Read the following data:
#> 126 individuals
#> 26 markers
#> 10 phenotypes
#> --Estimating genetic map
#> --Cross type: riself
<- qtl::calc.genoprob(colkas, step=2) colkas_genoprob
Prior to the genome scan, we estimate the ‘scale’ argument. Using linear mixed models implemented in the “gaston” package (Perdry & Dandine-Roulland 2020), the “calc_pve()” computes proportion of phenotypic variation (PVE) by neighbor effects for a series of spatial scales. Based on the PVE, we calculate \(\Delta\)PVE metric and seek the scale \(s\) that gives an argument for the maximum of \(\Delta\)PVE.
library(rNeighborQTL)
<- colkas$pheno[,2]
x <- colkas$pheno[,3]
y <- data.frame(x,y)
smap_colkas
<- quantile(dist(smap_colkas),c(0.1*(1:10)))
s_seq <- calc_pve(genoprobs=colkas_genoprob,
colkas_pve pheno=log(colkas$pheno[,5]+1),
smap=smap_colkas, s_seq=s_seq,
addcovar=as.matrix(colkas$pheno[,7:9])
)#> scale = 2.236
#> scale = 3.162
#> scale = 4.243
#> EM step failed to improve likelihood (this should not happen)
#> scale = 5.099
#> scale = 6
#> scale = 7
#> scale = 7.81
#> scale = 8.602
#> scale = 10.05
#> scale = 15
Similar to Haley-Knott regression (Haley & Knott 1992), the additive and dominance deviation \(a\) and \(d\) are estimated using a linear or quadratic regression on neighbor genotypic identity. The “eff_neighbor()” estimates the coefficients for self and neighbor effects, and plots the results as follows.
<- eff_neighbor(genoprobs=colkas_genoprob,
colkas_eff pheno=log(colkas$pheno[,5]+1),
smap=smap_colkas, scale=7,
addcovar=as.matrix(colkas$pheno[,7:9])
)
Lastly, we perform a genome scan to obtain LOD scores for neighbor QTL effects. The “scan_neighbor()” calculates likelihoods using the estimated QTL effects through the “eff_neighbor()”. The results are drawn by “plot_nei()”.
<- scan_neighbor(genoprobs=colkas_genoprob,
colkas_scan pheno=log(colkas$pheno[,5]+1),
smap=smap_colkas, scale=7,
addcovar=as.matrix(colkas$pheno[,7:9])
)plot_nei(colkas_scan)
In addition to the genome scan, we can perform permutation tests to estimate a genome-wide significance level. Such permutation tests better account data structure, but require much computational time. This is an example code for 3-times permutations.
<- perm_neighbor(genoprobs=colkas_genoprob,
colkas_perm pheno=log(colkas$pheno[,5]+1),
smap=smap_colkas, scale=7,
addcovar=as.matrix(colkas$pheno[,6:8]),
times=3, p_val=c(0.5,0.1)
)
The “scan_neighbor()” also provides LOD scores for self QTL effects. This gives the same results as the Haley-Knott regression of standard QTL mapping.
plot_nei(colkas_scan, type="self")
<- qtl::scanone(colkas_genoprob,
colkas_scanone pheno.col=log(colkas$pheno$holes+1),
addcovar=as.matrix(colkas$pheno[,7:9]),
method="hk")
plot(colkas_scanone)
The “addQTL” argument allows us to include non-focal QTLs as covariates. This option enables composite interval mapping (Jensen et al. 1993) that considers additional QTL effects. Here is an example code using the Col x Kas herbivory data, with the nga8 marker considered a covariate.
<- scan_neighbor(genoprobs=colkas_genoprob,
colkas_cim pheno=log(colkas$pheno[,5]+1),
smap=smap_colkas, scale=7,
addcovar=as.matrix(colkas$pheno[,7:9]),
addQTL="c4_nga8"
)plot_nei(colkas_cim)
For the analysis of epistasis, the “int_neighbor()” calculate LOD score of two-way interactions between a focal marker and the others. Here is an example code for the ‘nga8’ marker in the Col x Kas herbivory data.
<- int_neighbor(genoprobs=colkas_genoprob,
colkas_int pheno=log(colkas$pheno[,5]+1),
smap=smap_colkas, scale=7,
addcovar=as.matrix(colkas$pheno[,7:9]),
addQTL="c4_nga8", intQTL="c4_nga8"
)
plot_nei(colkas_int, type="int")
The “response” argument allows us to analyze “binary” phenotypes as well as “quantitative” traits. This argument calls logistic (mixed) models internally (Faraway 2016; Chen et al. 2016). The “calc_pve()” yields the ratio of phenotypic variation explained (RVE) by neighbor effects as RVEnei =\(\sigma^2_2/\sigma^2_1\) when “binary” traits are analyzed, because the logistic mixed model does not compute \(\sigma^2_e\) (Perdry & Dandine-Roulland 2020). Here is an example code for the analysis of the presence or absence of bolting in Col x Kas RILs.
<- quantile(dist(smap_colkas),c(0.1*(1:10)))
s_seq <- calc_pve(genoprobs=colkas_genoprob,
colkas_pveBin pheno=colkas$pheno[,7],
smap=smap_colkas, s_seq=s_seq,
response="binary", addcovar=as.matrix(colkas$pheno[,8:9]),
fig=TRUE)
#> scale = 2.236
#> scale = 3.162
#> scale = 4.243
#> scale = 5.099
#> scale = 6
#> scale = 7
#> scale = 7.81
#> scale = 8.602
#> scale = 10.05
#> scale = 15
<- scan_neighbor(genoprobs=colkas_genoprob,
colkas_scanBin pheno=colkas$pheno[,7],
smap=smap_colkas, scale=2.24,
addcovar=as.matrix(colkas$pheno[,8:9]),
response="binary")
plot_nei(colkas_scanBin)
The neighbor QTL package is able to handle AB heterozygotes. It also works even when there are only AA or AB genotypes. However, sex chromosomes are not supported currently, and should be excluded before the genome scan. This is a simulation using F2 or backcross lines implemented in the “R/qtl” package.
#F2 lines
set.seed(1234)
data("fake.f2",package="qtl")
<- subset(fake.f2, chr=1:19)
fake_f2 <- cbind(runif(qtl::nind(fake_f2),1,100),runif(qtl::nind(fake_f2),1,100))
smap_f2 <- qtl::calc.genoprob(fake_f2,step=2)
genoprobs_f2 <- quantile(dist(smap_f2),c(0.1*(1:10)))
s_seq
<- sim_nei_qtl(genoprobs_f2, a2=0.5, d2=0.5,
nei_eff smap=smap_f2,
scale=s_seq[1], n_QTL=1
)
<- calc_pve(genoprobs=genoprobs_f2,
pve_f2 pheno=nei_eff$nei_y,
smap=smap_f2, s_seq=s_seq[1:5],
addcovar=as.matrix(cbind(fake_f2$pheno$sex,fake_f2$pheno$pgm)),
fig=FALSE)
#> scale = 19.368
#> scale = 28.245
#> scale = 35.695
#> scale = 42.909
#> scale = 49.961
<- pve_f2[-1,3] - c(0,pve_f2[1:4,3])
deltaPVE <- s_seq[1:5][deltaPVE==max(deltaPVE)]
argmax_s
<- scan_neighbor(genoprobs=genoprobs_f2,
scan_f2 pheno=nei_eff$nei_y,
smap=smap_f2, scale=argmax_s,
addcovar=as.matrix(cbind(fake_f2$pheno$sex,fake_f2$pheno$pgm))
)
plot_nei(scan_f2)
#backcross lines
set.seed(1234)
data("fake.bc",package="qtl")
<- subset(fake.bc, chr=1:19)
fake_bc <- cbind(runif(qtl::nind(fake_bc),1,100),runif(qtl::nind(fake_bc),1,100))
smap_bc <- qtl::calc.genoprob(fake_bc,step=2)
genoprobs_bc <- quantile(dist(smap_bc),c(0.1*(1:10)))
s_seq
<- sim_nei_qtl(genoprobs_bc, a2=0.3, d2=-0.3,
nei_eff smap=smap_bc,
scale=s_seq[1], n_QTL=1)
<- calc_pve(genoprobs=genoprobs_bc,
pve_bc pheno=nei_eff$nei_y,
smap=smap_bc, s_seq=s_seq[1:5],
addcovar=as.matrix(cbind(fake_bc$pheno$sex,fake_bc$pheno$age)),
fig=FALSE)
#> scale = 19.256
#> scale = 28.487
#> scale = 36.266
#> scale = 43.479
#> scale = 50.618
<- pve_bc[-1,3] - c(0,pve_bc[1:4,3])
deltaPVE <- s_seq[1:5][deltaPVE==max(deltaPVE)]
argmax_s
<- scan_neighbor(genoprobs=genoprobs_bc,
scan_bc pheno=nei_eff$nei_y,
smap=smap_bc, scale=argmax_s,
addcovar=as.matrix(cbind(fake_bc$pheno$sex,fake_bc$pheno$age))
)
plot_nei(scan_bc)