In the previous vignette we described the basic features available in the climwin package. Below, we look in more detail at the more advanced features available to users. We will cover:
Testing for climate thresholds.
Dealing with data records that encompass multiple years.
Carrying out spatial replication.
Fitting more complex curves to climate window data using weightwin.
Many studies may be interested in testing climate windows using climatic thresholds. When testing climatic thresholds, we assume that a biological response is driven by the total number of days that surpass a particular climatic value. For example, seed germination may be influenced by the number of days over 30 degrees. Alternatively, temperature may only influence organism survival when it falls below freezing. A common example of such a climate threshold is the use of growing degree days in plant studies.
Statistics like these can be calculated in slidingwin using the three parameters 'upper', 'lower' and 'binary'. When a value is provided for the parameter 'upper', slidingwin will create a new climate dataset where all values equal to or below this threshold are set to 0. Similarly, when a value is set for 'lower', all values equal to or above this threshold are set to 0. When values are provided for both 'upper' and 'lower', only values that fall between the two thresholds keep their original value; all other values are set to 0.
upper = 30
Date | Original temperature | Threshold temperature |
---|---|---|
01/06/2015 | 25.9 | 0 |
02/06/2015 | 24.0 | 0 |
03/06/2015 | 32.5 | 32.5 |
04/06/2015 | 28.1 | 0 |
05/06/2015 | 30.5 | 30.5 |
06/06/2015 | 30.0 | 0 |
07/06/2015 | 31.2 | 31.2 |
08/06/2015 | 27.0 | 0 |
… | … | … |
upper = 30
lower = 25
Date | Original temperature | Threshold temperature |
---|---|---|
01/06/2015 | 25.9 | 25.9 |
02/06/2015 | 24.0 | 0 |
03/06/2015 | 32.5 | 0 |
04/06/2015 | 28.1 | 28.1 |
05/06/2015 | 30.5 | 0 |
06/06/2015 | 30.0 | 0 |
07/06/2015 | 31.2 | 0 |
08/06/2015 | 27.0 | 27.0 |
… | … | … |
In some circumstances we may assume that all values past the climatic threshold have an equally large impact on the biological response. In this case, we would set 'binary' to TRUE so that all non-zero values are set to 1. By default, however, 'binary' is set to FALSE, so that all values past the climatic threshold keep their original value.
upper = 30
binary = TRUE
Date | Original temperature | Threshold temperature |
---|---|---|
01/06/2015 | 25.9 | 0 |
02/06/2015 | 24.0 | 0 |
03/06/2015 | 32.5 | 1 |
04/06/2015 | 28.1 | 0 |
05/06/2015 | 30.5 | 1 |
06/06/2015 | 30.0 | 0 |
07/06/2015 | 31.2 | 1 |
08/06/2015 | 27.0 | 0 |
… | … | … |
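The three tables above can be reproduced with a few lines of base R. The helper below, apply_threshold, is a hypothetical function written only for illustration; it is not part of climwin and is just a sketch of the transformation described, assuming (as in the tables) that values exactly equal to a threshold are set to 0.

```r
# Hypothetical helper (not a climwin function): apply 'upper', 'lower'
# and 'binary' thresholds to a vector of climate values.
apply_threshold <- function(x, upper = NA, lower = NA, binary = FALSE) {
  if (!is.na(upper) && !is.na(lower)) {
    keep <- x > lower & x < upper   # both set: keep values between the two
  } else if (!is.na(upper)) {
    keep <- x > upper               # 'upper' only: keep values above it
  } else if (!is.na(lower)) {
    keep <- x < lower               # 'lower' only: keep values below it
  } else {
    keep <- rep(TRUE, length(x))    # no threshold: keep everything
  }
  # binary = TRUE turns retained values into 1; otherwise they keep their value
  if (binary) as.numeric(keep) else ifelse(keep, x, 0)
}

# Temperatures from the example tables
temp <- c(25.9, 24.0, 32.5, 28.1, 30.5, 30.0, 31.2, 27.0)

apply_threshold(temp, upper = 30)
apply_threshold(temp, upper = 30, lower = 25)
apply_threshold(temp, upper = 30, binary = TRUE)
```

Summing the binary version over a window gives the number of days past the threshold, which is why 'binary' is usually paired with a 'sum' statistic.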
Below we provide a worked example using the Mass and MassClimate datasets. In this example, let's imagine we are interested in testing the impact of the number of days above freezing on our mass response variable. To do this we would set both our 'upper' and 'binary' parameters.
upper = 0
binary = TRUE
As we are interested in measuring the number of days above freezing, we set our 'stat' parameter to 'sum'. All other parameter values are identical to those used in the earlier vignette.
library(climwin)
MassWin <- slidingwin(xvar = list(Temp = MassClimate$Temp),
                      cdate = MassClimate$Date,
                      bdate = Mass$Date,
                      baseline = lm(Mass ~ 1, data = Mass),
                      cinterval = "day",
                      range = c(150, 0),
                      # flag days above 0 degrees as 1, all other days as 0
                      upper = 0, binary = TRUE,
                      type = "absolute", refday = c(20, 5),
                      # sum the flags to count days above freezing in each window
                      stat = "sum",
                      func = "lin")
When we examine the best model data, we can see that our climate data is now count data.
head(MassWin[[1]]$BestModelData)
Yvar | climate |
---|---|
140 | 0 |
138 | 0 |
136 | 1 |
135 | 2 |
134 | 0 |
134 | 0 |
Many long-term datasets that are suitable for climwin are likely to be measured during the Northern hemisphere spring/summer (e.g. breeding data). These biological records are collected in the middle of the year, so they can be easily grouped by year. In other circumstances, however, biological measurements will fall across two years, particularly in Southern hemisphere species where spring/summer spans the new year period [e.g., 1].
This can cause issues when fitting 'absolute' climate windows. As described in the introductory vignette, 'absolute' windows use a set reference day for all biological records. Where biological measurements cross two years, however, measurements from the same season can be split up. In the table below, where a reference day of November 1st is used, all measurements taken at the start of the breeding season are given a reference date of November 1st 2014, while all measurements following the new year are given November 1st 2015. This is obviously unrealistic, as biological measurements in January 2015 cannot be affected by climatic conditions that occurred up to 11 months later.
Date | Reference Date |
---|---|
05/11/2014 | 01/11/2014 |
10/11/2014 | 01/11/2014 |
01/12/2014 | 01/11/2014 |
12/12/2014 | 01/11/2014 |
01/01/2015 | 01/11/2015 |
07/01/2015 | 01/11/2015 |
As a solution, climwin includes a 'cohort' parameter that allows users to specify which biological measurements should be grouped together (e.g. when they are from the same breeding season). Each biological record should be given a cohort level (see below), which is taken into account when setting the reference day for climate window analyses.
Date | Cohort | Reference Date |
---|---|---|
05/11/2014 | 2014 | 01/11/2014 |
10/11/2014 | 2014 | 01/11/2014 |
01/12/2014 | 2014 | 01/11/2014 |
12/12/2014 | 2014 | 01/11/2014 |
01/01/2015 | 2014 | 01/11/2014 |
07/01/2015 | 2014 | 01/11/2014 |
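A cohort variable like the one in the table above can be derived directly from the record dates. The sketch below uses base R only. The cut-off month (July) is an assumption chosen for illustration: records from January to June are assigned to the previous year's cohort. The right cut-off depends on the study system.

```r
# Biological record dates from the table above
bdate <- as.Date(c("05/11/2014", "10/11/2014", "01/12/2014",
                   "12/12/2014", "01/01/2015", "07/01/2015"),
                 format = "%d/%m/%Y")

yr  <- as.integer(format(bdate, "%Y"))
mon <- as.integer(format(bdate, "%m"))

# Assumed rule: a season never starts before July, so records from
# January-June belong to the cohort that started the previous year.
cohort <- ifelse(mon < 7, yr - 1L, yr)

data.frame(Date = bdate, Cohort = cohort)

# The vector could then be supplied through the 'cohort' parameter,
# e.g. slidingwin(..., cohort = cohort)
```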
Detecting climate signals using climwin often requires large amounts of data, particularly if the relationship between climate and biological response is weak [2]. Obtaining the required data through temporal replication can require data collection over multiple years, often decades; however, spatial replication may allow users to expand their sample size over a shorter period by collecting data from multiple sites/populations.
Using spatial replication assumes that the relationship between the biological response and climatic predictor is consistent across the different measured populations. Where this assumption is valid, spatial replication can help expand the amount of data available for climwin analyses.
Spatially replicated data can be analysed using the slidingwin function with the addition of the 'spatial' parameter. As with regular slidingwin analysis, analysis with spatial replication requires a separate biological and climate dataset. However, these datasets should now contain an additional variable that specifies the site at which the biological and climate data were collected. Below, we have called this variable 'SiteID'.
Date | Mass (g) | SiteID |
---|---|---|
04/06/2015 | 120 | A |
05/06/2015 | 123 | A |
07/06/2015 | 110 | B |
07/06/2015 | 140 | A |
06/06/2015 | 138 | B |
… | … | … |
Date | Temperature | SiteID |
---|---|---|
01/06/2015 | 15 | A |
02/06/2015 | 16 | A |
03/06/2015 | 12 | A |
04/06/2015 | 18 | A |
05/06/2015 | 20 | A |
06/06/2015 | 23 | A |
07/06/2015 | 21 | A |
01/06/2015 | 10 | B |
02/06/2015 | 12 | B |
03/06/2015 | 9 | B |
04/06/2015 | 5 | B |
05/06/2015 | 13 | B |
06/06/2015 | 10 | B |
07/06/2015 | 11 | B |
… | … | … |
NOTE: The climate dataset for spatially replicated climwin analysis will often include duplicated dates. In a regular climwin analysis this will lead to errors.
With these new datasets, we can carry out a slidingwin analysis with the addition of a 'spatial' parameter.
MassWin <- slidingwin(xvar = list(Temp = Climate$Temp),
                      cdate = Climate$Date,
                      bdate = Biol$Date,
                      baseline = lm(Mass ~ 1, data = Biol),
                      cinterval = "day",
                      range = c(150, 0),
                      type = "absolute", refday = c(20, 5),
                      stat = "mean",
                      func = "lin",
                      # site identifiers: biological data first, climate data second
                      spatial = list(Biol$SiteID, Climate$SiteID))
The 'spatial' parameter is a list that contains the SiteID variable for the biological and climate datasets, in that order. When slidingwin fits individual climate windows, the climate data are subset so that each biological record is matched with the climate data from its own site.
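To make the site-matching step concrete, the sketch below rebuilds the two example tables in base R and calculates, for each biological record, the mean temperature at the same site over a short window. It is not the code that slidingwin runs internally: the helper window_mean and the 3-day window are hypothetical and chosen only because the example data cover a single week.

```r
# Example climate and biological data from the tables above
Climate <- data.frame(
  Date   = rep(as.Date("2015-06-01") + 0:6, times = 2),
  Temp   = c(15, 16, 12, 18, 20, 23, 21, 10, 12, 9, 5, 13, 10, 11),
  SiteID = rep(c("A", "B"), each = 7),
  stringsAsFactors = FALSE
)
Biol <- data.frame(
  Date   = as.Date(c("2015-06-04", "2015-06-05", "2015-06-07",
                     "2015-06-07", "2015-06-06")),
  Mass   = c(120, 123, 110, 140, 138),
  SiteID = c("A", "A", "B", "A", "B"),
  stringsAsFactors = FALSE
)

# Dates repeat across sites, but each site/date combination is unique
dup_dates      <- any(duplicated(Climate$Date))
dup_site_dates <- any(duplicated(Climate[, c("Date", "SiteID")]))

# Hypothetical helper: mean temperature at one site over the 'days' days
# up to and including the record date
window_mean <- function(date, site, days = 3) {
  in_window <- Climate$SiteID == site &
    Climate$Date <= date & Climate$Date > date - days
  mean(Climate$Temp[in_window])
}

Biol$Temp3 <- sapply(seq_len(nrow(Biol)), function(i) {
  window_mean(Biol$Date[i], Biol$SiteID[i])
})
Biol
```

Records taken on the same day at different sites (here 07/06/2015 at sites A and B) receive different climate values, which is the behaviour the 'spatial' parameter is designed to provide.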
weightwin function
When we run regular slidingwin analyses we assume that all days within the climate window are evenly weighted. While this is often a convenient assumption, it may be biologically unrealistic, as it creates a strict cut-off for when climate data is considered (see below).
In certain cases we may be interested in looking for climate windows where the importance of climate decays slowly over time. The function weightwin allows users to fit either Weibull (below left) or generalised extreme value (GEV; below right) weight distributions to climate data.
Instead of varying the start and end date of climate windows like slidingwin, weightwin uses an optimisation function to vary the shape, scale and location of either of these weight functions. Each weight function is used to weight the climate data, and the weighted climate data is then used to fit a climate model and calculate a delta AICc value. Therefore, although the method of optimising climate data is different, the ultimate output (i.e. \(\Delta AICc\) of a climate window compared to a null model) is the same.
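The idea of weighting climate data with a distribution can be sketched in base R. The example below is an illustration only: the rescaling of days onto the interval (0, 1], the use of dweibull, and the simulated temperatures are all assumptions made for this sketch and may differ from the scaling weightwin applies internally.

```r
# Days before the reference day, rescaled to (0, 1] (assumed scaling)
lag_days <- 0:150
x <- (lag_days + 1) / length(lag_days)

# Weibull weights for one candidate set of parameters
shape <- 3; scale <- 0.2; location <- 0
w <- dweibull(x - location, shape = shape, scale = scale)
w <- w / sum(w)   # normalise so the weights sum to 1

# Even weighting across the whole range, for comparison
u <- rep(1 / length(lag_days), length(lag_days))

# Simulated daily temperatures (made-up data, for illustration only)
set.seed(1)
temp <- rnorm(length(lag_days), mean = 10, sd = 3)

# One climate value per biological record: weighted vs. evenly weighted mean
c(weighted = sum(w * temp), uniform = sum(u * temp))

# Day (before the reference day) that receives the largest weight
lag_days[which.max(w)]
```

An optimisation routine would repeat this calculation for many combinations of shape, scale and location, refitting the climate model each time and keeping the combination with the lowest AICc.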
weightwin is often most useful for climate data in which we have already identified a climate window using slidingwin. Here, we will use weightwin to further investigate the Mass and MassClimate data included with the climwin package.
The basic parameters in weightwin are the same as in slidingwin (though note the absence of the 'stat' parameter). In addition, however, we must designate which weight distribution we want to use. In this case we consider a Weibull function.
weightfunc = "W"
Next, we must set the shape, scale and location values for the starting distribution that will be used to begin the optimisation procedure. The default values (3, 0.2, 0) are often appropriate for fitting Weibull distributions. However, you can explore different parameter values using the explore function.
weight <- weightwin(xvar = list(Temp = MassClimate$Temp),
                    cdate = MassClimate$Date,
                    bdate = Mass$Date,
                    baseline = lm(Mass ~ 1, data = Mass),
                    range = c(150, 0),
                    func = "lin", type = "absolute",
                    refday = c(20, 5),
                    weightfunc = "W", cinterval = "day",
                    # starting values for the optimisation
                    par = c(3, 0.2, 0))
As part of the weightwin function a plot will be generated showing the progress of the optimisation function. Most of this information is useful for assessing the effectiveness of the optimisation function. For our purposes, however, we will focus only on the final weighted window function (top left).
In the above plot, we can see that the importance of temperature declines rapidly as we near May 20th. However, temperature later in time declines less rapidly. We can extract the \(\Delta AICc\) value for this weight function below.
weight$WeightedOutput$deltaAICc
If we compare this value to the delta AICc obtained from the slidingwin function we can see that the weightwin function is better able to explain variation in our mass variable.
slidingwin | weightwin |
---|---|
-64.81 | -68.22 |
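The values in the table above are differences in AICc between a climate model and the null (baseline) model, so more negative values indicate stronger support for the climate model. The sketch below shows the calculation in base R using simulated data. The helper aicc is a hypothetical function that applies the standard small-sample correction to AIC; it is an illustration and not the routine climwin itself calls.

```r
# Hypothetical helper: AICc = AIC + 2k(k + 1) / (n - k - 1)
aicc <- function(model) {
  k <- attr(logLik(model), "df")   # number of estimated parameters
  n <- nobs(model)                 # number of observations
  AIC(model) + (2 * k * (k + 1)) / (n - k - 1)
}

# Simulated data (made up for illustration): mass declines with temperature
set.seed(42)
climate <- rnorm(40, mean = 10, sd = 2)
mass    <- 130 - 1.5 * climate + rnorm(40, sd = 2)

null_model    <- lm(mass ~ 1)         # baseline model, no climate
climate_model <- lm(mass ~ climate)   # model including the climate variable

# Negative values mean the climate model is better supported than the null
delta_aicc <- aicc(climate_model) - aicc(null_model)
delta_aicc
```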
weightwin can provide greater detail on the relationship between climate and the biological response, such as the occurrence of exponential functions. Additionally, by using more diverse weight distributions, weightwin will often generate models with better \(\Delta AICc\) values, which may be especially important when users are most interested in achieving high explanatory power. Furthermore, by using an optimisation routine weightwin often tests far fewer models than slidingwin, allowing for more rapid analysis.
Despite these benefits, weightwin will not always be the most appropriate function for all scenarios. Firstly, the nature of the fitted weight distributions means that weightwin can only detect single climate signals, which forces users to detect and compare potential climate signals with separate analyses.
Secondly, weightwin can be a more technical process. While the above example works easily, optimisation procedures can get stuck on false optima or fail to converge. In these cases, users may need to test different starting parameters and adjust optimisation characteristics such as step size. This procedure can often be prohibitive for users with less technical knowledge.
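Why starting values matter can be seen with a toy example that is unrelated to climate data. The sketch below uses the base R function optim on a made-up objective function with two minima and runs the optimiser from two different starting values. The objective function and starting values are assumptions chosen purely for illustration.

```r
# Toy objective with a local minimum near p = 1 and a lower (global)
# minimum near p = -1
objective <- function(p) (p^2 - 1)^2 + 0.3 * p

# Run the same optimiser from two different starting values
starts <- c(-0.8, 0.8)
fits <- lapply(starts, function(s) {
  optim(par = s, fn = objective, method = "BFGS")
})

# Compare where each run ended up and keep the best one
values <- sapply(fits, function(f) f$value)
best   <- fits[[which.min(values)]]

data.frame(start    = starts,
           estimate = sapply(fits, function(f) f$par),
           value    = values)
best$par
```

Runs that begin in different places can settle on different optima, so comparing several starting values and keeping the fit with the lowest objective value is a simple safeguard. The same logic applies when trying alternative starting parameters in weightwin.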
Finally, weightwin can only be used for testing mean climate, with no capacity to consider other aggregate statistics. Therefore, whether one chooses to use weightwin or slidingwin will largely depend on the summary statistic used, the level of detail desired, and one's technical knowledge.
To ask additional questions or report bugs please e-mail: