Trends in abundance are usually estimated from data collected across multiple years. Here we describe methods suitable for the analysis of such data. See Estimating abundance trends and population change from monitoring data (2 years of sampling) for methods that allow the detection of change between two reporting periods.
Trends in abundance can be estimated with different types of data: presence/absence data, abundance class data, count data (relative numbers), or absolute numbers of individuals. As precise counts are often difficult to achieve, abundance classes may need to be used. Abundance classes of the form 1, 2, few, many also can be used to estimate trends but are more difficult to handle. Rarely, all individuals can be counted, but absolute abundance (population size) may be estimated with statistical methods. However, for most biodiversity monitoring purposes count data are sufficient.
Numerous methods have been suggested to estimate trends from monitoring data and opinions about adequate methods differ. Available methods can be divided into non-parametric and parametric methods. Provided that assumptions of parametric methods can be met, they are preferable to non-parametric approaches because they have higher power and allow a quantification of the strength of trends. Here we provide recommendations for the analysis of different types of data, briefly discuss underlying assumptions relevant for the selection of appropriate methods, and link to pages that describe the use of the methods.
All methods require that detection probability is constant or that it is estimated and accounted for in the estimates. Alternatively, detection probability may be modelled as a function of external variables, such as habitat type or observer identity. A further alternative may be to split the data into groups for which detection probability is constant both spatially and temporally. For example, one may split data by habitat type and then estimate trends separately for each habitat type. This alternative requires that detection probability remains constant in time. The separate trend estimates can be combined afterwards (Integration of monitoring schemes).
Non-parametric methods make fewer assumptions than parametric methods. Whereas abundance indices (e.g., counts, percentage of occupied sites) must be linearly related to absolute abundance for parametric methods, non-parametric methods require only that the index increases or decreases monotonically with absolute abundance. Furthermore, trends do not need to be linear for non-parametric methods, whereas this must be the case for most parametric models. Nor do non-parametric methods require the assumption of parametric models that the data follow a normal or log-normal distribution.
For both types of methods, counts should be independent of each other. However, parametric methods exist for serially correlated count data as well.
The power of non-parametric methods is usually substantially lower than that of parametric methods, and the conclusions may be rather conservative, i.e., changes in abundance frequently go undetected. Moreover, in contrast to parametric methods, only the direction of a trend can be estimated, not its magnitude. Therefore, parametric methods should be preferred whenever their assumptions can be met. Similarly, presence/absence data tend to have lower power to detect trends than count data (Strayer 1999), though further simulations are required before this can be generalized to all types of presence/absence surveys.
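As an illustration of a non-parametric trend test, the following minimal sketch implements the Mann-Kendall test. The text above does not name a specific test, so this choice is an assumption, as are the function name and the example counts. The sketch ignores ties and uses the normal approximation, so it is only a rough guide for short series. In line with the text, it indicates the direction of a trend and its significance, not its magnitude.

```python
import math


def mann_kendall(values):
    """Mann-Kendall test for a monotonic trend in a time-ordered series.

    Returns (s, z, p): the S statistic, the z score from the normal
    approximation, and the two-sided p-value.
    Assumption: no tied values (no tie correction is applied).
    """
    n = len(values)
    s = 0
    # S = number of later-minus-earlier increases minus number of decreases.
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    # Continuity correction of one unit towards zero.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return s, z, p


# Hypothetical yearly counts, for illustration only.
counts = [50, 47, 45, 44, 40, 38, 35, 33, 30, 28]
s, z, p = mann_kendall(counts)
print(f"S = {s}, z = {z:.2f}, p = {p:.5f}")
```

A negative S indicates a decline and a positive S an increase.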
To determine trends from presence/absence data, the percentage of occupied sites needs to be calculated first and then regressed against time with non-parametric methods or with parametric models. We recommend using either standard linear regression of the percentage of occupied sites against time after arcsine transformation of the percentage data, or logit regression. If detection probability is not constant, we advise using occupancy models to estimate detection probability and occupancy rate. This is possible only if surveys were repeated at least once within a year (see Designing surveys to account for detectability).
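The first route described above can be sketched as follows: compute the fraction of occupied sites per year, transform it, and regress it on year. The sketch assumes that the arcsine transformation is meant in its usual arcsine-square-root form and that detection probability is constant. All function names and the records are hypothetical.

```python
import math


def occupied_fraction(site_records):
    """Fraction of surveyed sites where the species was recorded (1) rather than not (0)."""
    return sum(site_records) / len(site_records)


def arcsine_sqrt(p):
    """Arcsine square-root transformation of a proportion in [0, 1]."""
    return math.asin(math.sqrt(p))


def ols_slope(x, y):
    """Slope of the ordinary least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical presence/absence records for eight sites in five years.
records = {
    2008: [1, 1, 1, 1, 0, 1, 1, 1],
    2009: [1, 1, 1, 0, 0, 1, 1, 1],
    2010: [1, 0, 1, 0, 0, 1, 1, 1],
    2011: [1, 0, 1, 0, 0, 1, 0, 1],
    2012: [1, 0, 0, 0, 0, 1, 0, 1],
}
years = sorted(records)
transformed = [arcsine_sqrt(occupied_fraction(records[y])) for y in years]
slope = ols_slope(years, transformed)
print(f"slope of transformed occupancy per year: {slope:.4f}")
```

The slope is on the transformed scale, so its sign shows the direction of the trend but it is not a change in percentage points per year.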
Abundance class data
Often indices of abundance, e.g. the number of calling individuals or the number of birds flying across a gap, cannot be precisely counted. Therefore, count data are frequently presented as abundance class data, e.g., 0, 1-5, 6-10, 11-25, 26-50 etc. individuals. The simplest way to test for trends in abundance is to convert the data into presence/absence data and to proceed as explained for this type of data. However, this approach loses information. Alternatively, one may use the midpoint of each abundance class as the count. Care is required because such data may violate assumptions of parametric regression methods, especially if the width of the classes is not constant or cannot be made constant by an appropriate transformation, e.g. a logarithmic one. If assumptions of parametric methods cannot be met, non-parametric methods should be used.
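The midpoint conversion mentioned above can be sketched as follows, using the example classes from the text. The dictionary and function names are hypothetical. An open-ended top class (e.g. "more than 50") has no midpoint and would need a separate decision.

```python
def class_midpoint(lower, upper):
    """Midpoint of an abundance class with inclusive bounds."""
    return (lower + upper) / 2


# Example classes from the text; the label strings are an arbitrary choice.
ABUNDANCE_CLASSES = {
    "0": (0, 0),
    "1-5": (1, 5),
    "6-10": (6, 10),
    "11-25": (11, 25),
    "26-50": (26, 50),
}


def classes_to_counts(labels):
    """Replace each recorded class label by the midpoint of its class."""
    return [class_midpoint(*ABUNDANCE_CLASSES[label]) for label in labels]


# Hypothetical records for one site over five years.
recorded = ["26-50", "11-25", "11-25", "6-10", "1-5"]
print(classes_to_counts(recorded))
```

The class widths here (5, 5, 15, 25) are unequal, which is the situation the text warns about.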
See Royle (2004) for a recent Bayesian method to estimate abundance when abundance classes are of the form: none, some, many individuals. An online supplement of Royle & Link (2005) provides a program for such data that allows the inclusion of covariates that may influence detection probability, such as temperature.
Introduction. Besides non-parametric methods, a large number of parametric methods have been suggested for the analysis of trends in count data. If assumptions of parametric methods can be met, we recommend their use. These methods can be divided into those that do not consider serial autocorrelation and those that account for it. Serial autocorrelation of the data is the temporal dependency of consecutive counts. Although its presence is acknowledged, the majority of commonly applied methods do not consider it, especially when monitoring species across many populations or larger geographic space. As it is not yet sufficiently understood whether, and if so to what extent, serial correlation influences detection of trends and estimates of their strength, serial correlation should be considered whenever feasible, especially when analysing data from single (meta-)populations. In any case, the presence of serial autocorrelation should be tested whenever analysing and interpreting such monitoring data.
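One simple way to check for serial autocorrelation is to compute the lag-1 autocorrelation of a series and the Durbin-Watson statistic of regression residuals. The text does not prescribe a particular test, so both choices are assumptions. The function names and the residuals are hypothetical.

```python
def lag1_autocorrelation(series):
    """Lag-1 sample autocorrelation of a time-ordered series."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[t] - mean) * (series[t + 1] - mean) for t in range(n - 1))
    den = sum((value - mean) ** 2 for value in series)
    return num / den


def durbin_watson(residuals):
    """Durbin-Watson statistic of time-ordered regression residuals.

    Values near 2 suggest no first-order autocorrelation, values well
    below 2 suggest positive and values well above 2 negative autocorrelation.
    """
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(r ** 2 for r in residuals)
    return num / den


# Hypothetical residuals from a trend regression; they drift slowly,
# which is the pattern produced by positive serial correlation.
residuals = [1.2, 0.9, 0.7, 0.1, -0.4, -0.8, -1.0, -0.5, -0.2]
print(f"lag-1 autocorrelation: {lag1_autocorrelation(residuals):.2f}")
print(f"Durbin-Watson: {durbin_watson(residuals):.2f}")
```

Judging significance requires critical values of the Durbin-Watson statistic, which depend on series length and are not included in this sketch.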
Parametric methods - basic assumptions for count data. Here we discuss assumptions that are generally relevant for parametric methods. All require that the count data are linearly correlated with abundance and that the data are lognormally distributed, or that these assumptions can be met by appropriate transformation. The crucial assumption of linearity is often violated in count data because detection probability generally declines at low density (e.g. Henke 1998, Rodda et al. 2005), leading to a negative bias, i.e., indicating a decline even if there is none. Fortunately, this bias is in line with the philosophy of a cautionary approach in biodiversity conservation, in which management action is started rather too early or too often than too late or too seldom. At high abundance, detection probability often declines with abundance, which will cause a bias against detecting declines or increases.
Parametric methods - a brief overview. A large number of methods have been suggested and used for analysing count data. Standard linear regression models are among the easiest and most commonly used methods. They are appropriate if abundance is primarily driven by habitat availability and the expected amount of habitat conversion remains the same each year. Then, a linear relationship is expected up to the thresholds at which edge effects or fragmentation effects accelerate the effects of habitat loss (for fragmentation thresholds, see Andrén 1994). However, standard linear regression does not account for temporal autocorrelation of counts. Temporal autocorrelation will be present if, for example, a constant fraction of the habitat is lost each year or if the change in abundance depends on the abundance of the species in the previous year. Under temporal autocorrelation, standard linear regression models tend to overestimate trends and often indicate false declines. Notwithstanding, they have high power for detecting annual declines > 3%, but power is low for smaller declines (Wilson et al. 2011a).
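A standard linear regression of counts on year can be sketched as below. It returns the slope, its standard error and the t statistic with n - 2 degrees of freedom. The function name and the counts are hypothetical. The sketch assumes independent residuals, which is exactly the assumption the text cautions about.

```python
import math


def linear_trend(years, counts):
    """Ordinary least-squares regression of counts on year.

    Returns (slope, intercept, se_slope, t): the trend per year, the
    intercept, the standard error of the slope and the t statistic for
    the null hypothesis of zero slope (n - 2 degrees of freedom).
    Assumption: residuals are independent (no temporal autocorrelation).
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, counts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, counts))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    t = slope / se_slope if se_slope > 0 else math.copysign(math.inf, slope)
    return slope, intercept, se_slope, t


# Hypothetical yearly counts, for illustration only.
years = [2005, 2006, 2007, 2008, 2009, 2010]
counts = [120, 112, 115, 101, 97, 90]
slope, intercept, se_slope, t = linear_trend(years, counts)
print(f"trend: {slope:.2f} individuals per year (SE {se_slope:.2f}, t = {t:.2f})")
```

If the counts are log-transformed before fitting, a slope b corresponds to a proportional annual change of exp(b) - 1.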
A simple alternative used in the past (Caughley 1980) is to calculate the mean growth rate of the population (calculated as the mean of ln N_{t+1} - ln N_t, where N_t is the count in year t) and to test whether the mean differs significantly from zero. While this estimate of the mean growth rate is unbiased, variance estimates may be biased (Williams et al. 2002), and thus testing the hypothesis that the mean is significantly different from zero is not reliable.
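The mean growth rate described above can be computed as in this minimal sketch, which assumes positive counts taken at equal time intervals. The function name and the counts are hypothetical. The t statistic is included only to show the calculation; as noted in the text, the test is not reliable.

```python
import math
import statistics


def mean_log_growth(counts):
    """Mean of ln N(t+1) - ln N(t) with its standard error and t statistic.

    Assumptions: all counts are positive and equally spaced in time.
    """
    rates = [math.log(later) - math.log(earlier)
             for earlier, later in zip(counts, counts[1:])]
    mean_rate = statistics.mean(rates)
    se = statistics.stdev(rates) / math.sqrt(len(rates))
    t = mean_rate / se if se > 0 else math.copysign(math.inf, mean_rate)
    return mean_rate, se, t


# Hypothetical yearly counts, for illustration only.
counts = [100, 110, 99, 120]
mean_rate, se, t = mean_log_growth(counts)
print(f"mean growth rate: {mean_rate:.4f} per year (SE {se:.4f})")
```

Because the differences telescope, the mean depends only on the first and last count: it equals (ln N_last - ln N_first) divided by the number of intervals.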
More complex alternatives are generalized least squares linear regression models that account for autocorrelation of residuals. While conceptually preferable, they detected fewer declines and often did not provide a better fit than standard regression models for a set of bird data (Wilson et al. 2011b). Thus, their performance still needs further evaluation.
For the analysis of large, volunteer-based monitoring data (especially for birds and butterflies), a range of specific regression models, such as the chain index, the Mountford index, route regression, and log-linear Poisson regression models, have been used. They have the advantage that larger numbers of missing data can be accounted for (ter Braak et al. 1994). Route regression is a comparatively simple approach that is used, for example, in North American breeding bird surveys. It takes differences in between-year changes across sites into account and allows incorporation of covariates, such as observer effects. It has the advantage that trend estimates are unbiased in the presence of serial autocorrelation in count data. However, the confidence interval of trends and tests of significant trends may be biased (Dennis & Taper 1994). A recent alternative is the use of log-linear Poisson regression models. They are used, for example, by the Dutch common bird monitoring (van Strien et al. 2004). They can deal directly with missing values, even if they occur frequently, a common situation in larger monitoring schemes. They also allow modelling differences in detection probability among habitats, sites, or observers (but they do not allow estimation of detection probability). Open access specialized software (TRIM) is available for log-linear Poisson regression models, with recent versions accounting for temporal autocorrelation of count data by assuming autocorrelation of residuals (van Strien et al. 2004).
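To show how a log-linear Poisson model with site and year effects copes with missing counts, the following sketch fits the model "expected count = site effect x year effect" by alternating updates of the two sets of effects. Each update solves the Poisson likelihood equation for one set given the other. This is an illustrative sketch only and not the algorithm used in TRIM: it ignores overdispersion and serial correlation, and it assumes every site and every year has at least one positive count. The function name and the data are hypothetical.

```python
def fit_site_year_effects(counts, n_iter=500):
    """Fit expected count = site[i] * year[j] to a site-by-year table.

    counts[i][j] is the count at site i in year j, or None if missing.
    Returns the year effects scaled so that the first year equals 1
    (a yearly abundance index).
    """
    n_sites = len(counts)
    n_years = len(counts[0])
    site = [1.0] * n_sites
    year = [1.0] * n_years
    for _ in range(n_iter):
        # Site effect: observed total / sum of year effects in surveyed years.
        for i in range(n_sites):
            seen = [j for j in range(n_years) if counts[i][j] is not None]
            site[i] = sum(counts[i][j] for j in seen) / sum(year[j] for j in seen)
        # Year effect: observed total / sum of site effects at surveyed sites.
        for j in range(n_years):
            seen = [i for i in range(n_sites) if counts[i][j] is not None]
            year[j] = sum(counts[i][j] for i in seen) / sum(site[i] for i in seen)
    return [effect / year[0] for effect in year]


# Hypothetical counts for three sites over four years; None marks a missing survey.
counts = [
    [10, None, 8, 6],
    [20, 18, 16, None],
    [None, 36, 32, 24],
]
index = fit_site_year_effects(counts)
print([round(value, 3) for value in index])
```

In this constructed example the counts are exactly multiplicative, so the index recovers the underlying yearly pattern despite the three missing cells. A naive comparison of yearly totals would be distorted by which sites happened to be surveyed.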
Another alternative is time series analysis based on population dynamics (Dennis et al. 1991, 2006, Dennis & Taper 1994, Kery et al. 2009) and the ARIMA methodology developed by Box & Jenkins (1976), especially when dealing with single populations. However, monitoring data are rarely of sufficient length to allow the use of the full power of ARIMA. Additionally, ARIMA is a complex technique that is not easy to use. It requires a great deal of experience, and, although it often produces satisfactory results, those results depend on the researcher's level of experience (Bails & Peppers 1982).
The time series model of Dennis & Taper (1994) is better able to handle shorter time series and situations in which observations are made at unequal time intervals. Its main drawback is that count data must be free of measurement error. Recent extensions, called state-space models, remedy this drawback by combining the process model with a model for measurement error (Dennis et al. 2006, Kery et al. 2009). The model of Dennis et al. (2006) has the advantage that it incorporates density dependence mechanistically in the model structure, but it does not explicitly model detection probability and thus may be sensitive to systematic variation in detection probability. Dennis et al. (2006) provide instructions on how to use their model for trend estimation using a repeated-measures analysis of variance model with a random time effect in SAS. The model of Kery et al. (2009), in contrast, explicitly accounts for detection probability but does not incorporate density dependence. It further requires that repeated counts have been made during which the populations remained closed (to allow estimation of detection probability). Kery et al. (2009) provide code to conduct the analyses within the Bayesian WinBUGS software.
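The idea of combining a process model with a measurement-error model can be illustrated by simulating data from a Gompertz-type state-space model of the kind used by Dennis et al. (2006), as we understand it: on the log scale the true abundance follows X_t = a + c X_{t-1} + process noise, and the observed value is Y_t = X_t + observation error. The sketch only simulates data and does not estimate parameters. The function name and all parameter values are hypothetical.

```python
import random


def simulate_gompertz_state_space(a, c, sigma, tau, x0, n_steps, seed=1):
    """Simulate log abundance and log counts from a Gompertz state-space model.

    Process model:      X_t = a + c * X_(t-1) + E_t,  E_t ~ Normal(0, sigma)
    Observation model:  Y_t = X_t + F_t,              F_t ~ Normal(0, tau)
    sigma and tau are standard deviations. With |c| < 1 the process
    fluctuates around the equilibrium a / (1 - c) on the log scale.
    """
    rng = random.Random(seed)
    x = x0
    states, observations = [], []
    for _ in range(n_steps):
        x = a + c * x + rng.gauss(0.0, sigma)   # true (unobserved) log abundance
        y = x + rng.gauss(0.0, tau)             # log count with measurement error
        states.append(x)
        observations.append(y)
    return states, observations


# Hypothetical parameter values, for illustration only.
states, observations = simulate_gompertz_state_space(
    a=0.5, c=0.9, sigma=0.1, tau=0.2, x0=4.0, n_steps=30)
print([round(y, 2) for y in observations[:5]])
```

Simulated series like these can be used to check whether a chosen trend method recovers a known pattern before it is applied to real monitoring data.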
Like standard linear regression models, state-space models tend to overestimate declines but rarely indicate false declines. However, their power to detect declines is low (much lower than for standard linear regression models) unless annual declines are high (>10%) (Wilson et al. 2011a); moreover, estimates of one of the two noise parameters often are zero, which is a biologically unlikely case. This, however, can be dealt with by constraining parameters (Wilson et al. 2011b). Also, they do not explicitly model the detection process and thus may be sensitive to systematic variation in detection probability (Kery et al. 2009).
Count data - which method to use? If a rapid and easy-to-use method is required and a tendency to detect false declines is not a worry, we suggest using standard linear regression. Standard linear regression models are also appropriate if abundance is determined primarily by extrinsic factors whose change does not depend on the previous year, such as a constant amount of habitat conversion. If one wants to avoid a tendency to detect false trends and power is not an issue, non-parametric methods are a simple alternative. If single (meta-)populations are monitored with thoroughly standardized survey methods, density dependence is likely, and time series are long (30 time steps with observations), the state-space models of Dennis et al. (2006) or ARIMA are suitable alternatives. If variation in detection probability rather than density dependence is the main worry, the method of Kery et al. (2009) may be used instead. Because state-space models have low power, we advise using standard linear regression models in parallel. For long time series with few missing values, ARIMA is the most flexible option. For large-scale monitoring data, route regression or log-linear Poisson regression models are promising. The open access software TRIM facilitates the use of the latter.
In summary, linear modelling, with appropriate transformation of data, link-function and parameterization of sites and year effects, intrinsically accounts for the main problems faced when analysing temporal series of monitoring data: heterogeneity among sites, among observers, through time and in precision. Three basic properties are worth highlighting. First, the computation of temporal trends does not require complete time series; missing counts are accounted for. Second, including site effects in the statistical model largely accounts for differences in detection ability among observers and habitats. For this reason, it is strongly recommended that each site is monitored by the same observer, as long as he/she is involved in the monitoring scheme. Third, the trend in abundance is not the simple difference with the first or the last year of the monitoring, but instead data of all years equally contribute to the trend.
Absolute abundance (population size): distance methods & CMR methods
It is rarely possible to count all individuals of a population or of a particular area. However, some monitoring schemes may use statistical methods for estimating absolute abundance. Point count and linear transect methods as well as capture-mark-recapture (CMR) methods are commonly used for estimating population size. Estimating population size is recommended when monitoring single species across a limited area and when detection probability is likely to vary. Across large areas such estimates may not be logistically feasible, except with the double-observer approach, which is not yet widely used.
In point count surveys, individuals are counted either in a fixed or an open radius around selected sampling points. Similarly, in line transect sampling individuals are observed along transect lines. Point count data are collected frequently for birds. Linear transect methods are used especially for mobile species: for birds and, in particular, for butterflies and whales. The advantage of both methods is that individuals do not need to be individually recognizable. CMR methods are applied to a wide range of different organisms. They are suitable when the surveyed area has a limited extent, when specimens can be individually recognized by natural marks or by marking them, and when detection probability is not too low (above 20%; >10% for large populations).
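The principle behind CMR estimates can be shown with the simplest case, a two-sample closed-population estimate using Chapman's bias-corrected form of the Lincoln-Petersen estimator. This is a swapped-in textbook estimator for illustration, not one of the models implemented in specialized CMR software. It assumes a closed population, no loss of marks and equal capture probability for all individuals. The function name and the numbers are hypothetical.

```python
import math


def chapman_estimate(n_marked_first, n_caught_second, n_recaptured):
    """Chapman's bias-corrected Lincoln-Petersen estimate of population size.

    n_marked_first:  individuals caught and marked in the first session
    n_caught_second: individuals caught in the second session
    n_recaptured:    marked individuals among those caught in the second session
    Returns (estimate, standard_error).
    """
    n1, n2, m = n_marked_first, n_caught_second, n_recaptured
    estimate = (n1 + 1) * (n2 + 1) / (m + 1) - 1
    variance = ((n1 + 1) * (n2 + 1) * (n1 - m) * (n2 - m)
                / ((m + 1) ** 2 * (m + 2)))
    return estimate, math.sqrt(variance)


# Hypothetical survey: 50 marked, 40 caught later, 10 of them already marked.
estimate, se = chapman_estimate(50, 40, 10)
print(f"estimated population size: {estimate:.1f} (SE {se:.1f})")
```

Yearly estimates obtained in this way can then be analysed for trends with the methods recommended for count data.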
Line transect and point count methods make the critical assumption that all individuals on the transect line or at the centre of the circle, respectively, are observed, an assumption that is frequently violated in natural populations. Therefore, these estimates are better regarded as counts of relative abundance, requiring only the less restrictive but nevertheless crucial assumption that detection probability is constant across transects, sampling points, and time. When carefully executed, the double-observer approach (Nichols et al. 2000) may be used to estimate detection probability and correct counts of point count and line transect surveys.
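How perpendicular distances translate into a density estimate can be sketched for the simplest line transect case. The sketch assumes a half-normal detection function, untruncated and exactly measured distances, and certain detection on the line itself, which is the critical assumption discussed above. It is a textbook simplification rather than the full range of models offered by specialized software. The function name and the data are hypothetical.

```python
import math


def half_normal_line_transect(distances, transect_length):
    """Density estimate from line transect data with a half-normal detection function.

    distances:       perpendicular distances of detected individuals from the line
    transect_length: total length of transects, in the same unit as the distances
    Returns (density, effective_half_width): individuals per unit area and the
    effective strip half-width.
    Assumptions: detection on the line is certain; distances are not truncated.
    """
    n = len(distances)
    # Maximum-likelihood estimate of the half-normal scale parameter.
    sigma_squared = sum(d ** 2 for d in distances) / n
    # Effective strip half-width = integral of exp(-x^2 / (2 sigma^2)) from 0 to infinity.
    effective_half_width = math.sqrt(math.pi * sigma_squared / 2.0)
    density = n / (2.0 * transect_length * effective_half_width)
    return density, effective_half_width


# Hypothetical distances (m) of 12 detections along 2000 m of transect.
distances = [2.0, 5.5, 1.0, 12.0, 8.0, 3.5, 20.0, 0.5, 6.0, 15.0, 4.0, 9.5]
density, half_width = half_normal_line_transect(distances, 2000.0)
print(f"effective half-width: {half_width:.1f} m, "
      f"density: {density * 10000:.1f} individuals per ha")
```

If individuals on the line are missed, the estimate is biased low, which is why the text suggests treating such estimates as relative abundance unless detection is checked.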
Sophisticated statistical methods and specialized software (program DISTANCE; program MARK) exist for the estimation of population size from point count and linear transect data as well as for CMR data (Buckland et al. 2004, Williams et al. 2002). For CMR data, we recommend using program CAPTURE, which is implemented in MARK, as it has a good theoretical foundation and often performs well. Open population models in MARK also allow an estimation of population size. The recruitment model of Pradel (1996) that is implemented in MARK has the advantage that it allows a direct estimation of the population growth rate λ, but it requires that marked and unmarked individuals have identical parameter values, i.e., the same demographic parameters and detection probability. The latter is violated if detection probability changes as a result of previous observation or if detection probability differs among individuals. If these assumptions are violated and the model cannot be used, one should estimate population size with alternative models and use the same methods to test for trends as recommended for count data. Both MARK and DISTANCE are accompanied by considerable documentation for users.
- Andrén H (1994): Effects of habitat fragmentation on birds and mammals in landscapes with different proportions of suitable habitat: A review. OIKOS 71: 355-366
- Bails DG, Peppers LC (1982): Business Fluctuations. Forecasting Techniques and Applications. Englewood, Prentice-Hall.
- Box GEP, Jenkins GM (1976): Time Series Analysis: Forecasting and Control. Holden-Day, San Francisco.
- Buckland ST, Anderson DR, Burnham KP, Laake JL, Borchers DL, Thomas L (2004): Advanced Distance Sampling. Estimating Abundance of Biological Populations. Oxford University Press, Oxford.
- Caughley G (1980): Analysis of Vertebrate Populations. Wiley, New York.
- Dennis B, Taper ML (1994): Density dependence in time series observations of natural populations: estimation and testing. Ecological Monographs 64(2): 205-224.
- Dennis B, Munholland PL, Scott JM (1991): Estimation of growth and extinction parameters for endangered species. Ecological Monographs 61(2): 115-143.
- Dennis B, Ponciano JM, Lele SR, Taper ML, Staples DF (2006): Estimating density dependence, process noise, and observation error. Ecological Monographs 76(3): 323-341.
- Henke SE (1998): The effect of multiple search items and item abundance on the efficiency of human searchers. Journal of Herpetology 32: 112-115.
- Henle K, Sarre S, Wiegand K (2004): The role of density regulation in extinction processes and population viability analysis. Biodiversity and Conservation 13: 9-52.
- Kery M, Dorazio RM, Soldaat L, van Strien A, Zuiderwijk A, Royle JA (2009): Trend estimation in populations with imperfect detection. Journal of Applied Ecology 46: 1163-1172.
- Nichols JD, Hines JE, Sauer JR, Fallon FW, Fallon JE, Heglund PJ (2000): A double-observer approach for estimating detection probability and abundance from point counts. The Auk 117: 393-408.
- Pradel R (1996): Utilization of capture-mark-recapture for the study of recruitment and population growth rate. Biometrics 52: 703-706.
- Rodda GH, Campbell EW, Fritts TH, Clark CS (2005): The predictive power of visual searching. Herpetological Review 36: 259-264.
- Royle JA (2004): Modeling abundance index data from anuran calling surveys. Conservation Biology 18: 1378-1385.
- Royle JA, Link WA (2005): A general class of multinomial mixture models for anuran calling survey data. Ecology 86(9): 2505-2512.
- Strayer DL (1999): Statistical power of presence-absence data to detect population declines. Conservation Biology 13: 1034-1038.
- Ter Braak CJF, van Strien AJ, Meijer R, Verstrael TJ (1994): Analysis of monitoring data with many missing values: which method? Pp. 663-673. In: Hagemeijer EJM, Verstrael TJ (eds.): Bird Numbers 1992. Distribution, Monitoring and Ecological Aspects. Statistics Netherlands, Voorburg/Heerlen & SOVON, Beek-Ubbergen.
- Van Strien A, Pannekoek J, Hagemeijer W, Verstrael T (2004): A loglinear Poisson regression method to analyse bird monitoring data. Bird Census News 13: 33-39.
- Williams BK, Nichols JD, Conroy MJ (2002): Analysis and Management of Animal Populations. Academic Press, San Diego.
- Wilson HB, Kendall BE, Possingham HP (2011a): Variability in population abundance and the classification of extinction risk. Conservation Biology 25: 747-757.
- Wilson HB, Kendall BE, Fuller RA, Milton DA, Possingham HP (2011b): Analyzing variability and the rate of decline of migratory shore birds in Moreton Bay, Australia. Conservation Biology 25: 758-766.
EuMon core team; July 2013