Large-scale monitoring data frequently contain many missing values because not all sites are monitored in all years. Furthermore, such data are usually not normally distributed, and transforming them is often unsatisfactory. Transformation can be avoided by using log-linear regression, which assumes that the counts follow independent Poisson distributions. Log-linear Poisson regression models have become a common approach for analysing large-scale monitoring data. With appropriate weights they can also correct for the oversampling of some types of areas and the undersampling of others, which happens when volunteers prefer to count sites in more attractive areas. Recent advances can also deal with serial autocorrelation. Specialized software for analysing such data is available.
Assumptions. Log-linear Poisson regression is a form of generalized linear modelling (McCullagh & Nelder 1989) that tests for a linear relationship between the logarithm of the expected counts and time. The approach assumes that the counts follow a Poisson distribution, in which the variance equals the mean, but recent advances implemented in the software TRIM can also deal with overdispersion, where the variance is proportional to but larger than the mean. The approach further assumes that counts are proportional to absolute abundance. Thus, detection probability must remain constant over time, or counts must be adjusted for any change in detection probability over time. It is therefore advisable that each site is always monitored by the same observer. Differences in detectability among observers or sites (habitats) can be dealt with as long as detection probability remains constant over time. The basic model also assumes that changes over years are the same across sites, but this assumption can be relaxed by appropriate analysis or model adjustment. Log-linear Poisson regression further assumes that counts are independent, i.e., that there is no serial correlation, an assumption frequently violated in monitoring data. A recent approach to this problem is to model autocorrelation of the residuals, as implemented in the software TRIM. If serial correlation is a main concern and time series are long, formal time series analysis with autoregressive integrated moving average (ARIMA) models may be a better alternative, but it requires considerable skill from the user. For the application of ARIMA, consult Box & Jenkins (1976) and the manuals of major statistical software packages.
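Two of these assumptions can be checked with simple diagnostics once a model has been fitted. The following is a minimal, generic sketch and not the procedure implemented in TRIM. It computes the Pearson dispersion statistic (values well above 1 suggest overdispersion relative to the Poisson assumption) and the lag-1 autocorrelation of the residuals of one site's time series. The function names and the example numbers are illustrative assumptions.

```python
import numpy as np

def pearson_dispersion(observed, expected, n_params):
    """Pearson chi-square divided by the residual degrees of freedom."""
    observed = np.asarray(observed, dtype=float)
    expected = np.asarray(expected, dtype=float)
    chi2 = np.sum((observed - expected) ** 2 / expected)
    return chi2 / (observed.size - n_params)

def lag1_autocorrelation(residuals):
    """Correlation between successive residuals of one time series."""
    r = np.asarray(residuals, dtype=float)
    r = r - r.mean()
    return np.sum(r[1:] * r[:-1]) / np.sum(r ** 2)

# Hypothetical counts and fitted values for one site over six years
counts = np.array([12, 30, 5, 41, 9, 28])
fitted = np.array([18.0, 19.0, 20.0, 21.0, 22.0, 23.0])
pearson_residuals = (counts - fitted) / np.sqrt(fitted)

print("dispersion:", pearson_dispersion(counts, fitted, n_params=2))
print("lag-1 autocorrelation:", lag1_autocorrelation(pearson_residuals))
```

In practice the residuals would come from the fitted site and year effects, and the autocorrelation would be pooled over sites.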
Analysis. A trend model expressed as a loglinear model has the following form (ter Braak et al. 1994):
$$\log(\text{expected count}_{ij}) = \log(\text{site effect}_i) + \log(b_j) \cdot \text{year}_j$$
The subscripts i and j denote the site and the year of observation, respectively, and $b_1 = 1$. For year j the expected count is a factor $b_j$ times the expected count for the first year. Since the parameters $b_j$ are independent of i, this basic model implies that the yearly changes are the same for each site. This assumption can be relaxed by using some properties of the sites as covariates for the year effects. For example, if sites can be classified according to habitat type (e.g. woodland, farmland, wetland), the model could be applied separately for each habitat type.
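Following the description above, the basic model can be written on the count scale as expected count = a_i * b_j, where a_i stands for the site effect. The sketch below fits this model to a sites-by-years table with missing cells by alternately solving the Poisson likelihood equations for the site and year effects. It is a simple stand-in chosen for readability, not the algorithm used by TRIM. The function name `fit_site_year_model`, the symbol a_i and the example data are assumptions, and every site and every year is assumed to have a positive total of observed counts.

```python
import numpy as np

def fit_site_year_model(counts, n_iter=500, tol=1e-10):
    """Fit expected count_ij = a_i * b_j to a sites x years array.

    Missing counts are given as np.nan and are left out of all sums.
    Returns (a, b), scaled so that b[0] equals 1.
    """
    y = np.asarray(counts, dtype=float)
    observed = ~np.isnan(y)
    y0 = np.where(observed, y, 0.0)  # missing cells contribute nothing

    a = np.ones(y.shape[0])
    b = np.ones(y.shape[1])
    for _ in range(n_iter):
        # Likelihood equations: observed totals equal fitted totals,
        # summed over the observed cells of each site and each year.
        a_new = y0.sum(axis=1) / (observed * b).sum(axis=1)
        b_new = y0.sum(axis=0) / (observed * a_new[:, None]).sum(axis=0)
        change = max(np.max(np.abs(a_new - a)), np.max(np.abs(b_new - b)))
        a, b = a_new, b_new
        if change < tol:
            break
    # Rescale so that the first year is the reference (b_1 = 1)
    return a * b[0], b / b[0]

# Hypothetical counts for three sites and four years; nan = not monitored
counts = np.array([
    [12.0, 15.0, np.nan, 20.0],
    [30.0, np.nan, 33.0, 41.0],
    [np.nan, 6.0, 5.0, 9.0],
])
site_effects, year_indices = fit_site_year_model(counts)
expected = np.outer(site_effects, year_indices)
completed = np.where(np.isnan(counts), expected, counts)
print(np.round(year_indices, 3))
print(np.round(completed, 1))
```

The last lines show how counts for unmonitored site-year combinations can be predicted from the estimated site and year effects.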
Note that using the logarithm of the expected counts differs from the more traditional approach of taking the logarithm of the counts themselves. It can result in much better fitting models when there are many zero counts. Fitting Poisson regression models requires numerical methods. Poisson regression models are available in large software packages such as SPSS and GLIM, and in the free software R, but their algorithms may be slow for large data sets. Statistics Netherlands has developed freely available software for the analysis of monitoring data with log-linear Poisson regression models, called TRIM (Pannekoek & van Strien 2001). TRIM uses an iteratively reweighted least-squares algorithm to estimate the model and predicts the missing counts from the estimated site and year effects.
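To make the contrast concrete: in a Poisson regression the logarithm is applied to the expected count, so zero counts need no special treatment, whereas the logarithm of a zero count is undefined. The sketch below implements a generic iteratively reweighted least-squares fit of a log-linear Poisson model in NumPy. It illustrates the general technique only and is not TRIM's implementation; the function name `poisson_irls` and the example data are assumptions.

```python
import numpy as np

def poisson_irls(X, y, n_iter=50, tol=1e-10):
    """Fit log(E[y]) = X @ beta by iteratively reweighted least squares.

    Assumes the mean of y is positive; individual zero counts are fine.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    mu = np.full(y.shape, y.mean())      # start from the overall mean
    eta = np.log(mu)
    for _ in range(n_iter):
        z = eta + (y - mu) / mu          # working response (log link)
        w = mu                           # working weights (Poisson variance)
        XtW = X.T * w
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        eta = X @ beta_new
        mu = np.exp(eta)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

# Hypothetical counts at one site over eight years, including zeros
years = np.arange(8)
counts = np.array([3, 0, 2, 1, 0, 1, 0, 0])
X = np.column_stack([np.ones(len(years)), years])
beta = poisson_irls(X, counts)
print("estimated yearly multiplicative change:", np.exp(beta[1]))
```

A design matrix with one dummy column per site and per year would give the site and year effects of the model above; the single slope used here keeps the example short.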
A recent improvement that accounts for imperfect detection has been suggested by Kéry et al. (2009). They model the observation process as a binomial process, separately from the Poisson-distributed process describing the trend in true abundance, and couple the two generalized linear models. They provide code for estimating trends with the WinBUGS software, which uses a Bayesian framework for parameter estimation.
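Why a separate observation model matters can be seen from a small calculation: if each individual is detected independently with probability p, the expected count is p times the true abundance, so a change in p over time cannot be distinguished from a change in abundance in the raw counts. The sketch below illustrates this and simulates one data set from a Poisson abundance process with binomial detection. It is not the WinBUGS code of Kéry et al. (2009); all names and numbers are hypothetical.

```python
import numpy as np

def apparent_index(true_abundance, detection_prob):
    """Expected counts relative to the first year when each individual
    is detected independently with the given probability."""
    abundance = np.asarray(true_abundance, dtype=float)
    p = np.asarray(detection_prob, dtype=float)
    expected_counts = abundance * p
    return expected_counts / expected_counts[0]

years = np.arange(5)
abundance = 100 * 1.05 ** years          # true abundance grows 5% per year
true_index = abundance / abundance[0]

constant_p = np.full(5, 0.6)             # detection constant over time
declining_p = np.linspace(0.6, 0.4, 5)   # detection declines over time

print("true index:          ", np.round(true_index, 3))
print("index, constant p:   ", np.round(apparent_index(abundance, constant_p), 3))
print("index, declining p:  ", np.round(apparent_index(abundance, declining_p), 3))

# One simulated data set: Poisson abundance, binomial detection
rng = np.random.default_rng(1)
true_n = rng.poisson(abundance)
counts = rng.binomial(true_n, declining_p)
print("simulated counts:    ", counts)
```

With constant detection the index based on counts matches the true index, whereas with declining detection the counts suggest a decline although true abundance increases.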
- Box GEP, Jenkins GM (1976): Time Series Analysis: Forecasting and Control. Holden-Day, San Francisco.
- Kéry M, Dorazio RM, Soldaat L, van Strien A, Zuiderwijk A, Royle JA (2009): Trend estimation in populations with imperfect detection. J. Appl. Ecol. 46: 1163-1172.
- Pannekoek J, van Strien AJ (2001): TRIM 3 Manual (TRends & Indices for Monitoring data). CBS, Voorburg, The Netherlands.
- Sokal RR, Rohlf FJ (1981): Biometry. Freeman, New York.
- Ter Braak CJF, van Strien AJ, Meijer R, Verstrael TJ (1994): Analysis of monitoring data with many missing values: which method? Pp. 663-673. In: Hagemeijer EJM, Verstrael TJ (eds.): Bird Numbers 1992. Distribution, Monitoring and Ecological Aspects. Statistics Netherlands, Voorburg/Heerlen & SOVON, Beek-Ubbergen.
- Van Strien A, Pannekoek J, Hagemeijer W, Verstrael T (2004): A loglinear Poisson regression method to analyse bird monitoring data. Bird Census News 13: 33-39.
EuMon core team; July 2011