For applied biodiversity monitoring, the risk of incorrectly concluding that there is no trend is often more critical than the opposite risk, e.g., when a threatened species is declining and action is forgone because the trend is not statistically significant. This risk is related to the power of an analysis. Statistical power is a function of measurement precision, annual background variation in the variable of interest, and effect size (the strength of the trend one wishes to detect), but it also depends on the method used for analysis. In fact, the target of detecting a 1% annual decline adopted by various policies (e.g., the EU Habitats Directive) is very difficult to achieve and may require many years of monitoring. Module 2 helps to identify suitable methods for power analysis of monitoring data. On this page we provide general background information on power analyses and explain how DaEuMon can be used to evaluate the precision and power of the monitoring schemes contained in the database.
Introduction. A central criterion for the evaluation of a monitoring scheme is whether it can statistically detect a change of a certain size in the population or distribution of species or habitats. The probability of detecting a trend or change of a certain size is called statistical power. Statistical power is a function of measurement precision, annual background variation in the variable of interest, and effect size (the strength of the trend or the size of the change one wishes to detect).
Measurement precision (or 'error', in analogy with experimental design theory) is a composite measure of several features of the monitoring system. Measurement precision is the most important measure of scientific quality. Besides the precision of the survey method used, the following design variables contribute to good measurement precision: a high number of sites monitored per year, a high number of replicates per site, and repeated surveys within a year (e.g. Wilson et al. 2011b). The nature of the data collected (e.g. presence/absence or count data) also influences measurement precision. From 1386 time series of annual counts without missing values, each lasting at least 15 years and extracted from the Global Population Dynamics Database, Wilson et al. (2011a) estimated that the median measurement error variance was 0.03, with 25th and 75th percentiles of 0.002 and 0.1, respectively. Using only bird data, similar values were obtained. Unfortunately, no similar data are available for presence/absence data, capture-mark-recapture data, or habitat monitoring.
The year-to-year background variation of the variable of interest (e.g. distribution, population size), called process error in statistical terms, will differ among species, habitats, and locations, and thus informed guesses are needed. For single populations of species, Gibbs et al. (1998) collated variability estimates (natural fluctuations plus measurement error) for a large number of vertebrates, plants, and arthropods (also available on the internet). From 1386 time series of annual counts without missing values, each lasting at least 15 years and extracted from the Global Population Dynamics Database, Wilson et al. (2011a) estimated that the median process error variance was 0.03, with 25th and 75th percentiles of 0.04 and 0.23, respectively. Using only bird data, similar values were obtained. Unfortunately, no similar data are available for presence/absence data, capture-mark-recapture data, or habitat monitoring.
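The difference between the two error sources can be made concrete with a small simulation. The sketch below is not taken from the cited studies: it assumes a simple state-space model on the log scale in which process error is carried over from year to year while measurement error is not. The function names, the starting abundance of 1000, and the 15-year series length are illustrative assumptions; the two variances of 0.03 are the median values quoted above.

```python
import math
import random

def simulate_log_counts(n_years, annual_change, process_var, measurement_var, rng):
    """Simulate observed log abundance under an assumed simple state-space model."""
    drift = math.log(1.0 + annual_change)  # e.g. -0.01 means a 1% decline per year
    true_log_n = math.log(1000.0)          # arbitrary starting abundance (assumption)
    observed = []
    for _ in range(n_years):
        # Measurement error: survey noise, not carried over to the next year.
        observed.append(true_log_n + rng.gauss(0.0, math.sqrt(measurement_var)))
        # Process error: real year-to-year fluctuation, carried into the next year.
        true_log_n += drift + rng.gauss(0.0, math.sqrt(process_var))
    return observed

def ols_slope(y):
    """Ordinary least-squares slope of y against year index 0, 1, 2, ..."""
    n = len(y)
    x_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (yi - y_mean) for x, yi in enumerate(y))
    return sxy / sxx

rng = random.Random(1)  # fixed seed so the run is repeatable
series = simulate_log_counts(15, -0.01, 0.03, 0.03, rng)
print("true log-scale trend:", round(math.log(0.99), 4))
print("estimated trend     :", round(ols_slope(series), 4))
```

Running the simulation with different seeds shows how far the estimated trend of a single 15-year series can stray from the true 1% decline when both variances are of this order.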
Effect size is the strength of the trend or the extent of change one wishes to detect as significant. Effect size is often an arbitrary value obtained from informed guesses. For monitoring schemes, the effect size is likely to be influenced by what is considered an alarming change in the population or distribution of certain species or habitats.
If the three parameters (measurement precision or "error", background variability, and effect size) can be quantified, then the statistical power to detect a trend can be determined. Gerrodette (1987) provided a method to calculate power when trends are linear; software implementing the method is freely available on the internet. The precision of the power estimate will depend largely on measurement precision, as this is the component most strongly related to the features of the monitoring scheme. The estimated statistical power can then be evaluated against a set of criteria developed to judge efficiency over a range of monitoring objects in a range of countries, regions, etc.
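How the three parameters combine can be illustrated with a rough calculation. The sketch below is not Gerrodette's (1987) method; it is a generic normal approximation for the slope of a linear regression of log abundance on year, assuming one equally spaced survey per year at a single site and treating measurement and process variance together as independent residual noise. The function names and the 80% power target are illustrative assumptions. Because it uses the normal rather than the t distribution, it will be somewhat optimistic for short series.

```python
import math
from statistics import NormalDist

def trend_power(n_years, annual_change, residual_var, alpha=0.05):
    """Approximate two-sided power to detect a log-linear trend (normal approximation)."""
    slope = math.log(1.0 + annual_change)            # trend on the log scale
    sxx = n_years * (n_years ** 2 - 1) / 12.0        # sum of squares of years 0..n-1
    se = math.sqrt(residual_var / sxx)               # standard error of the OLS slope
    std = NormalDist()
    z_crit = std.inv_cdf(1.0 - alpha / 2.0)
    ratio = abs(slope) / se                          # effect size in standard errors
    return std.cdf(ratio - z_crit) + std.cdf(-ratio - z_crit)

def years_needed(annual_change, residual_var, target_power=0.8, alpha=0.05, max_years=200):
    """Smallest number of annual surveys reaching the target power, or None."""
    for n in range(3, max_years + 1):
        if trend_power(n, annual_change, residual_var, alpha) >= target_power:
            return n
    return None

# Residual variance assumed to be measurement (0.03) plus process (0.03) variance.
for n in (10, 20, 30, 40):
    print(n, "years:", round(trend_power(n, -0.01, 0.06), 2))
print("years for 80% power:", years_needed(-0.01, 0.06))
```

Under these assumptions, power rises steeply with the number of years because the spread of the year values grows roughly with the cube of the series length, while reducing the residual variance (for example by surveying more sites) shortens the time needed.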
Policy example. The NATURA 2000 system and the IUCN system provide explicit sets of criteria to evaluate the conservation status of species. For example, the reporting guidelines prepared by the Scientific Working Group of the Habitats Committee of DG Environment of the European Commission suggest using an annual decline of 1% (6% during a 6-year reporting period) as a threshold value (effect size). If the decline is larger, the conservation status is 'Unfavourable - Bad'. Detecting declines of this size is highly ambitious and can be achieved only by schemes that have very high measurement precision and low background variability, and that run for at least a decade with annual surveys. In Module 3, we provide recommendations for the design of monitoring schemes to secure adequate power.
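The relation between the annual and the 6-year threshold can be checked with a short calculation. The snippet below assumes that the 1% decline compounds from year to year; the variable names are illustrative.

```python
annual_decline = 0.01   # 1% decline per year
years = 6               # length of one reporting period

# Compounded decline over the reporting period.
cumulative_decline = 1.0 - (1.0 - annual_decline) ** years
print(f"{cumulative_decline:.2%}")  # prints 5.85%
```

With compounding, the cumulative decline is slightly below the rounded 6% figure, which corresponds to simply adding six annual declines of 1%.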
- Elzinga CL, Salzer DW, Willoughby JW, Gibbs JP (2001): Monitoring plant and animal populations: a handbook for field biologists. Blackwell Science, Malden, MA.
- Gerrodette T (1987): A power analysis for detecting trends. Ecology 68: 1364-1372.
- Gibbs JP, Droege S, Eagle P (1998): Monitoring populations of plants and animals. BioScience 48: 935-939.
- Wilson HB, Kendall BE, Possingham HP (2011a): Variability in population abundance and the classification of extinction risk. Conserv. Biol. 25: 747-757.
- Wilson HB, Kendall BE, Fuller RA, Milton DA, Possingham HP (2011b): Analyzing variability and the rate of decline of migratory shore birds in Moreton Bay, Australia. Conserv. Biol. 25: 758-766.
EuMon core team; May 2013