EU-wide monitoring methods and systems of surveillance for species and habitats of Community interest
  A research project funded by the European Union 
  The EuMon integrated Biodiversity Monitoring & Assessment Tool
Estimating detection probability
To estimate detection probability, one needs to sample from a population of known size. In mark-recapture studies, this is achieved by individually marking animals and releasing them back into the population. Then the population is sampled again and the number of recaptured individuals is registered. Modern capture-mark-recapture models incorporate detection probability as a parameter that can be estimated from recapture data. Recently, similar models have been developed for estimating the number of occupied sites. More challenging is the estimation of detection probability in count surveys.
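The marking step can be illustrated with the classic two-sample case. The sketch below is a minimal example, not one of the modern capture-mark-recapture models mentioned above: it assumes a closed population, and it uses Chapman's bias-corrected form of the Lincoln-Petersen estimator for population size. The function name and the example numbers are hypothetical.

```python
def two_sample_estimates(marked, caught, recaptured):
    """Two-sample mark-recapture under a closed population (illustrative).

    marked     -- animals marked and released in the first session
    caught     -- all animals caught in the second session
    recaptured -- marked animals among those caught in the second session
    """
    if not 0 < recaptured <= min(marked, caught):
        raise ValueError("need 0 < recaptured <= min(marked, caught)")
    # The marked animals form a group of known size, so the share of them
    # that is found again estimates detection probability.
    detection_probability = recaptured / marked
    # Chapman's bias-corrected Lincoln-Petersen estimate of population size.
    population_size = (marked + 1) * (caught + 1) / (recaptured + 1) - 1
    return detection_probability, population_size


# Hypothetical numbers, for illustration only.
p_hat, n_hat = two_sample_estimates(marked=100, caught=80, recaptured=20)
print(f"detection probability: {p_hat:.2f}, population size: {n_hat:.1f}")
```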

Presence-absence data
Recently, several methods have been developed to estimate site occupancy rates and detection probability in presence-absence surveys. All of them require that sampling is repeated within a period short enough that the status of each site (occupied versus unoccupied) does not change. The most advanced methods are site occupancy models (MacKenzie et al. 2006), which allow estimation of the number of occupied sites, detection probability, and extinction and colonization rates. Heterogeneity in detection probability can be modelled with covariates, e.g. characteristics of sites, such as area or habitat type, or of survey conditions, such as temperature or cloud cover. While approaches exist that allow modelling detection probability as a function of abundance, these models need further evaluation.
It may not always be possible to sample all sites on all occasions within a particular monitoring period. Occupancy models can account for such missing values. Care must be taken that the missing sites are not biased in terms of their likelihood of becoming occupied or going extinct. See Missing data for such cases.
For most occupancy models, estimation has to be carried out numerically. The user-friendly software PRESENCE can be used for that purpose. Estimates may be strongly biased if detection probability is low (< 0.15) and the number of sampling occasions is small (< 7) (MacKenzie et al. 2002). See recommendations for designing presence/absence surveys.
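To show what numerical estimation means here, the following sketch fits the simplest single-season occupancy model, with one occupancy probability (psi) and one detection probability (p) shared by all sites and visits. It is not how PRESENCE works internally: the crude grid search, the function names, and the example data are assumptions made for illustration, and a real analysis would use a proper optimiser and allow covariates and missing visits.

```python
import math
from collections import Counter


def log_likelihood(psi, p, detection_counts, n_visits):
    """Log-likelihood of the constant-psi, constant-p occupancy model.

    detection_counts maps "number of visits with a detection" to the
    number of sites showing that count.
    """
    total = 0.0
    for d, n_sites in detection_counts.items():
        # Occupied site: d detections in n_visits visits (the binomial
        # coefficient does not depend on psi or p and is omitted).
        occupied = psi * p**d * (1 - p) ** (n_visits - d)
        # A site without any detection may also be truly unoccupied.
        unoccupied = (1 - psi) if d == 0 else 0.0
        total += n_sites * math.log(occupied + unoccupied)
    return total


def fit_occupancy(detections_per_site, n_visits, steps=200):
    """Grid search for the (psi, p) pair with the highest likelihood."""
    counts = Counter(detections_per_site)
    grid = [(i + 0.5) / steps for i in range(steps)]  # stays inside (0, 1)
    return max(
        ((psi, p) for psi in grid for p in grid),
        key=lambda pair: log_likelihood(pair[0], pair[1], counts, n_visits),
    )


# Hypothetical data: 20 sites, 4 visits each, detections per site.
data = [0] * 10 + [1] * 4 + [2] * 4 + [3] * 2
psi_hat, p_hat = fit_occupancy(data, n_visits=4)
naive = sum(1 for d in data if d > 0) / len(data)
print(f"naive occupancy {naive:.2f}, estimated psi {psi_hat:.3f}, p {p_hat:.3f}")
```

The estimated occupancy is expected to lie above the naive proportion of sites with at least one detection, because some occupied sites are missed on every visit.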
The estimated number of occupied sites divided by the total number of sites surveyed provides an estimate of the proportion of occupied sites, which can then be used in logit regression models to estimate trends in the rate of occupancy.
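As a rough illustration of a trend on the logit scale, the sketch below regresses the logit of yearly occupancy proportions on year with ordinary least squares. This is a simplification and an assumption on our part: a full logit regression would model the number of occupied sites as a binomial response and carry the uncertainty of each yearly estimate forward. The function names are hypothetical.

```python
import math


def logit(proportion):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(proportion / (1 - proportion))


def logit_trend(years, occupancy_rates):
    """Least-squares slope of logit(occupancy) against year."""
    y = [logit(rate) for rate in occupancy_rates]
    mean_x = sum(years) / len(years)
    mean_y = sum(y) / len(y)
    sxy = sum((x - mean_x) * (v - mean_y) for x, v in zip(years, y))
    sxx = sum((x - mean_x) ** 2 for x in years)
    return sxy / sxx


# Hypothetical yearly occupancy estimates; a negative slope is a decline.
years = [2001, 2002, 2003, 2004, 2005]
rates = [0.62, 0.58, 0.55, 0.49, 0.46]
print(f"change in log-odds of occupancy per year: {logit_trend(years, rates):.3f}")
```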

Count data
Three main methods have been suggested to estimate or account for detection probability in count data: one for point-count data only, and two for both point-count data and linear transect methods. The first method divides the count period into three or more intervals and records the interval in which each individual was first detected. It then uses a removal method to estimate detection probability. This can be done numerically with the software SURVIV (Farnsworth et al. 2002). In addition to the general assumptions of point counts (e.g., individuals do not move in and out of the area, are not double counted, and are not silenced by disturbance caused by the observer), a critical assumption is that individuals can be divided into two groups: one group that is detected with certainty in the first interval of the count period and a second group with lower detection probability. This assumption is likely to be violated to some degree. Simulation studies are still needed to assess the robustness of the estimators against violations of this assumption.
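The removal idea can be sketched with a deliberately simplified model. The code below is not the two-group model of Farnsworth et al. (2002) and not the SURVIV implementation: it assumes equal-length intervals and a single detection probability per interval that is shared by all individuals. Function names, the grid search, and the example counts are hypothetical.

```python
import math


def removal_log_likelihood(p, first_detections):
    """Conditional multinomial log-likelihood for equal-length intervals.

    first_detections[k] is the number of individuals first detected in
    interval k; p is the per-interval detection probability.
    """
    n_intervals = len(first_detections)
    detected_at_all = 1 - (1 - p) ** n_intervals
    total = 0.0
    for interval, count in enumerate(first_detections):
        # Missed in every earlier interval, then detected in this one,
        # conditional on being detected at some point during the count.
        cell = p * (1 - p) ** interval / detected_at_all
        total += count * math.log(cell)
    return total


def fit_removal(first_detections, steps=10000):
    """Grid-search estimate of p, count-level detectability and abundance."""
    grid = [i / steps for i in range(1, steps)]
    p_interval = max(grid, key=lambda p: removal_log_likelihood(p, first_detections))
    # Probability of being detected at least once during the whole count.
    p_count = 1 - (1 - p_interval) ** len(first_detections)
    abundance = sum(first_detections) / p_count
    return p_interval, p_count, abundance


# Hypothetical first detections in three consecutive intervals.
p_interval, p_count, abundance = fit_removal([40, 22, 9])
print(f"p per interval {p_interval:.3f}, p per count {p_count:.3f}, N {abundance:.1f}")
```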
A second approach models detectability mathematically as a function of distance from the transect line or from the centre of the point-count area. This approach is called the distance method, and special software, DISTANCE, is available for the analysis of such data (Buckland et al. 2004). It requires that the distance of each individual from the transect line or from the centre of the point-count area is measured or estimated. It further assumes that all individuals at the centre of the point-count area or on the transect line are detected with certainty, an assumption that is frequently violated. In cases where the assumption is not violated, complete counts may be feasible and should be considered instead of distance counts since they greatly facilitate data collection and analysis.
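A minimal sketch of the distance method for a line transect follows. It assumes a half-normal detection function, g(x) = exp(-x^2 / (2 sigma^2)), with no truncation of distances and certain detection on the line (g(0) = 1). This is only one of the detection functions that DISTANCE can fit, and the function names and data below are hypothetical.

```python
import math


def half_normal_line_transect(distances, transect_length):
    """Fit a half-normal detection function to perpendicular distances.

    Returns sigma, the effective strip half-width, and the density in
    individuals per squared distance unit.
    """
    n = len(distances)
    # Maximum-likelihood estimate of sigma for untruncated distances.
    sigma = math.sqrt(sum(x * x for x in distances) / n)
    # Integral of g(x) from 0 to infinity: the half-width of a strip in
    # which a complete count would yield the same expected number.
    effective_half_width = sigma * math.sqrt(math.pi / 2)
    density = n / (2 * effective_half_width * transect_length)
    return sigma, effective_half_width, density


def detectability(distance, sigma):
    """Probability of detecting an individual at a given distance."""
    return math.exp(-distance**2 / (2 * sigma**2))


# Hypothetical perpendicular distances (m) along a 1000 m transect.
distances = [2.0, 5.5, 1.0, 12.0, 7.5, 3.0, 0.5, 9.0, 4.0, 6.5]
sigma, half_width, density = half_normal_line_transect(distances, 1000.0)
print(f"sigma {sigma:.2f} m, effective half-width {half_width:.2f} m")
print(f"density {density * 10000:.1f} individuals per hectare")
```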
The third method is a double-observer approach (Cook & Jacobson 1979, Nichols et al. 2000). In this case, two observers count simultaneously at each selected point. One observer is designated as "primary" and the other as "secondary". The primary observer identifies all individuals detected and communicates them to the secondary observer. The secondary observer records these individuals but also surveys the area and additionally records individuals not detected by the primary observer. At the end of each count, the data are the number of individuals (1) detected by the primary observer and (2) missed by the primary observer but detected by the secondary observer. Each observer must serve at least once in each role, ideally alternating between primary and secondary observer. Then detection probability can be estimated as:

\[ \hat{p} = 1 - \frac{x_{12}\, x_{21}}{x_{22}\, x_{11}} \]

with $x_{ij}$ being the number of individuals counted by observer i (i = 1,2) on stops when observer j (j = 1,2) was the primary observer. The counts for the primary observer include all individuals detected by him/her, whereas the counts for the secondary observer include only individuals missed by the primary observer.
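The double-observer calculation can be sketched directly from the four counts. The code follows our reading of the estimators in Nichols et al. (2000), which result from equating each count to its expected value when detection probability is constant for each observer. The function name, argument names, and example counts are hypothetical.

```python
def double_observer_estimates(x11, x21, x22, x12):
    """Detection probabilities from double-observer counts.

    xij is the number of individuals counted by observer i on stops
    where observer j was the primary observer:
      x11 -- detected by observer 1 as primary
      x21 -- missed by primary observer 1, detected by observer 2
      x22 -- detected by observer 2 as primary
      x12 -- missed by primary observer 2, detected by observer 1
    """
    shared = x11 * x22 - x12 * x21
    if min(x11, x22) <= 0 or shared <= 0:
        raise ValueError("counts do not yield valid probabilities")
    p1 = shared / (x11 * x22 + x22 * x21)  # observer 1
    p2 = shared / (x11 * x22 + x11 * x12)  # observer 2
    # Probability that at least one of the two observers detects an
    # individual; equal to 1 - (1 - p1) * (1 - p2).
    p_overall = 1 - (x12 * x21) / (x22 * x11)
    return p1, p2, p_overall


# Hypothetical counts, for illustration only.
p1, p2, p = double_observer_estimates(x11=80, x21=10, x22=50, x12=40)
print(f"observer 1: {p1:.2f}, observer 2: {p2:.2f}, overall: {p:.2f}")
```

Dividing the total number of individuals recorded by the overall detection probability then gives an estimate of abundance for the surveyed points.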
Two assumptions are critical. First, detection probability must remain constant among sites. Therefore, stratification is required if differences in detection probability are suspected among different types of habitat. Second, it is essential that the detection of individuals by the primary observer is independent of detections by the secondary observer. Thus, the secondary observer must avoid anything that could provide cues that he/she detected an individual.
The approach can be extended to simultaneous counts of several species and to various model constraints, e.g. that detection probability within a species is the same for each observer (Nichols et al. 2000).

Mark-recapture data
Most modern mark-recapture methods incorporate detection probability directly in their model structure. As it is often regarded as a nuisance parameter, some methods and software may not provide estimates of it, but mean capture probability can be obtained by dividing the number of different individuals captured by the estimated population size. Also, while detection probability is important for the design of monitoring schemes, the estimated population size is the quantity needed to assess trends.
A large range of different statistical methods exist that allow capture probability to vary with time, among individuals, or as a behavioural response, as well as all combinations thereof (Chao et al. 2001, Williams et al. 2002, Baillargeon & Rivest 2007). Several methods allow modelling capture probability as a function of covariates.
Various software packages exist for the analysis of mark-recapture data. We particularly recommend the free software MARK as it has a very detailed user manual, incorporates a comparatively large number of different models, and is user-friendly. Log-linear models can be used with the Rcapture package of the free software R. This package includes some diagnostic tools that are not available in other software packages, but experience with the performance of the set of log-linear models is still more limited than for the methods incorporated in MARK. Coverage estimators, which are promising when detection probability is heterogeneous, are available in the software CARE-2.

Key references

  • Baillargeon S, Rivest LP (2007). Rcapture: Loglinear models for capture-recapture in R. J. Stat. Software 19(5): 1-31.
  • Buckland ST, Anderson DR, Burnham KP, Laake JL, Borchers DL, Thomas L (2004). Advanced Distance Sampling. Estimating Abundance of Biological Populations. Oxford University Press, Oxford.
  • Chao A, Yip PSF, Lee SM, Chu W (2001). Population size estimation based on estimating functions for closed capture-recapture models. J. Stat. Planning Inference 92: 213-232.
  • Cook RD, Jacobson JO (1979). A design for estimating visibility bias in aerial surveys. Biometrics 35: 735-742.
  • Farnsworth GL, Pollock KH, Nichols JD, Simons TR, Hines JE, Sauer JR (2002). A removal model for estimating detection probabilities from point-count surveys. The Auk 119: 414-425.
  • MacKenzie DI, Nichols JD, Lachman GB, Droege S, Royle JA, Langtimm CA (2002). Estimating site occupancy rates when detection probabilities are less than one. Ecology 83: 2248-2255.
  • MacKenzie DI, Nichols JD, Royle JA, Pollock KH, Bailey LL, Hines JE (2006). Occupancy Estimation and Modeling. Inferring Patterns and Dynamics of Species Occurrence. Elsevier, Amsterdam.
  • Nichols JD, Hines JE, Sauer JR, Fallon FW, Fallon JE, Heglund PJ (2000). A double-observer approach for estimating detection probability and abundance from point counts. The Auk 117(2): 393-408.
  • Williams BK, Nichols JD, Conroy MJ (2002). Analysis and Management of Animal Populations. Academic Press, San Diego.

EuMon core team; May 2013


 Contract number: 006463