Accounting for autocorrelation in wild boar abundance models
Study area and population simulation
We will simulate a wild boar population over an hypothetical study area. We will use a Poisson distribution to scatter wild boar individuals across a study area:
Ni∼Poisson(λ) (Eq 1)
where λ is the expected numbers of wild boar at each quadrant (cell) of our study area. As animals are not usually random distributed, we will use an environmental covariate (distance to the water point) to drive the variation of λ.

In addition, we will add a random autocorrelated predictor to reproduce some kind of autocorrelation of the wild boar distribution, which can be seen as biological processes such as migration, etc.
log(λi)=β0+β1∗dwati+β2∗autocorri (Eq 2)
where λi is the expected (average) abundance of wild boar at each cell i, β0 is the coefficient for the intercept, dwati is the environmental covariate at cell i and β1 its coefficient, and autocorri is the environmental covariate at cell i and β2
its coefficient. Therefore, the general process that describes the number of wild boar at each cell will be:
N∼Poisson(exp(β0+β1∗dwat+β2∗autocorr)) (Eq 3)
We will set our coefficients as follows:
- β0=3
- β1=−0.9
- β2=0.15
set.seed(123)
library(dismo)
library(raster)
library(sf)
# Create a study area
sarea <- raster(nrows = 100, ncols = 100, xmn = 0, xmx = 100, ymn = 0, ymx = 100)
# Create the "distance from water point" layer
dwat <- scale(distanceFromPoints(sarea, c(25,25))) + 2.26
# Create a random autocorrelated variable
autoc <- raster(nrows = 10, ncols = 10, xmn = 0, xmx = 100, ymn = 0, ymx = 100)
autoc[] <- runif(100, 1,10)
autoc <- disaggregate(autoc,10, "bilinear")
# Lambda parameter for the Poisson distribution of the abundance distribution
# will be a function from the distance of water point and the autocorrelated,
# with a 3, -0.9 and 0.15 as coefficients
beta_0 <- 3
beta_1 <- -0.9
beta_2 <- 0.15
lambda <- exp(beta_0 + beta_1*(dwat) + beta_2*(autoc))
# Now we can fill each cell of our study area with a random number from a Poisson
# distribution with a different lambda at each site/cell
for (i in 1:ncell(sarea)){
sarea[i] <- rpois(1, lambda[i])
}
# Plot the different variables and the study area
par(mfrow = c(2,2))
plot(dwat, main = "Distance to the water point")
plot(autoc, main = "Autocorrelated random covariate")
plot(lambda, main = "Lambda parameter of the Poisson distribution")
plot(sarea, main = "Abundance of wild boar per cell")

Hunting ground simulation and data colecting
Once we have the wild boar population distributed, we will divide the sudy area in 300 different hunting grounds and we will obtain the number of wild boar and a set of possible predictor variables for each hunting ground: area and distance to the water. point.
# Let's create a layer fill with 1 to indicate the area of each cell
uni <- sarea
uni[] <- 1
# Now we'll create hunting grounds from which we'll obtain the counts of our
# population. They'll be irregular polygons for which we will obtain the actual
# number of individuals. Here we'll ignore detectability, and we simulate a
# perfect sampling, being able to obtain the actual number of individuals at
# each hunting ground.
pp <- randomPoints(sarea, 300)
cotos <- crop(voronoi(pp),sarea)
# Now, we'll obtain the number of individuals in each hunting ground, as well as
# the measures of the other two predictor variables: the average of the distance
# to the water point and the area of each hunting ground. We'll also obtain the
# wild boar density by dividing the wild boar number by the area at each
# hunting ground.
cotosr <- (rasterize(cotos, sarea, 'id'))
df <- cbind(zonal(sarea,cotosr, fun = sum), zonal(dwat,cotosr),
zonal(uni, cotosr, fun = sum))
df <- data.frame(df[,c(-3,-5)])
colnames(df) <- c("id", "wb", "dwat", "area")
df$dens <- df$wb / df$area
df$dens_poiss <- round((df$dens * 100),0)
# Take a look to our data. Each hunting ground has an ID, the number of
# individuals, the distance to the water point, the area and the wild boar
# density
head(df)
## id wb dwat area dens dens_poiss
## 1 1 1509 0.7832324 69 21.869565 2187
## 2 2 240 1.0638565 17 14.117647 1412
## 3 3 57 3.2996035 39 1.461538 146
## 4 4 116 1.9671127 23 5.043478 504
## 5 5 85 2.2280306 12 7.083333 708
## 6 6 204 1.6097854 21 9.714286 971
# Let's include this variables in the spatial object too
cotos$wb <- df$wb
cotos$dwat <- df$dwat
cotos$area <- df$area
cotos$dens <- df$dens
cotos$dens_poiss <- round((df$dens * 100),0)
#PLOT
par(mfrow = c(1,2))
plot(sarea, legend = FALSE, main = "Simulated wild boar abundance")
lines(cotos)
plot(sarea,axes=FALSE, legend = FALSE,
main = "Wild boar counts by hunting ground")
plot(st_as_sf(cotos)["wb"], add = TRUE, pal = rev(terrain.colors(10)))

Obtaining the spatial structure of our data
In order to account for the spatial dependency in our data, we should
obtain the spatial structure of our hunting grounds, that is, the
neighborhood of each hunting ground.
# First, we'll prepare neighborhood matrix of territorial units
library(spdep)
nb <- poly2nb(cotos, row.names = cotos$id)
n.neighbors <- sapply(nb, FUN =length)
neighbors <- unlist(nb, recursive = FALSE)
plot(cotos, col = "grey", main = "Hunting grounds neighborhood structure")
plot(nb, coordinates(cotos), col='red', lwd=1.2, add=TRUE, pch = 16)

# Now we should assign each cell to a hunting ground for model projections
sites.pred <- data.frame(rasterToPoints(dwat))
sites.pred$area <- 1
dup <- raster::extract(cotos, sites.pred[,1:2])
cells.pred <- dup[!duplicated(dup$point.ID),2]
sites.pred$cell <- cells.pred
names(sites.pred)[3] <- "dwat"
Model fitting
Now we will follow three different strategies to model wild boar abundance from hunting ground data and to project our predictions in a finer scale (1×1 grid).
- Using the total wild boar counts as response variable and the distance to water point and area of hunting ground as predictors.
- Using the density of wild boar as response variable and the distance to water point as predictor
- Using the density of wild boar as response variable and the distance to water point as predictor and including iCAR to control for spatial autocorrelation in our data.
We will use the R package hSDM and a Poisson distribution to fit all models. A good explanation about hSDM R package can be found here. It uses a Bayesian framework (through MCMC) to fit model parameters.
library(hSDM)
m1 <- hSDM.poisson(
counts = df$wb,
suitability=~ dwat + area,
data=df,
suitability.pred=sites.pred, # For model projection into a finer resolution
burnin = 1000,
mcmc = 10000,
thin = 10,
beta.start=0,
mubeta =0,
Vbeta = 1e+06,
seed = 1234,
verbose = 1,
save.p = 0)
##
## Running the Gibbs sampler. It may be long, please keep cool :)
##
## **********:10.0%, mean accept. rates= beta:0.303
## **********:20.0%, mean accept. rates= beta:0.303
## **********:30.0%, mean accept. rates= beta:0.303
## **********:40.0%, mean accept. rates= beta:0.320
## **********:50.0%, mean accept. rates= beta:0.330
## **********:60.0%, mean accept. rates= beta:0.290
## **********:70.0%, mean accept. rates= beta:0.320
## **********:80.0%, mean accept. rates= beta:0.293
## **********:90.0%, mean accept. rates= beta:0.293
## **********:100.0%, mean accept. rates= beta:0.317
# Here we use the density as response variable and the distance to the water
# point as the only predictor
m2 <- hSDM.poisson(
counts = df$dens_poiss,
suitability=~ dwat,
data=df,
suitability.pred=sites.pred, # For model projection into a finer resolution
burnin = 1000,
mcmc = 10000,
thin = 10,
beta.start=0,
mubeta =0,
Vbeta = 1e+06,
seed = 1234,
verbose = 1,
save.p = 0)
##
## Running the Gibbs sampler. It may be long, please keep cool :)
##
## **********:10.0%, mean accept. rates= beta:0.400
## **********:20.0%, mean accept. rates= beta:0.465
## **********:30.0%, mean accept. rates= beta:0.410
## **********:40.0%, mean accept. rates= beta:0.350
## **********:50.0%, mean accept. rates= beta:0.420
## **********:60.0%, mean accept. rates= beta:0.455
## **********:70.0%, mean accept. rates= beta:0.400
## **********:80.0%, mean accept. rates= beta:0.430
## **********:90.0%, mean accept. rates= beta:0.380
## **********:100.0%, mean accept. rates= beta:0.360
# Finally, we'll use the iCAR approach to have into account the spatial
# autocorrelation
m3 <- hSDM.poisson.iCAR(
counts = df$dens_poiss,
suitability=~ dwat,
spatial.entity = df$id,
data=df,
n.neighbors = n.neighbors, # Spatial structure
neighbors = neighbors, # Spatial structure
suitability.pred=sites.pred, # For model projection into a finer resolution
spatial.entity.pred=sites.pred$cell, # Spatial structure of projections
burnin = 1000,
mcmc = 10000,
thin = 10,
beta.start=0,
Vrho.start=1,
mubeta =0,
Vbeta = 1e+06,
priorVrho = "1/Gamma",
shape = 0.5,
rate = 0.0005,
Vrho.max=1000,
seed = 1234,
verbose = 1,
save.rho = 0,
save.p = 0)
##
## Running the Gibbs sampler. It may be long, please keep cool :)
##
## **********:10.0%, mean accept. rates= beta:0.265, rho:0.256
## **********:20.0%, mean accept. rates= beta:0.235, rho:0.253
## **********:30.0%, mean accept. rates= beta:0.220, rho:0.257
## **********:40.0%, mean accept. rates= beta:0.195, rho:0.260
## **********:50.0%, mean accept. rates= beta:0.275, rho:0.257
## **********:60.0%, mean accept. rates= beta:0.280, rho:0.258
## **********:70.0%, mean accept. rates= beta:0.135, rho:0.258
## **********:80.0%, mean accept. rates= beta:0.300, rho:0.257
## **********:90.0%, mean accept. rates= beta:0.230, rho:0.261
## **********:100.0%, mean accept. rates= beta:0.220, rho:0.256
Now we can predict wild boar numbers obtained for each hunting ground
following the three strategies and compare them with the actual data.
Take into account that m2
and m3
are fitted using densities, and therefore model predictions should be backtransformed to obtain wild boar numbers.
df$predm1 <- m1$lambda.latent
df$predm2 <- ((m2$lambda.latent/100) * df$area)
df$predm3 <- ((m3$lambda.latent/100) * df$area)
par(mfrow= c(1,3))
plot(df$predm1, df$wb, xlab= "wild boar predicted", ylab= "wild boar observed")
abline(a=0,b=1,col="red", lwd = 3)
plot(df$predm2, df$wb, xlab= "wild boar predicted", ylab= "wild boar observed")
abline(a=0,b=1,col="red", lwd = 3)
plot(df$predm3, df$wb, xlab= "wild boar predicted", ylab= "wild boar observed")
abline(a=0,b=1,col="red", lwd = 3)

We can see that iCAR
approach is obtaining the more accurate predictions at hunting ground
level. In addition, we can check if we were able to remove spatial
autocorrelation of model residuals by computing autocorrelograms (using
Moran’s I at different distance lags)
library(spatialEco)
# Obtain model residuals (observed - predicted)
df$m1res <- df$wb - m1$lambda.latent
df$m2res <- df$wb - ((m2$lambda.latent/100) * df$area)
df$m3res <- df$wb - ((m3$lambda.latent/100) * df$area)
dataI <- SpatialPointsDataFrame(coordinates(cotos),
data.frame(cbind(df$m1res,df$m2res,df$m3res)))
par(mfrow = c(1,3))
c1 <- correlogram(dataI, dataI@data$X1, dist = 10, ns = 50, latlong=FALSE)
c2 <- correlogram(dataI, dataI@data$X2, dist = 10, ns = 50, latlong=FALSE)
c3 <- correlogram(dataI, dataI@data$X3, dist = 10, ns = 50, latlong=FALSE)

We can see that spatial autocorrelation is significatively reduced in the firsts lags by using the iCAR approach.
## Model projection to a finer resolution (downscaling)
Finally, a further step would be to project our model to a higher
resolution: that is, to use our models to obtain a prediction of number
of wild boar at finer scale. We could then compare between actual data
at cell sclae and model predictions under each strategy.
projm1 <- sarea
projm1[] <- m1$lambda.pred
projm2 <- sarea
projm2[] <- m2$lambda.pred/100
projm3 <- sarea
projm3[] <- m3$lambda.pred/100
par(mfrow = c(2,3))
plot(dwat, main = "Predictor dwat beta = -0.8")
plot(autoc, main = "Predictor autocorrelated beta = 0.15")
plot(sarea, main = paste("Actual abundance distribution\r N =",sum(sarea[])))
plot(projm1, main = paste("Raw counts model (m1)\r N =",round(sum(projm1[]),0)))
plot(projm2, main = paste("Density model (m2)\r N =",round(sum(projm2[]),0)))
plot(projm3, main = paste("Density + iCAR model (m2)\r N =",round(sum(projm3[]),0)))

As we see, the wild boar abundance distribution obtained in m1
and m2
show a uniform pattern, since the only environmental predictor included in the model is the distance to the water point. However, total numbers of wild boar for the whole study area shows an overprediction when we use the total wild boar counts as the response variable. This overprediction is not controlled by including the hunting ground area as predictor variable, so caution should be taken when downscaling this model predictions. The wild boar abundance distribution obtained from m3
shows a much more realistic pattern, since spatial structure of the data is accounted by the iCAR approach and total number of wild boar predicted is in accordance with actual data.
This script is associated to:
Suggested citation: ENETWILD‐consortium, Javier, Fernández‐López, Marco, Apollonio, Jose Antonio, Blanco‐Aguiar, Francesca, Brivio, Simon, Croft, Angela, Fanelli, Alberto, Fernández‐Arias, Ezio, Ferroglio, Oliver, Keuling, Tomislav, Levanič, Kamila, Plis, Tomasz, Podgórski, Boštjan, Pokorny, Massimo, Scandura, Graham, C Smith, Ramon, Soriguer, Joaquín, Vicente, Stefania, Zanet, Pelayo, Acevedo, 2020. Improving models of wild boar hunting yield distribution: new insights for predictions at fine spatial resolution. EFSA supporting publication 2020: 17( 12): EN‐1980. 27 pp. doi: 10.2903/sp.efsa.2020.EN‐1980