Spatial Data

Bayesian Methods for Ecological and Environmental Modelling

Author

Peter Levy

Published

March 22, 2024

library(ggplot2)
library(rworldmap)

Loading required package: sp

### Welcome to rworldmap ###

For a short introduction type :      vignette('rworldmap')

library(raster)
library(geoR)

--------------------------------------------------------------
 Analysis of Geostatistical Data
 For an Introduction to geoR go to http://www.leg.ufpr.br/geoR
 geoR version 1.9-2 (built on 2022-08-09) is now loaded
--------------------------------------------------------------

set.seed(448)

#Define geographic projections to be used
# lat / lon 
projlonlat <- CRS("+proj=longlat +datum=WGS84")

Introduction

In this practical session, we will look at real examples of applying the Bayesian approach to statistical spatial modelling. The field of spatial modelling has developed its own terminology which can be confusing, but the gist is simple. We have observations of a variable $z$ at a number of locations. To estimate or predict $z$ at a new location, we could use the mean of the observations weighted by their proximity (i.e.by the inverse of distance from the new location). Nearby points have a strong influence, far-away points less so.

A more sophisticated approach is to model how proximity actually influences the weights, based on the observed data. This model is known as a (semi)variogram - for every pair of points, we calculate the distance separating them and the difference in $z$ (expressed as half of the squared difference). Often, nearby points tend to be similar; as distance between points increases, they are generally less similar, so the semivariance increases to an asymptote, which is the overall variance in $z$.

*Variogram model showing the typical increase in dissimilarity as distance between points increases. The commonly used terms for the parameters of such a model are shown.*

Using the weights calculated from a fitted variogram model to predict values on a grid around the observations is known as “kriging”, and is a standard way of interpolating spatial data. Conventional kriging fits the variogram model by least-squares, and then treats this as a fixed, known quantity. The Bayesian approach recognises that we do not know the true variogram model parameters, but estimates their posterior probability distribution, given the data. We can then sample many possible realisations of the variogram parameters, and thereby produce many possible realisations of the predictions (the interpolated values on a grid). Considering the variance among these realisations gives us a good estimate of uncertainty in the predicted values of $z$, and potentially better estimate of the mean.

We now consider two examples of of applying Bayesian kriging to real spatial estimation problems. We use the the geoR package in R to do this. This allows us to use (hopefully) familar R function syntax, and saves us having to write out the whole geostatistical model in JAGS.

Case study 1: Spread of Agriculture in the Neolithic Period

Around 8000 years ago, human society made the transition from hunter-gathering to growing domesticated crops - the start of agriculture. Radiocarbon data from agricultural artefacts can give an estimate of the earliest date for agriculture at archaeological sites. Isern et al. [-@Isern2012] collate such a data set, and infer the pattern in the spread of agriculture in Neolithic times. Here, we reanalyse this data, using Bayesian kriging to provide estimates of uncertainty.

# First, we load the data, put it in an empty raster and plot it.
load(url("https://raw.githubusercontent.com/NERC-CEH/beem_data/main/neolithic.RData"), verbose = TRUE)

Loading objects:
  df

data(coastsCoarse)
r <- raster(ext = extent(df), nrows = 20, ncols = 20)
r <- rasterize(df, r, field = "CALBP")
plot(r)
plot(coastsCoarse, add=TRUE, col='blue')
points(df)
title(main = "Date of earliest agriculture", sub = "000 years Before Present")

grid_coords <- coordinates(r) # save the xy locations for geoR
edinburgh <- c(-3, 56)        # a location for prediction in lon, lat
minsk     <- c(29, 56)        # a location for prediction in lon, lat
grid_coords_2testSites <- rbind(edinburgh, minsk)

The $z$ values show the earliest date of agriculture at 100 sites, in 000s of years before present (8 ka BP = 8000 years ago).

Is there a spatial pattern in these data? How would you interpret it?

We convert the data to a “geodata” object used by the geoR package, and use its built-in summary and plot routines.

# df is the common generic data frame name
# dg is my analagous generic geodata object name
dg <- as.geodata(df, data.col = 4) # date "CALBP" is in the fourth column
summary(dg)

Number of data points: 100 

Coordinates summary
    LONGITUDE LATITUDE
min     -6.21    35.51
max     30.18    59.35

Distance summary
        min         max 
 0.09219544 38.38451901 

Data summary
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
3.61100 5.73025 6.49450 6.52259 7.25125 8.76900

plot(dg)

What features do you notice in the data? $~$

$~$

Construct a variogram

We construct a variogram using the geoR variog() function. This creates bins for separation distance and calculates the semivariance in each.

vgm <- variog(dg)

variog: computing omnidirectional variogram

plot(vgm)

Does the variogram look reasonable? $~$

$~$

Examine the vgm object. vgm$n gives the number of points in each distance bin. The greatest separation distances have fewest points, so carry less weight. We will use the krige.bayes function to estimate the parameters of a vargiogram model, and to predict values on the regular grid shown in the first figure.

Specifying priors

First, we need to specify the prior distributions for the $\phi$, $\tau^2$ and $\sigma^2$ parameters (and also for $\beta$, the mean of $z$, although this is not a spatial parameter - yet).

geoR does not use MCMC to estimate parameters, but an approximation. One of the numerical tricks it uses is to discretise some of the prior distributions to make computation easier. We do this here for $\phi$ and $\tau^2$:

range $\phi$ - we assume a uniform distribution between 0 and twice the maximum in the data
intercept $\tau^2$ - this is specified as a fraction of the asymptote $\sigma^2$, “tausq.rel”. We assume a uniform distribution between 0 and 0.9.
asymtote $\sigma^2$ - we assume a “reciprocal” prior, where larger values become diminishingly probable, in inverse proportion to $\sigma^2$
mean $\beta$ - we assume a uniform distribution between 0 and infinity i.e. completely uninformative; “flat” in the geoR syntax.

The lines below specify these distributions, which are plotted below.

maxdist  <- vgm$max.dist
seqphi_sparse <- seq(0, 2*maxdist, by = 5)
seqtau_sparse <- seq(0, 0.9, by=0.05)

Estimate the vargiogram model parameters

Now we use the krige.bayes function to estimate the variogram parameters. We do not make predictions over the grid yet, to make the computation quick.

system.time(
bsp <- krige.bayes(
  dg,
  loc="no",
  prior=prior.control( 
    beta.prior="flat",
    sigmasq.prior="reciprocal",
    phi.prior="uniform", 
    phi.discrete=seqphi_sparse,
    tausq.rel.discrete = seqtau_sparse,
    tausq.rel.prior="uniform"),
  output=output.control( n.posterior=10000, messages=FALSE)
  )
)

   user  system elapsed 
   0.81    0.11    0.92

The plot below shows the mean and median of the posterior distribution, plus some limits on the envelope of the distribution.

par(mfrow = c(1, 1))
vgm_lfit <- likfit(dg, ini = c(0.5, 0.5), fix.nugget = FALSE, messages = FALSE)
vgm <- variog(dg)

variog: computing omnidirectional variogram

vgm_env <- variog.model.env(dg, obj.v = vgm,
                             model.pars = vgm_lfit)

variog.env: generating 99 simulations (with  100 points each) using the function grf
variog.env: adding the mean or trend
variog.env: computing the empirical variogram for the 99 simulations
variog.env: computing the envelops

plot(vgm, env = vgm_env)
lines(bsp, summary.posterior = mean)
lines(bsp, summ = median, lty = 2)
#lines(bsp, summ = "mode", post = "par", lwd = 2, lty = 2)
legend(0.15, 5, legend = c("posterior mean",
  "posterior median"), lty = c(1, 2), lwd = c(1, 1), cex = 0.8)

We can now plot the posterior distribution of the four parameters

par(mfrow = c(2, 2))
plot(bsp)
#str(bsp$predictive)
plot(density(bsp$posterior$sample$sigmasq), main = "", xlab = expression(sigma^2))
plot(density(bsp$posterior$sample$beta), main = "", xlab = expression(beta))

and extract the parameter posterior means and standard deviations.

colMeans(bsp$posterior$sample)

     beta   sigmasq       phi tausq.rel 
 6.392279  3.023315 33.953000  0.094825

apply(bsp$posterior$sample, 2, sd)

       beta     sigmasq         phi   tausq.rel 
 1.33462375  2.81171818 19.90146723  0.08589067

Can you relate these values back to the variogram? Are we reasonably confident in the estimated variogram? $~$

$~$

Make predictions

Next we make our posterior predictions over the grid covering the whole of Europe. This is more time-consuming, so we produce only 100 samples from the posterior, instead of 10000 as above.

system.time(
bsp <- krige.bayes(
  dg,
  loc=grid_coords,
  prior=prior.control( 
    beta.prior="flat",
    sigmasq.prior="reciprocal",
    phi.prior="uniform", 
    phi.discrete=seqphi_sparse,
    tausq.rel.discrete = seqtau_sparse,
    tausq.rel.prior="uniform"),
  output=output.control( n.posterior=100, messages=FALSE)
  )
)

   user  system elapsed 
  29.38    0.35   29.75

r_mean  <- setValues(r, bsp$predictive$mean)
r_var  <- setValues(r, bsp$predictive$variance)
r_s1 <- setValues(r, bsp$predictive$simulations[,2])
r_s2 <- setValues(r, bsp$predictive$simulations[,3])
r_s3 <- setValues(r, bsp$predictive$simulations[,1])
zrange <- c(min(df$z), max(df$z))

par(mfrow=c(1,2))
plot(r, main = "Observations", zlim = zrange)
plot(coastsCoarse, add=TRUE, col='blue')

plot(r_mean, main = "Predictions", zlim = zrange)
points(df)
plot(coastsCoarse, add=TRUE, col='blue')

Do the predictions from Bayesian kriging interpolate in a reasonable way? $~$

$~$

Critically, Bayesian kriging gives us estimates of uncertainty in the interpolation. The variance plot below shows where we are confident of predictions and where we are not. This comes from looking across the possible realistions of mapped predictions in our posterior distribution. Three samples (realistions) are shown below.

par(mfrow=c(1,2))
plot(r_var, main = "Variance")
points(df)
plot(coastsCoarse, add=TRUE, col='blue')

plot(r_s1, main = "One realisation", zlim = zrange)
points(df)
plot(coastsCoarse, add=TRUE, col='blue')

plot(r_s2, main = "Another realisation", zlim = zrange)
points(df)
plot(coastsCoarse, add=TRUE, col='blue')

plot(r_s3, main = "And another", zlim = zrange)
points(df)
plot(coastsCoarse, add=TRUE, col='blue')

How does the distribution of observations match the uncertainty? $~$

$~$

Including spatial trends

We can also look at “anisotropy” in the variogram - are things the same in all directions? We use the variog4 function to do this automatically, looking in four directions (0 = N-S; 45 = NE-SW; 90 = E-W; 135 = SE-NW).

par(mfrow = c(1, 1))
vgmDir <- variog4(dg)

variog: computing variogram for direction = 0 degrees (0 radians)
        tolerance angle = 22.5 degrees (0.393 radians)
variog: computing variogram for direction = 45 degrees (0.785 radians)
        tolerance angle = 22.5 degrees (0.393 radians)
variog: computing variogram for direction = 90 degrees (1.571 radians)
        tolerance angle = 22.5 degrees (0.393 radians)
variog: computing variogram for direction = 135 degrees (2.356 radians)
        tolerance angle = 22.5 degrees (0.393 radians)
variog: computing omnidirectional variogram

plot(vgmDir)

What do you conclude from this? $~$

$~$

The observations have clear spatial trends, the largest running SE to NW. It may help to remove the large-scale spatial trend, and krige the residuals. We can do this by adding trend terms in the model.control section of krige.bayes, for both the observations (trend.d for “data”) and prediction (trend.l for “locations”).

   user  system elapsed 
  25.72    0.14   25.89

Again, we can plot the posterior distribution of the three variogram parameters

# Now we do the analysis with east*north trends:
par(mfrow = c(2, 2))
plot(bsp)
plot(density(bsp$posterior$sample$sigmasq), main = "", xlab = expression(sigma^2))
vgm  <- variog(dg, trend = "1st")

variog: computing omnidirectional variogram

plot(vgm)

and extract the parameter posterior means and standard deviations. $\beta$ is no longer the mean, but the regression coefficients for $x$, $y$ and $xy$.

colMeans(bsp$posterior$sample)

       beta0        beta1        beta2      sigmasq          phi    tausq.rel 
12.755659501  0.004367038 -0.132102850  0.802369697 14.400000000  0.334500000

apply(bsp$posterior$sample, 2, sd)

     beta0      beta1      beta2    sigmasq        phi  tausq.rel 
 1.8697138  0.0289370  0.0345834  0.5809820 15.7998849  0.2241364

Has this improved the variogram estimation? $~$

$~$

Plotting the new predictions, we can see they have not changed dramatically.

r_mean  <- setValues(r, bsp$predictive$mean)
r_var  <- setValues(r, bsp$predictive$variance)
r_s1 <- setValues(r, bsp$predictive$simulations[,2])
r_s2 <- setValues(r, bsp$predictive$simulations[,3])
zrange <- c(min(df$z), max(df$z))

par(mfrow=c(1,2))
plot(r, main = "Observations", zlim = zrange)
plot(coastsCoarse, add=TRUE, col='blue')

plot(r_mean, main = "Predictions", zlim = zrange)
points(df)
plot(coastsCoarse, add=TRUE, col='blue')
# add in two test sites too
spdf_2testSites <- as.data.frame(grid_coords_2testSites)
coordinates(spdf_2testSites) <- ~V1 + V2
projection(spdf_2testSites) <- projlonlat
points(spdf_2testSites, col='red', pch = 15, cex = 1.5)

Lastly, we can look at two contrasting sites, Edinburgh and Minsk, the red squares in the map above. Minsk (in Belarus) is outwith the area covered by observations.

   user  system elapsed 
   1.59    0.16    1.75

[1] 5.625038 4.971365

[1] 0.3071959 0.7859604

$~$

$~$ Is it clear why the uncertainties are different?

Things to explore:

Can you find a way to decrease the uncertainties?
Try different:
- variogram models
- prior distributions
- spatial trends

How would you estimate the uncertainties in a non-Bayesian way?

How would you incorporate the uncertainty in the radiocarbon dates themselves?