Spatial Data

Bayesian Methods for Ecological and Environmental Modelling

Author

Peter Levy

Published

March 22, 2024

library(ggplot2)
library(rworldmap)
Loading required package: sp
### Welcome to rworldmap ###
For a short introduction type :      vignette('rworldmap')
library(raster)
library(geoR)
--------------------------------------------------------------
 Analysis of Geostatistical Data
 For an Introduction to geoR go to http://www.leg.ufpr.br/geoR
 geoR version 1.9-2 (built on 2022-08-09) is now loaded
--------------------------------------------------------------
set.seed(448)

#Define geographic projections to be used
# lat / lon 
projlonlat <- CRS("+proj=longlat +datum=WGS84")

Introduction

In this practical session, we will look at real examples of applying the Bayesian approach to statistical spatial modelling. The field of spatial modelling has developed its own terminology which can be confusing, but the gist is simple. We have observations of a variable \(z\) at a number of locations. To estimate or predict \(z\) at a new location, we could use the mean of the observations weighted by their proximity (i.e.by the inverse of distance from the new location). Nearby points have a strong influence, far-away points less so.

A more sophisticated approach is to model how proximity actually influences the weights, based on the observed data. This model is known as a (semi)variogram - for every pair of points, we calculate the distance separating them and the difference in \(z\) (expressed as half of the squared difference). Often, nearby points tend to be similar; as distance between points increases, they are generally less similar, so the semivariance increases to an asymptote, which is the overall variance in \(z\).

Variogram model showing the typical increase in dissimilarity as distance between points increases. The commonly used terms for the parameters of such a model are shown.

Using the weights calculated from a fitted variogram model to predict values on a grid around the observations is known as “kriging”, and is a standard way of interpolating spatial data. Conventional kriging fits the variogram model by least-squares, and then treats this as a fixed, known quantity. The Bayesian approach recognises that we do not know the true variogram model parameters, but estimates their posterior probability distribution, given the data. We can then sample many possible realisations of the variogram parameters, and thereby produce many possible realisations of the predictions (the interpolated values on a grid). Considering the variance among these realisations gives us a good estimate of uncertainty in the predicted values of \(z\), and potentially better estimate of the mean.

We now consider two examples of of applying Bayesian kriging to real spatial estimation problems. We use the the geoR package in R to do this. This allows us to use (hopefully) familar R function syntax, and saves us having to write out the whole geostatistical model in JAGS.

Case study 1: Spread of Agriculture in the Neolithic Period

Around 8000 years ago, human society made the transition from hunter-gathering to growing domesticated crops - the start of agriculture. Radiocarbon data from agricultural artefacts can give an estimate of the earliest date for agriculture at archaeological sites. Isern et al. [-@Isern2012] collate such a data set, and infer the pattern in the spread of agriculture in Neolithic times. Here, we reanalyse this data, using Bayesian kriging to provide estimates of uncertainty.

# First, we load the data, put it in an empty raster and plot it.
load(url("https://raw.githubusercontent.com/NERC-CEH/beem_data/main/neolithic.RData"), verbose = TRUE)
Loading objects:
  df
data(coastsCoarse)
r <- raster(ext = extent(df), nrows = 20, ncols = 20)
r <- rasterize(df, r, field = "CALBP")
plot(r)
plot(coastsCoarse, add=TRUE, col='blue')
points(df)
title(main = "Date of earliest agriculture", sub = "000 years Before Present")

grid_coords <- coordinates(r) # save the xy locations for geoR
edinburgh <- c(-3, 56)        # a location for prediction in lon, lat
minsk     <- c(29, 56)        # a location for prediction in lon, lat
grid_coords_2testSites <- rbind(edinburgh, minsk)

The \(z\) values show the earliest date of agriculture at 100 sites, in 000s of years before present (8 ka BP = 8000 years ago).

Is there a spatial pattern in these data? How would you interpret it?

We convert the data to a “geodata” object used by the geoR package, and use its built-in summary and plot routines.

# df is the common generic data frame name
# dg is my analagous generic geodata object name
dg <- as.geodata(df, data.col = 4) # date "CALBP" is in the fourth column
summary(dg)
Number of data points: 100 

Coordinates summary
    LONGITUDE LATITUDE
min     -6.21    35.51
max     30.18    59.35

Distance summary
        min         max 
 0.09219544 38.38451901 

Data summary
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
3.61100 5.73025 6.49450 6.52259 7.25125 8.76900 
plot(dg)

What features do you notice in the data? \(~\)



\(~\)

Construct a variogram

We construct a variogram using the geoR variog() function. This creates bins for separation distance and calculates the semivariance in each.

vgm <- variog(dg)
variog: computing omnidirectional variogram
plot(vgm)

Does the variogram look reasonable? \(~\)

\(~\)

Examine the vgm object. vgm$n gives the number of points in each distance bin. The greatest separation distances have fewest points, so carry less weight. We will use the krige.bayes function to estimate the parameters of a vargiogram model, and to predict values on the regular grid shown in the first figure.

Specifying priors

First, we need to specify the prior distributions for the \(\phi\), \(\tau^2\) and \(\sigma^2\) parameters (and also for \(\beta\), the mean of \(z\), although this is not a spatial parameter - yet).

geoR does not use MCMC to estimate parameters, but an approximation. One of the numerical tricks it uses is to discretise some of the prior distributions to make computation easier. We do this here for \(\phi\) and \(\tau^2\):

  • range \(\phi\) - we assume a uniform distribution between 0 and twice the maximum in the data
  • intercept \(\tau^2\) - this is specified as a fraction of the asymptote \(\sigma^2\), “tausq.rel”. We assume a uniform distribution between 0 and 0.9.
  • asymtote \(\sigma^2\) - we assume a “reciprocal” prior, where larger values become diminishingly probable, in inverse proportion to \(\sigma^2\)
  • mean \(\beta\) - we assume a uniform distribution between 0 and infinity i.e. completely uninformative; “flat” in the geoR syntax.

The lines below specify these distributions, which are plotted below.

maxdist  <- vgm$max.dist
seqphi_sparse <- seq(0, 2*maxdist, by = 5)
seqtau_sparse <- seq(0, 0.9, by=0.05)

Estimate the vargiogram model parameters

Now we use the krige.bayes function to estimate the variogram parameters. We do not make predictions over the grid yet, to make the computation quick.

system.time(
bsp <- krige.bayes(
  dg,
  loc="no",
  prior=prior.control( 
    beta.prior="flat",
    sigmasq.prior="reciprocal",
    phi.prior="uniform", 
    phi.discrete=seqphi_sparse,
    tausq.rel.discrete = seqtau_sparse,
    tausq.rel.prior="uniform"),
  output=output.control( n.posterior=10000, messages=FALSE)
  )
)
   user  system elapsed 
   0.81    0.11    0.92 

The plot below shows the mean and median of the posterior distribution, plus some limits on the envelope of the distribution.

par(mfrow = c(1, 1))
vgm_lfit <- likfit(dg, ini = c(0.5, 0.5), fix.nugget = FALSE, messages = FALSE)
vgm <- variog(dg)
variog: computing omnidirectional variogram
vgm_env <- variog.model.env(dg, obj.v = vgm,
                             model.pars = vgm_lfit)
variog.env: generating 99 simulations (with  100 points each) using the function grf
variog.env: adding the mean or trend
variog.env: computing the empirical variogram for the 99 simulations
variog.env: computing the envelops
plot(vgm, env = vgm_env)
lines(bsp, summary.posterior = mean)
lines(bsp, summ = median, lty = 2)
#lines(bsp, summ = "mode", post = "par", lwd = 2, lty = 2)
legend(0.15, 5, legend = c("posterior mean",
  "posterior median"), lty = c(1, 2), lwd = c(1, 1), cex = 0.8)

We can now plot the posterior distribution of the four parameters

par(mfrow = c(2, 2))
plot(bsp)
#str(bsp$predictive)
plot(density(bsp$posterior$sample$sigmasq), main = "", xlab = expression(sigma^2))
plot(density(bsp$posterior$sample$beta), main = "", xlab = expression(beta))

and extract the parameter posterior means and standard deviations.

colMeans(bsp$posterior$sample)
     beta   sigmasq       phi tausq.rel 
 6.392279  3.023315 33.953000  0.094825 
apply(bsp$posterior$sample, 2, sd)
       beta     sigmasq         phi   tausq.rel 
 1.33462375  2.81171818 19.90146723  0.08589067 

Can you relate these values back to the variogram? Are we reasonably confident in the estimated variogram? \(~\)

\(~\)

Make predictions

Next we make our posterior predictions over the grid covering the whole of Europe. This is more time-consuming, so we produce only 100 samples from the posterior, instead of 10000 as above.

system.time(
bsp <- krige.bayes(
  dg,
  loc=grid_coords,
  prior=prior.control( 
    beta.prior="flat",
    sigmasq.prior="reciprocal",
    phi.prior="uniform", 
    phi.discrete=seqphi_sparse,
    tausq.rel.discrete = seqtau_sparse,
    tausq.rel.prior="uniform"),
  output=output.control( n.posterior=100, messages=FALSE)
  )
)
   user  system elapsed 
  29.38    0.35   29.75 
r_mean  <- setValues(r, bsp$predictive$mean)
r_var  <- setValues(r, bsp$predictive$variance)
r_s1 <- setValues(r, bsp$predictive$simulations[,2])
r_s2 <- setValues(r, bsp$predictive$simulations[,3])
r_s3 <- setValues(r, bsp$predictive$simulations[,1])
zrange <- c(min(df$z), max(df$z))

par(mfrow=c(1,2))
plot(r, main = "Observations", zlim = zrange)
plot(coastsCoarse, add=TRUE, col='blue')

plot(r_mean, main = "Predictions", zlim = zrange)
points(df)
plot(coastsCoarse, add=TRUE, col='blue')

Do the predictions from Bayesian kriging interpolate in a reasonable way? \(~\)

\(~\)

Critically, Bayesian kriging gives us estimates of uncertainty in the interpolation. The variance plot below shows where we are confident of predictions and where we are not. This comes from looking across the possible realistions of mapped predictions in our posterior distribution. Three samples (realistions) are shown below.

par(mfrow=c(1,2))
plot(r_var, main = "Variance")
points(df)
plot(coastsCoarse, add=TRUE, col='blue')

plot(r_s1, main = "One realisation", zlim = zrange)
points(df)
plot(coastsCoarse, add=TRUE, col='blue')

plot(r_s2, main = "Another realisation", zlim = zrange)
points(df)
plot(coastsCoarse, add=TRUE, col='blue')

plot(r_s3, main = "And another", zlim = zrange)
points(df)
plot(coastsCoarse, add=TRUE, col='blue')

How does the distribution of observations match the uncertainty? \(~\)

\(~\)

Things to explore:

  • Can you find a way to decrease the uncertainties?
  • Try different:
    • variogram models
    • prior distributions
    • spatial trends

How would you estimate the uncertainties in a non-Bayesian way?

How would you incorporate the uncertainty in the radiocarbon dates themselves?

References