Section 2 Methods

2.1 Mathematical representation of land use and land-use change

We begin by describing the data stuctures which are used to represent land use and its change over time, \(\mathbf{U}\), \(\mathbf{A}\), \(\mathbf{B}\), \(\mathbf{G}\), and \(\mathbf{L}\), as in Levy et al. (2018).

2.1.1 Notation

We use the mathematical convention of representing vectors, matrices and arrays as uppercase bold (e.g. \(\mathbf{U}\)), and individual elements thereof as uppercase italic (e.g. \(U_{xyt}\)). Objects from R code are shown in sans-serif typeface, e.g. s_U[[t]][x,y]. The R code follows the mathematical notation defined in the journal paper, and uses a consistent naming convention, based on a few rules. Details of the notation are given in the GitHub documentation. In short, the maths are written in LaTeX, and the R code tries to mirror this.

2.1.2 \(\mathbf{U}\)

The AFOLU Guidance (IPCC 2006) recommends use of six types of land for broad descriptive purposes: forest, grassland, cropland, settlements, wetlands and other land. However, the area of grassland in the UK is very large, and heterogeneous in terms of soil carbon. For the purposes of modelling soil carbon in the current GHGI, grassland is subdivided into improved grassland and semi-natural/rough grazing land, on the basis of aggregating classes in the CORINE classification. The soil carbon parameters associated with these two grassland types are given by Bradley et al. (2006).Wetlands in the UK are very small, consisting only of those areas undergoing active commercial peat extraction, areas of inland water and flooded land. GHG emissions associated with peat extraction are calculated separately, and emissions associated with transitions to/from wetlands are considered negligible.

Here, we represent land use \(u\) as a number of discrete states from the set \(\{\mathrm{woods, crops, grass, rough, urban, other}\}\), encoded as integers 1-6. “Woods” means all forested land, “grass” is short for improved grassland, and “rough” is short for rough grazing and semi-natural land.

At a single location (x,y), land use can change between these states over time, represented by the vector \(\mathbf{U}_{xy}\). Spatially, we represent land use on a grid, where each grid cell contains a vector of land use. Combining the spatial and temporal dimensions, we have the 3-D space-time array \(\mathbf{U} = \{U_{xyt}\}\).

2.1.3 \(\mathbf{A}\)

We denote the area occupied by each land use \(u\) at time \(t\) as \(A_{ut}\), obtained by counting the frequency of land uses in \(\mathbf{U}_t\):

\[A_{ut} = \sum_{x = 1}^{n_x} \sum_{y = 1}^{n_y} [U_{xyt} = u] A_{\mathrm{grid cell}}\]

where the square brackets are Iverson notation, evaluating to 1 where true and zero otherwise, and \(A_\mathrm{grid cell}\) is the area of a single grid cell. We denote the set of all these areas as \(\mathbf{A} = \{A_{ut}\}\). By difference, we obtain the areas of net land-use change:

\[\Delta A_{ut} = A_{ut} - A_{ut-1}\].

2.1.4 \(\mathbf{B}\)

At each time step, we have a square transition matrix, which we collect in the 3D array \(\mathbf{B}\):

\[\mathbf{B} = \begin{bmatrix} 0 & \beta_{12} & \beta_{13} & \dots & \beta_{1n_{u}} \\ \beta_{21} & 0 & \beta_{23} & \dots & \beta_{2n_{u}} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \beta_{n_{u}1} & \beta_{n_{u}2} & \beta_{n_{u}3} & \dots & 0 \end{bmatrix}_{t=1} \begin{bmatrix} 0 & \beta_{12} & \beta_{13} & \dots & \beta_{1n_{u}} \\ \beta_{21} & 0 & \beta_{23} & \dots & \beta_{2n_{u}} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \beta_{n_{u}1} & \beta_{n_{u}2} & \beta_{n_{u}3} & \dots & 0 \end{bmatrix}_{t=2} \dots \begin{bmatrix} 0 & \beta_{12} & \beta_{13} & \dots & \beta_{1n_{u}} \\ \beta_{21} & 0 & \beta_{23} & \dots & \beta_{2n_{u}} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \beta_{n_{u}1} & \beta_{n_{u}2} & \beta_{n_{u}3} & \dots & 0 \end{bmatrix}_{t=n_t} \]

where the elements represent the gross area changing from one land use to another that year, and there are \(n_u\) possible land uses. For example, \(\beta_{23}\) in \(\mathbf{B}_{t=2}\) is the area changing from land-use type 2 to land-use type 3 between years 1 and 2. The transition matrix at time \(t\) can be derived from the array layer \(\mathbf{U}_t\) by comparison with the previous layer at \(t-1\). Each element is given by

\[\beta_{ij} = \sum_{x = 1}^{n_x} \sum_{y = 1}^{n_y} [U_{xy t-1} = i \land U_{xyt} = j] A_\mathrm{grid cell}\].

2.1.5 \(\mathbf{G}\) and \(\mathbf{L}\)

For a given time step, the net change in the area occupied by each land use is given by the gross gains (the vector of column sums, \(G_{ut}\)) minus the gross losses (the vector of row sums, \(L_{ut}\)):

\[G_{ut} = \sum_{i = 1}^{n_u} \beta_{iu} \quad \mathrm{for \, time} \, t\] \[L_{ut} = \sum_{j = 1}^{n_u} \beta_{uj} \quad \mathrm{for \, time} \, t\]

where \(i\) and \(j\) are the row and column indices, respectively, and

\[\Delta A_{ut} = G_{ut} - L_{ut}\].

2.2 Data Assimilation

Almost all of the information on the history of land use can be expressed in the form of \(\mathbf{U}\), \(\mathbf{B}\), \(\mathbf{A}\), \(\mathbf{G}\), and \(\mathbf{L}\), and the equations above tell us how these are inter-related. \(\mathbf{U}\) contains complete information about the system. \(\mathbf{B}\) contains partial information about the system, which can be summarised in the form of \(\mathbf{A}\), \(\mathbf{G}\), and \(\mathbf{L}\), but does not directly specify \(\mathbf{U}\). In themselves, \(\mathbf{A}\), \(\mathbf{G}\), and \(\mathbf{L}\) do not directly specify either \(\mathbf{U}\) or \(\mathbf{B}\), but can be used as constraints in their estimation.

Observations from the data sources described in subsequent sections provide information in the form of one or more of the data structures described above. Our approach here is to use the above equations as a simple model to relate the different observations via data assimilation in a two-stage process. Firstly, we use a Bayesian approach to estimate the parameters in \(\mathbf{B}\), given all the available information. Secondly, we use the posterior distribution of \(\mathbf{B}\) and likelihood for the location of land use and land-use change to estimate the posterior distribution of \(\mathbf{U}\). The maximum a posteriori probability (MAP, the mode of the posterior distribution) realisations represent our best estimate of land use and land-use change, given the available data.

We refer to this as a “Bayesian data assimilation” procedure. The term “data assimilation” is usually applied to methods which use observed data to update the state variables of a model; terms such as “calibration” and “inversion” are usually applied to methods which use observed data to update the parameters of a model. The distinction is not very important, as the underlying mathematics is essentially the same, and particularly here where the distinction between parameters and states is itself sometimes blurred. We use the term “data assimilation” because it better conveys the idea that observations are brought together to form something greater than the sum of their parts. The procedure is Bayesian because it updates the parameters according to Bayes Theorem, i.e. using prior knowledge and an explicit calculation of the probability of the data given the parameters (the likelihood). This has the advantages that it provides a robust means of estimating uncertainty on the estimates, and it allows us to combine data from different sources, as well as informative priors.

2.3 Rationale for two-stage procedure

The ultimate problem is to estimate the state of the 3-D space-time array \(\mathbf{U} = \{U_{xyt}\}\) using the available observations from different data sources. If we are only interested in the best estimate of land use at a given location and time, the assimilation of data is relatively simple. We want to estimate \(p[\hat{u}|\textbf{u}]\), that is, the probability of an estimated land-use state \(\hat{u}\) being correct, given a vector of observations \(\textbf{u}\). For example, with a vector of observations \(\textbf{u} = {"crop", "crop", "grass"}\) from three data sources (e.g. LCM, IACS and CORINE), we might estimate the probability that the true land-use state is “crop” to be 2/3. If we know the precision of these different observations, we can adjust this accordingly.

However, if we are interested in land-use change at national scale, the problem is more complex. The crux of this is that calculating the probability of correctly estimating a change to a new state \(u^{new}\) at location \(xy\) is no longer independent at each location. For example, the prediction of whether a particular forest changes to grassland at time \(t\) depends on how many other forests we predict to change to grassland elsewhere. Put mathematically, the probability of correctly estimating a change to the new state \(p[\hat{u}^{new}_{xy}|\textbf{u}_{xy}]\) becomes dependent on the estimates of the \(B\) matrix, so

\[\begin{equation} \label{eq:bayes1} p[\hat{u}^{new}_{xy}| \textbf{u}_{xy}, \textbf{B}] \end{equation}\]

But similarily, \(\textbf{B}\) is dependent on the areas changing state. This inter-dependence of the two key unknowns makes it difficult to solve in a single step. For this reason, we split the problem into two parts: firstly, we estimate the \(\textbf{B}\) matrix using all available data; secondly, given the constraint of this matrix, we estimate the locations of land-use change. These two problems are different in nature, and accordingly, we solve them in different ways: the former we solve using MCMC; the latter we solve using a form of importance sampling.

}

References

Levy, P., M. van Oijen, G. Buys, and S. Tomlinson. 2018. “Estimation of Gross Land-Use Change and Its Uncertainty Using a Bayesian Data Assimilation Approach.” Biogeosciences 15 (5): 1497–1513. https://doi.org/10.5194/bg-15-1497-2018.