Environmental Data Science Toolbox


This is a prototype version of the National Capability UK (NC-UK) Environmental Data Science Toolbox, hosted by the UK Centre for Ecology & Hydrology (UKCEH). The aim is to apply the FAIR principles (Findable, Accessible, Interoperable, and Reusable) to a collection of data science methods that are generalizable across different environmental applications, with a focus on integrative modelling. The hope is that this will encourage cross-disciplinary use of methods and so enhance national environmental research.

If you’re interested in contributing to this project, it would be great to hear from you. Details of how to contribute are on the CONTRIBUTING.md page in the root of the repository. 🌞

The current recommended workflow for engaging interactively with the code in the methodology notebooks has three steps: clone the Notebook Repository linked at the top of each notebook to get the relevant files, create a virtual environment, and run sections of the code in your preferred IDE, such as VS Code.
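As a rough illustration of the clone-and-virtual-environment steps, here is a minimal Python sketch. The repository URL, the `notebook-repository` directory name, the `.venv` environment name, and the helper names `setup_commands` and `run_setup` are all placeholders chosen for this example, not part of the toolbox. Each notebook links to its own repository, and how dependencies are installed depends on that repository, so that step is left out.

```python
import subprocess
import sys
from pathlib import Path

# Placeholder only: use the Notebook Repository link shown at the top of the notebook.
REPO_URL = "https://example.org/placeholder/notebook-repository.git"


def setup_commands(repo_url: str, clone_dir: str, env_name: str = ".venv") -> list[list[str]]:
    """Build two commands: clone the repository, then create a virtual environment in it."""
    env_path = Path(clone_dir) / env_name
    return [
        ["git", "clone", repo_url, clone_dir],
        [sys.executable, "-m", "venv", str(env_path)],
    ]


def run_setup(repo_url: str, clone_dir: str) -> None:
    """Run each command in order, stopping at the first failure."""
    for command in setup_commands(repo_url, clone_dir):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Printing rather than running keeps the sketch safe to execute as-is.
    for command in setup_commands(REPO_URL, "notebook-repository"):
        print(" ".join(command))
```

Calling `run_setup` with a real repository link performs both steps. After that, the cloned folder can be opened in an IDE and the new environment selected as its interpreter.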

| Methods | Key Concepts | Key Datasets |
| --- | --- | --- |
| Bias Correction of Climate Models (Ongoing Development) | Gaussian Processes, Bayesian Hierarchical Modelling | Climate Model Output, In-situ Weather Station Measurements |
| Calculating Risk to Terrestrial Carbon Pool (Ongoing Development) | Data Access, Data Integration | MODIS Land Cover and Net Primary Production Products, European Space Agency (ESA) Climate Change Initiative (CCI) Soil Moisture Dataset, Global Standardized Precipitation-Evapotranspiration Index (SPEI) Dataset |
| Understanding the error of Multispecies Biodiversity Indicators (Ongoing Development) | Bias, Uncertainty | Simulated Dataset (Multispecies Occupancy) |
| Joint Species Distribution Models with jsdmstan | Stochastic Partial Differential Equations, Integrated Nested Laplace Approximations | Simulated Dataset (Multispecies Populations) |
| Accessing EA Data via an API (Planned) | Data Access, Data Integration | |
| Data Pipelines for JULES Emulation/Portable JULES (Planned) | | |
| River Utility Tools (Planned) | Spatio-temporal Integration, Networks | |
| CSV File Checker (Planned) | Data Quality, Data Integrity | |
| Spatio-temporal Data Integration with INLA (Tentative) | Spatio-temporal Integration, Bayesian | |
| Understanding and Modelling Spatio-temporal Lags along Networks (Tentative) | Spatio-temporal Integration, Networks | |
| State Tagging for Environmental Data QA (Tentative) | Data Quality, Data Integrity | |