Model Usage • elements

Retrieving models

library(e1071)
library(elements)
library(filehash)

Because of the total size of the 6177 ENMs currently included in elements (1.8GB compressed, 7.5GB in memory), the ENMs are not exported as an .rda object. Instead they are made available through a filehash (Peng, 2005) database, which provides access to individual ENMs without loading all models into memory. To access the ENMs, a connection to this database must be initialised using elements::startup. As mentioned above, the GitHub repository does not include the “./inst/extdata/Models” object containing all the ENMs. The elements::startup function therefore checks whether “./inst/extdata/Models” is present and, if it is not found, loads the “./inst/testdata/TestModels” models instead. The set of models to load can also be chosen explicitly by passing “all” or “test” to the ‘models’ argument of elements::startup.

elements::startup()

model <- Models[["stellaria_graminea"]]
model

#> 
#> Call:
#> svm(formula = Presence ~ L + M + N + R + S + SD + GP + bio05 + bio06 + 
#>     bio16 + bio17, data = data, type = "C-classification", probability = TRUE)
#> 
#> 
#> Parameters:
#>    SVM-Type:  C-classification 
#>  SVM-Kernel:  radial 
#>        cost:  0.4 
#> 
#> Number of Support Vectors:  13481
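
The key-value access pattern that filehash provides can be illustrated independently of elements. The sketch below is not how elements::startup is implemented; it only shows, under assumed names (a temporary database path and two toy lm fits standing in for ENMs), how objects stored in a filehash database are fetched one at a time by key rather than loaded together.

```r
# Illustrative sketch only: a throwaway filehash database holding two toy models.
# The keys "taxon_a" and "taxon_b" and the lm fits are hypothetical stand-ins for ENMs.
library(filehash)

db_path <- tempfile("toy_models_")
dbCreate(db_path)          # create an empty database on disk
db <- dbInit(db_path)      # open a connection to it

# Store one object per key; each value is serialised to disk.
dbInsert(db, "taxon_a", lm(dist ~ speed, data = cars))
dbInsert(db, "taxon_b", lm(mpg ~ wt, data = mtcars))

keys <- sort(dbList(db))   # list the stored keys without reading the values
fit_a <- dbFetch(db, "taxon_a")  # read a single object into memory

dbUnlink(db)               # remove the temporary database
```

Only the object requested with dbFetch is read into memory, which is the property that makes this kind of storage practical for a large collection of models.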

Using the models

The raw ENMs retrieved using the method above can be used as regular e1071 SVM model objects. Alternatively, the helper function elements::predict_occ_taxon retrieves a model using the method above, generates predictions, and formats the results as a data frame.

results <- elements::predict_occ_taxon(taxon = "stellaria_graminea", predictors = elements::ExampleData1, pa = "Present", limit = NULL, dp = 2, append_predictors = FALSE)

#>   Present
#> 1    0.89
#> 2    0.02
#> 3    0.28
#> 4    0.55
#> 5    0.04
#> 6    0.40
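
For readers who prefer to work with a raw model object directly, the following sketch shows how class probabilities are obtained from an e1071 SVM fitted with probability = TRUE. It does not use an ENM from elements: a small model fitted to the built-in iris data stands in for one, and the names toy_data and toy_model and the "Present"/"Absent" labels are assumptions made for illustration.

```r
# Illustrative sketch only: a toy SVM stands in for an ENM retrieved from the database.
library(e1071)

# Build a two-class data set with a 'Presence' factor (hypothetical labels).
toy_data <- iris[iris$Species != "setosa", ]
toy_data$Presence <- factor(ifelse(toy_data$Species == "versicolor", "Present", "Absent"))
toy_data$Species <- NULL

set.seed(1)
toy_model <- svm(Presence ~ ., data = toy_data,
                 type = "C-classification", probability = TRUE)

# Request probabilities at prediction time; they are returned as an attribute.
pred <- predict(toy_model, newdata = toy_data, probability = TRUE)
probs <- attr(pred, "probabilities")

# Keep the probability of the "Present" class, rounded to two decimal places.
present_prob <- round(probs[, "Present"], 2)
head(present_prob)
```

The helper functions presumably wrap steps of this kind, adding model retrieval and formatting of the results.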

An additional helper function, elements::predict_occ, can generate predictions for multiple taxa, either by specifying the taxa to model in the ‘taxa_codes’ argument, or by setting ‘taxa_codes’ to NULL and including an additional column named ‘taxon_code’ in the predictors data frame.

results <- elements::predict_occ(taxa_codes = NULL, predictors = elements::ExampleData2, pa = "Present", limit = NULL, dp = 2, append_predictors = FALSE)

#>     Present         taxon_code
#> 201    0.00 silene_flos-cuculi
#> 202    0.00 silene_flos-cuculi
#> 203    0.00 silene_flos-cuculi
#> 204    0.46 silene_flos-cuculi
#> 205    0.00 silene_flos-cuculi
#> 206    0.01 silene_flos-cuculi

Two helper arguments provide additional control over model use. The first is the ‘limit’ argument, which assigns a probability of zero if one or more predictors fall outside a specified range, e.g. the 10% and 90% quantiles (see elements::NicheWidths). The second is the ‘holdopt’ argument, which holds the specified variables at their optima (defined as the mean values present in elements::NicheWidths).
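
The logic of these two arguments can be sketched in base R. This is not the package's implementation: the niche table, its column names (mean, q10, q90), the predictor values, and the probabilities in raw_prob are all made up for illustration, and the real layout of elements::NicheWidths may differ.

```r
# Illustrative sketch only: hypothetical niche summary for one taxon.
niche <- data.frame(
  variable = c("N", "M"),
  mean     = c(5.0, 6.0),
  q10      = c(3.0, 4.5),
  q90      = c(7.0, 7.5)
)

predictors <- data.frame(N = c(2.5, 5.0, 6.5), M = c(5.0, 9.0, 8.0))

# 'holdopt'-style step: overwrite the held variable with its optimum (the mean).
hold <- "M"
for (v in hold) {
  predictors[[v]] <- niche$mean[niche$variable == v]
}

# Made-up probabilities standing in for model output computed after the hold step.
raw_prob <- c(0.40, 0.80, 0.60)

# 'limit'-style step: flag rows where any predictor lies outside its quantile range.
outside <- rep(FALSE, nrow(predictors))
for (v in niche$variable) {
  lower <- niche$q10[niche$variable == v]
  upper <- niche$q90[niche$variable == v]
  outside <- outside | predictors[[v]] < lower | predictors[[v]] > upper
}

# Flagged rows receive a probability of zero; the rest are left unchanged.
limited_prob <- ifelse(outside, 0, raw_prob)
```

In this toy case only the first row is zeroed, because its N value of 2.5 lies below the lower bound of 3.0, while M no longer triggers the limit once it is held at its optimum.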

As a simple demonstration, two sets of predictions for Stellaria graminea are generated below, holding all variables apart from N at their optima: 1) with no limit set, and 2) with the limit set to the 1% and 99% quantiles.

n_gradient <- data.frame("N" = seq(0, 10, 0.01))

vary_N_no_limit <- elements::predict_occ_taxon(taxon = "stellaria_graminea", predictors = n_gradient,
                                               pa = "Present", limit = NULL, holdopt = c("bio05", "bio06", "bio16", "bio17", "GP", "L", "M", "R", "S", "SD"),
                                               dp = 2, append_predictors = TRUE)

vary_N_limit <- elements::predict_occ_taxon(taxon = "stellaria_graminea", predictors = n_gradient,
                                            pa = "Present", limit = "q01_q99", holdopt = c("bio05", "bio06", "bio16", "bio17", "GP", "L", "M", "R", "S", "SD"),
                                            dp = 2, append_predictors = TRUE)

Note that because ten of the eleven variables are held at their optima, the predicted probabilities will be high: the influence of unsuitable N values is partially offset by the favourable values of the other variables. Consequently, the response curves generated above will be wider than the corresponding partial dependence plot (PDP) produced with the elements::plot_me function (see the Model Inspection section below).

Scenarios

elements is designed to model scenarios of environmental change that involve multiple interacting drivers. The object elements::ExampleScenarios provides a basic set of example scenarios: (A) Climate Change - RCP4.5, (A) Climate Change - RCP8.5, (B) Grazing Intensification (+0.025GP per year), (B) Grazing Reduction (-0.025GP per year), and (C) Nutrient Enrichment (+0.25N per year), along with a Baseline scenario. The code below generates the predicted probabilities for the taxa in elements::ExamplePlot under Scenario C.

scenario_c_results <- elements::predict_occ(taxa_codes = elements::ExamplePlot$taxon_code,
                                            predictors = subset(x = elements::ExampleScenarios, scenario_code == "c"),
                                            append_predictors = TRUE)

Environmental filtering

elements can also be used to filter species pools based on a given set of predictor values using the function elements::env_filter. Two sets of methods are available: 1) “svm”, which generates predictions using elements::predict_occ and uses the resultant probability values; and 2) “mean” and “median”, which calculate the normalised Euclidean distance between the values supplied in the ‘predictors’ argument and the mean or median niche positions present in elements::NicheWidths.
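
To make the distance-based methods concrete, the sketch below ranks three hypothetical taxa by normalised Euclidean distance from a site. The taxon codes, niche positions, and site values are invented, and the normalisation used here (dividing each difference by the range of that variable across taxa) is an assumption; the package may normalise differently.

```r
# Illustrative sketch only: hypothetical mean niche positions on two variables.
niche_means <- data.frame(
  taxon_code = c("taxon_a", "taxon_b", "taxon_c"),
  N = c(2, 5, 8),
  M = c(8, 5, 6)
)
site <- c(N = 3, M = 8)   # predictor values for the site being filtered
vars <- names(site)

# Assumed normalisation: scale each variable by its range across taxa.
scale_by <- sapply(niche_means[vars], function(x) diff(range(x)))

# Difference between each taxon's niche position and the site, then scale.
diffs  <- sweep(as.matrix(niche_means[vars]), 2, site[vars], "-")
scaled <- sweep(diffs, 2, scale_by[vars], "/")

# Euclidean distance in the scaled space; smaller means a closer match.
distance <- sqrt(rowSums(scaled^2))

filter_sketch <- data.frame(taxon_code = niche_means$taxon_code,
                            distance = round(distance, 3))
filter_sketch <- filter_sketch[order(filter_sketch$distance), ]
filter_sketch$rank <- seq_len(nrow(filter_sketch))
filter_sketch
```

Because each variable is compared with a single niche position independently, a measure of this kind ignores how the variables covary within a taxon's niche.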

For example, below elements::env_filter is applied to all taxa in elements::TaxonomicBackbone using the svm method, with the predictors derived from the baseline environmental variable data from elements::ExamplePlot as present in elements::ExampleScenarios[1,].

filter_results_svm <- elements::env_filter(predictors = elements::ExampleScenarios[1,], taxa = elements::TaxonomicBackbone$taxon_code, method = "svm")

#>                         taxon_code Present rank
#> 1                 comarum_palustre   0.991    1
#> 2             hydrocotyle_vulgaris   0.982    2
#> 3         eriophorum_angustifolium   0.967    3
#> 4            menyanthes_trifoliata   0.965    4
#> 5                      carex_nigra   0.962    5
#> 6              ranunculus_flammula   0.958    6
#> 7             equisetum_fluviatile   0.952    7
#> 8               epilobium_palustre   0.950    8
#> 9           lysimachia_thyrsiflora   0.949    9
#> 10                  carex_rostrata   0.944   10
#> 11                 carex_canescens   0.933   11
#> 12      salix_repens_subsp._repens   0.933   12
#> 13             stellaria_palustris   0.933   13
#> 14 galium_palustre_subsp._palustre   0.928   14
#> 15                 viola_palustris   0.925   15
#> 16          calliergon_cordifolium   0.914   16
#> 17                  juncus_effusus   0.910   17
#> 18      potamogeton_polygonifolius   0.908   18
#> 19                  carex_echinata   0.898   19
#> 20         calamagrostis_canescens   0.893   20

NOTE: The mean and median methods are included for demonstrative purposes only and should not be used in practice, as they do not consider the joint distribution of variables as expressed through the SVM model hypervolumes.

Shutting down

At the end of the analysis, run elements::shutdown to close the connection to the filehash database.

elements::shutdown()