
Retrieving Models

The 6631 ENMs currently included in elements are too large (1.8GB when compressed, 7.5GB in memory) to be exported as a .rda object. Instead they are made available through a filehash (Peng, 2005) database, which provides access to individual ENMs without loading all models into memory. To access the ENMs, a connection to this database must be initialised using elements::startup. As mentioned above, the GitHub repository does not include the “./inst/extdata/Models” object containing all the ENMs. The elements::startup function therefore checks whether “./inst/extdata/Models” is present and, if it is not found, loads the “./inst/testdata/TestModels” models instead. The set of models to load can also be selected explicitly by passing “all” or “test” to the ‘models’ argument of elements::startup.

elements::startup()
#> elements startup completed.

model <- elementsEnv$Models[["stellaria_graminea"]]
#> 
#> Call:
#> svm(formula = Presence ~ L + M + N + R + S + SD + GP + bio05 + bio06 + 
#>     bio16 + bio17, data = data, type = "C-classification", probability = TRUE)
#> 
#> 
#> Parameters:
#>    SVM-Type:  C-classification 
#>  SVM-Kernel:  radial 
#>        cost:  0.6 
#> 
#> Number of Support Vectors:  12724
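The on-demand access that a filehash database provides can be illustrated independently of elements. The sketch below is a minimal example using the filehash package directly; the database path, keys, and stored values are toy stand-ins chosen for illustration and do not reflect how elements organises its own database.

```r
# Minimal filehash sketch: store objects on disk and fetch them one at a time.
# Keys and values are toy stand-ins, not real ENMs.
library(filehash)

db_path <- tempfile("toy_models_")   # hypothetical database location
dbCreate(db_path)                    # create an empty database on disk
db <- dbInit(db_path)                # open a connection; no values are loaded yet

# Insert two small objects under taxon-style keys
dbInsert(db, "taxon_a", list(cost = 0.6, kernel = "radial"))
dbInsert(db, "taxon_b", list(cost = 1.0, kernel = "linear"))

dbList(db)                           # lists keys only; values stay on disk
one_model <- dbFetch(db, "taxon_a")  # read a single object into memory
one_model$kernel

dbUnlink(db)                         # delete the toy database
```

Only the object requested with dbFetch is read into memory, which is the property that makes this kind of database suitable for a large collection of models.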

Generating Predictions

The raw ENMs retrieved using the method above can be used as regular e1071 SVM model objects. Alternatively, the helper function elements::predict_occ_taxon retrieves a model using the method above, generates predictions, and formats the results as a data frame.

results <- elements::predict_occ_taxon(taxon = "stellaria_graminea", predictors = elements::ExampleData1, pa = "Present", limit = NULL, dp = 2, append = "ids")
#>   Present
#> 1    0.30
#> 2    0.00
#> 3    0.00
#> 4    0.70
#> 5    0.01
#> 6    0.08
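To show what using a model as a regular e1071 SVM object involves, the sketch below fits a small toy classifier and extracts class probabilities with predict. The training data, the two predictors, and the "Present"/"Absent" level names are assumptions made for illustration; the real ENMs are fitted on the eleven variables shown in the model call above.

```r
# Toy stand-in for an ENM: a probability-enabled C-classification SVM.
library(e1071)

set.seed(1)
# Synthetic training data: presence is more likely at intermediate N values
train <- data.frame(N = runif(200, 0, 10), M = runif(200, 0, 10))
noise <- rnorm(200, sd = 0.5)
train$Presence <- factor(ifelse(abs(train$N - 5) + noise < 2, "Present", "Absent"))

toy_model <- svm(Presence ~ N + M, data = train,
                 type = "C-classification", probability = TRUE)

# New sites to predict for; column names must match the model formula
new_sites <- data.frame(N = c(1, 5, 9), M = c(5, 5, 5))
pred <- predict(toy_model, newdata = new_sites, probability = TRUE)

# Class probabilities are attached to the prediction as an attribute
probs <- attr(pred, "probabilities")
present_prob <- round(probs[, "Present"], 2)
present_prob
```

The same predict call should apply to a model retrieved from elementsEnv$Models, provided the new data contains every predictor named in that model's formula.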

An additional helper function, elements::predict_occ, can generate predictions for multiple taxa, either by specifying the taxa to model in the ‘taxa’ argument, or by setting ‘taxa’ to NULL and including an additional column named ‘taxon_code’ in the predictors data frame.

results <- elements::predict_occ(taxa = NULL, predictors = elements::ExampleData2, pa = "Present", limit = NULL, holdopt = NULL, dp = 2, append = "ids")
#>             taxon_code Present
#> 201 silene_flos-cuculi    0.70
#> 202 silene_flos-cuculi    0.00
#> 203 silene_flos-cuculi    0.12
#> 204 silene_flos-cuculi    0.00
#> 205 silene_flos-cuculi    0.00
#> 206 silene_flos-cuculi    0.00

Two arguments provide additional control over how the models are used. The first is the ‘limit’ argument, which assigns a probability of zero if one or more predictors fall outside a specified range, e.g. the 10% and 90% quantiles (see elements::NicheWidths). The second is the ‘holdopt’ argument, which holds the specified variables at their optima (defined as the mean values present in elements::NicheWidths).
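The effect of a limit can be sketched in base R. The example below is an illustration only: the limits table, its column names, and the probability values are hypothetical and are not taken from elements::NicheWidths or from the package internals.

```r
# Hypothetical niche limits for one taxon; column names are assumptions
limits <- data.frame(
  variable = c("N", "M"),
  lower    = c(3, 4),   # e.g. a lower quantile
  upper    = c(7, 8),   # e.g. an upper quantile
  stringsAsFactors = FALSE
)

predictors <- data.frame(N = c(2, 5, 6), M = c(5, 5, 9))
raw_prob   <- c(0.40, 0.70, 0.55)   # stand-in model output, one value per row

# TRUE where every predictor lies inside its [lower, upper] range
inside <- rep(TRUE, nrow(predictors))
for (i in seq_len(nrow(limits))) {
  v <- predictors[[limits$variable[i]]]
  inside <- inside & v >= limits$lower[i] & v <= limits$upper[i]
}

# Rows with any predictor outside its range are assigned zero
limited_prob <- ifelse(inside, raw_prob, 0)
limited_prob
#> [1] 0.0 0.7 0.0
```

The first row is zeroed because N is below its lower limit, and the third because M is above its upper limit; only the second row keeps its original probability.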

As a simple demonstration, two sets of predictions for Stellaria graminea are generated below, holding all variables apart from N at their optima: 1) with no limit set, and 2) with the limit set to the 1% and 99% quantiles.

n_gradient <- data.frame("N" = seq(0, 10, 0.01))

# Hold every model variable except N at its optimum
held_vars <- c("bio05", "bio06", "bio16", "bio17", "GP", "L", "M", "R", "S", "SD")

# 1) No limit: probabilities are returned across the whole N gradient
vary_N_no_limit <- elements::predict_occ_taxon(taxon = "stellaria_graminea", predictors = n_gradient,
                                               pa = "Present", limit = NULL, holdopt = held_vars,
                                               dp = 2, append = "predictors")

# 2) Limit set to the 1% and 99% quantiles
vary_N_q01_q99 <- elements::predict_occ_taxon(taxon = "stellaria_graminea", predictors = n_gradient,
                                              pa = "Present", limit = "q01_q99", holdopt = held_vars,
                                              dp = 2, append = "predictors")
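Conceptually, holding variables at their optima means that only the supplied variable changes from row to row while the others are filled with a constant. The base R sketch below illustrates this idea; the optima values are hypothetical, only three held variables are shown for brevity, and the code is not the package's implementation.

```r
# Hypothetical optima (mean niche positions) for three held variables
optima <- c(L = 7, M = 5, R = 6)

# A coarse N gradient, for illustration
n_gradient_demo <- data.frame(N = seq(0, 10, by = 2.5))

# Add each held variable as a constant column so that only N varies
full_predictors <- n_gradient_demo
for (v in names(optima)) {
  full_predictors[[v]] <- optima[[v]]
}
full_predictors
```

A data frame of this shape contains one column per model predictor, which is what a model needs in order to generate predictions along the N gradient.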

Note that because ten of the eleven variables are held at their optima, the predicted probabilities will be high, as the influence of unsuitable N values is partially offset. Consequently, the resulting response curves will be wider than the corresponding PDP plot produced with the elements::plot_me function (see Model Inspection).

Environmental filtering

elements can also be used to filter species pools based on a given set of predictor values using the function elements::env_filter. Two sets of methods are available: 1) “svm”, which generates predictions using elements::predict_occ and uses the resultant probability values; and 2) “mean” and “median”, which calculate the normalised Euclidean distance between the values supplied in the ‘predictors’ argument and the mean or median niche positions present in elements::NicheWidths.

NOTE: The mean and median methods are included for demonstrative purposes only and should not be used in practice, as they do not consider the joint distribution of variables as expressed through the SVM model hypervolumes.
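For illustration, a normalised Euclidean distance to the mean niche position might be computed as in the sketch below. The niche table, its column names, and the choice to normalise each variable by the taxon's niche range are all assumptions; the normalisation used by elements::calc_distance is not described here and may differ.

```r
# Hypothetical niche summaries for two taxa; names and values are illustrative
niche <- data.frame(
  taxon_code = c("taxon_a", "taxon_b"),
  N_mean = c(4, 8), N_min = c(1, 5), N_max = c(7, 10),
  M_mean = c(5, 3), M_min = c(2, 1), M_max = c(8, 6),
  stringsAsFactors = FALSE
)
site <- c(N = 5, M = 5)   # predictor values for one site

vars <- names(site)
# Squared difference per variable, scaled by each taxon's niche range
# (one row per taxon, one column per variable)
scaled_sq <- sapply(vars, function(v) {
  rng <- niche[[paste0(v, "_max")]] - niche[[paste0(v, "_min")]]
  ((site[[v]] - niche[[paste0(v, "_mean")]]) / rng)^2
})

# Euclidean norm across variables; smaller means closer to the niche centre
niche$distance <- sqrt(rowSums(scaled_sq))
niche[order(niche$distance), c("taxon_code", "distance")]
```

Because each variable is treated independently, a distance of this kind ignores interactions between variables, which is the limitation the note above refers to.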

The ‘screen’ argument controls whether elements::envelope_filter is applied first. This function screens the supplied taxa, checking whether the predictor values fall within the range given in the ‘limit’ argument (“min_max”, “q01_q99”, “q05_q95”, “q10_q90”, “q25_q75”). elements::env_filter then runs the more computationally expensive elements::predict_occ or elements::calc_distance functions only for the taxa within the limits. Screening is highly recommended, as it usually results in a 95% reduction in the number of taxa supplied to elements::predict_occ or elements::calc_distance, greatly improving performance.
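The screen-then-predict pattern can be sketched in base R as follows. The envelope table, its column names, and the expensive_predict stand-in are hypothetical and serve only to show why a cheap range check reduces the number of costly model calls.

```r
# Hypothetical min/max envelopes for four taxa on two variables
envelopes <- data.frame(
  taxon_code = c("taxon_a", "taxon_b", "taxon_c", "taxon_d"),
  N_min = c(1, 4, 2, 0), N_max = c(4, 9, 8, 6),
  M_min = c(3, 3, 6, 1), M_max = c(7, 7, 9, 6),
  stringsAsFactors = FALSE
)
site <- c(N = 5, M = 5)

# Cheap screen: keep taxa whose envelope contains the site on every variable
in_envelope <- with(envelopes,
  site[["N"]] >= N_min & site[["N"]] <= N_max &
  site[["M"]] >= M_min & site[["M"]] <= M_max)
candidates <- envelopes$taxon_code[in_envelope]

# Stand-in for the expensive step; a real workflow would run a model here
n_calls <- 0
expensive_predict <- function(taxon) {
  n_calls <<- n_calls + 1   # count how often the costly step runs
  0.5                       # placeholder probability
}

# The costly step runs only for the taxa that passed the screen
probs <- vapply(candidates, expensive_predict, numeric(1))
candidates
#> [1] "taxon_b" "taxon_d"
n_calls
#> [1] 2
```

Here two of the four toy taxa are screened out before the costly step, so it runs twice rather than four times.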

For example, elements::env_filter is applied below to all taxa in elements::TaxonomicBackbone using the svm method, with the predictors taken from the first row of elements::ExampleScenarios.

filter_results <- elements::env_filter(predictors = elements::ExampleScenarios[1,],
                                       taxa = elements::TaxonomicBackbone$taxon_code, 
                                       screen = TRUE, method = "svm", limit = "min_max", 
                                       exclude = NULL, threshold = NULL, 
                                       append = "ids")
#>    scenario timeslice scenario_code                              taxon_code
#> 1  Baseline      2007             a                              abies_alba
#> 2  Baseline      2007             a                    abietinella_abietina
#> 3  Baseline      2007             a                          acer_campestre
#> 4  Baseline      2007             a                            acer_negundo
#> 5  Baseline      2007             a                        acer_platanoides
#> 6  Baseline      2007             a                     acer_pseudoplatanus
#> 7  Baseline      2007             a                    achillea_millefolium
#> 8  Baseline      2007             a achillea_millefolium_subsp._millefolium
#> 9  Baseline      2007             a                       achillea_ptarmica
#> 10 Baseline      2007             a                       aconitum_napellus
#>    Present
#> 1    0.004
#> 2    0.000
#> 3    0.001
#> 4    0.000
#> 5    0.000
#> 6    0.010
#> 7    0.006
#> 8    0.010
#> 9    0.611
#> 10   0.001

In practice, given the limited size of regional species pools and/or an interest in only a particular group of taxa, it is often more practical to supply a reduced list of taxa to elements::env_filter.

Shutting down

At the end of the analysis, run elements::shutdown to close the connection to the filehash database.

elements::shutdown()