Retrieving Models
Because of the total size of the 6631 ENMs currently included in
elements (1.8GB when compressed, 7.5GB in memory), the ENMs are not
exported in an .rda object. Instead they are made available through a
filehash (Peng, 2005) database, which provides access to the ENMs
without loading all models into memory. To access the ENMs, a connection
to this database must be initialised using elements::startup. As
mentioned above, the GitHub repository does not include the
“./inst/extdata/Models” object containing all the ENMs.
elements::startup checks whether “./inst/extdata/Models” is present and,
if it is not found, loads the “./inst/testdata/TestModels” models
instead. The models to load can also be selected by passing “all” or
“test” to the ‘models’ argument of elements::startup.
elements::startup()
#> elements startup completed.
model <- elementsEnv$Models[["stellaria_graminea"]]
model
#>
#> Call:
#> svm(formula = Presence ~ L + M + N + R + S + SD + GP + bio05 + bio06 +
#> bio16 + bio17, data = data, type = "C-classification", probability = TRUE)
#>
#>
#> Parameters:
#> SVM-Type: C-classification
#> SVM-Kernel: radial
#> cost: 0.6
#>
#> Number of Support Vectors: 12724
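The idea of fetching one model at a time from an on-disk store can be sketched with filehash directly. This is a minimal illustration, assuming the filehash package is installed; the database path, the keys and the stand-in lm fits are hypothetical and are not part of elements.

```r
# Sketch: keep large objects on disk with filehash and fetch them one at a time.
# Assumes the filehash package is installed; all names below are hypothetical.
library(filehash)

db_path <- tempfile("demo_models_")   # hypothetical database location
dbCreate(db_path)                     # create an empty on-disk database
db <- dbInit(db_path)                 # open a connection to it

# Insert two stand-in "models" (plain lm fits, not the package's SVMs).
dbInsert(db, "taxon_a", lm(mpg ~ wt, data = mtcars))
dbInsert(db, "taxon_b", lm(mpg ~ hp, data = mtcars))

keys <- dbList(db)                    # lists keys only; values stay on disk
model_a <- dbFetch(db, "taxon_a")     # reads just this one object into memory
model_a2 <- db[["taxon_a"]]           # equivalent `[[` accessor
```

Only the requested object is read into memory, which is why a key-based lookup such as the one shown above avoids loading every model at once.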
Generating Predictions
The raw ENMs retrieved using the method above can be used as regular
e1071 SVM model objects. Alternatively, the helper function
elements::predict_occ_taxon retrieves a model using the method above,
generates predictions, and formats the results as a data frame.
results <- elements::predict_occ_taxon(taxon = "stellaria_graminea", predictors = elements::ExampleData1, pa = "Present", limit = NULL, dp = 2, append = "ids")
#> Present
#> 1 0.30
#> 2 0.00
#> 3 0.00
#> 4 0.70
#> 5 0.01
#> 6 0.08
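For readers who prefer to work with a raw model object, the following sketch shows how class probabilities are obtained from an e1071 SVM fitted with probability = TRUE. The toy data, the variable names x1 and x2, and the "Absent"/"Present" class labels are assumptions made for illustration; they are not taken from the ENMs themselves.

```r
# Sketch: probability predictions from a regular e1071 SVM object.
# Assumes the e1071 package is installed; the toy data are made up.
library(e1071)

set.seed(1)
toy <- data.frame(
  x1 = c(rnorm(40, mean = 0), rnorm(40, mean = 3)),
  x2 = c(rnorm(40, mean = 0), rnorm(40, mean = 3)),
  Presence = factor(rep(c("Absent", "Present"), each = 40))
)

fit <- svm(Presence ~ x1 + x2, data = toy,
           type = "C-classification", probability = TRUE)

newdata <- data.frame(x1 = c(0, 3), x2 = c(0, 3))

# probability = TRUE must be set again at prediction time; the class
# probabilities are returned as an attribute of the predicted factor.
pred <- predict(fit, newdata = newdata, probability = TRUE)
probs <- attr(pred, "probabilities")
present <- round(probs[, "Present"], 2)
```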
An additional helper function, elements::predict_occ, can generate
predictions for multiple taxa, either by specifying the taxa to model in
the ‘taxa’ argument, or by setting ‘taxa’ to NULL and including an
additional column named ‘taxon_code’ in the predictors data frame.
results <- elements::predict_occ(taxa = NULL, predictors = elements::ExampleData2, pa = "Present", limit = NULL, holdopt = NULL, dp = 2, append = "ids")
#> taxon_code Present
#> 201 silene_flos-cuculi 0.70
#> 202 silene_flos-cuculi 0.00
#> 203 silene_flos-cuculi 0.12
#> 204 silene_flos-cuculi 0.00
#> 205 silene_flos-cuculi 0.00
#> 206 silene_flos-cuculi 0.00
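The ‘taxon_code’ route can be pictured as a split-apply-combine step: rows are grouped by taxon, each group is passed to the matching model, and the pieces are bound back together. The sketch below is a base R illustration under that assumption; the two stand-in "models" are simple hypothetical functions of N, and nothing here reflects how elements::predict_occ is actually implemented.

```r
# Hypothetical per-taxon "models": simple functions of N standing in for ENMs.
models <- list(
  taxon_a = function(d) round(plogis(d$N - 5), 2),
  taxon_b = function(d) round(plogis(5 - d$N), 2)
)

predictors <- data.frame(
  taxon_code = c("taxon_a", "taxon_a", "taxon_b"),
  N = c(5, 7, 5),
  stringsAsFactors = FALSE
)

# Split rows by taxon, apply the matching model, then recombine.
pieces <- lapply(split(predictors, predictors$taxon_code), function(d) {
  data.frame(
    taxon_code = d$taxon_code,
    Present = models[[d$taxon_code[1]]](d),
    stringsAsFactors = FALSE
  )
})
results <- do.call(rbind, pieces)
```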
Two helper arguments provide additional control over model use. The
first is the ‘limit’ argument, which assigns a probability of zero if
one or more predictors fall outside a specified range, e.g. the 10% and
90% quantiles (see elements::NicheWidths). The second is the ‘holdopt’
argument, which holds the specified variables at their optima (as
defined by the mean values in elements::NicheWidths).
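The logic of the two arguments can be sketched in base R. This is an illustration only: the niche table, its column names (q10, q90, mean), the helper functions and the probabilities are all hypothetical, and the package's own implementation may differ.

```r
# Hypothetical niche summary for one taxon (a stand-in for niche width data).
niche <- data.frame(
  variable = c("N", "R"),
  q10  = c(3, 4),
  q90  = c(7, 8),
  mean = c(5, 6),
  stringsAsFactors = FALSE
)

# 'holdopt' idea: fill each held variable with its optimum (mean) value.
hold_at_optima <- function(predictors, niche, hold) {
  for (v in hold) {
    predictors[[v]] <- niche$mean[niche$variable == v]
  }
  predictors
}

# 'limit' idea: flag rows where every predictor lies inside [lower, upper].
within_limits <- function(predictors, niche, lower = "q10", upper = "q90") {
  ok <- rep(TRUE, nrow(predictors))
  for (v in intersect(names(predictors), niche$variable)) {
    lo <- niche[[lower]][niche$variable == v]
    hi <- niche[[upper]][niche$variable == v]
    ok <- ok & predictors[[v]] >= lo & predictors[[v]] <= hi
  }
  ok
}

predictors <- hold_at_optima(data.frame(N = c(2, 5, 9)), niche, hold = "R")
prob <- c(0.40, 0.80, 0.30)   # made-up model probabilities, one per row
prob_limited <- ifelse(within_limits(predictors, niche), prob, 0)
```

In this toy case only the middle row (N = 5) lies inside the 10% to 90% range, so the other two probabilities are set to zero.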
As a simple demonstration, two sets of predictions for Stellaria graminea are generated below, holding all variables apart from N at their optima: 1) with no limit set, and 2) with the limit set to the 1% and 99% quantiles.
n_gradient <- data.frame("N" = seq(0, 10, 0.01))
vary_N_no_limit <- elements::predict_occ_taxon(taxon = "stellaria_graminea", predictors = n_gradient,
pa = "Present", limit = NULL, holdopt = c("bio05", "bio06", "bio16", "bio17", "GP", "L", "M", "R", "S", "SD"),
dp = 2, append = "predictors")
vary_N_q01_q99 <- elements::predict_occ_taxon(taxon = "stellaria_graminea", predictors = n_gradient,
pa = "Present", limit = "q01_q99", holdopt = c("bio05", "bio06", "bio16", "bio17", "GP", "L", "M", "R", "S", "SD"),
dp = 2, append = "predictors")
Please note that, because ten of the eleven variables are held at their
optima, the predicted probabilities will be high, as the influence of
unsuitable N values is partially offset. Consequently, the response
curves above will be wider than the corresponding PDP plot produced with
the elements::plot_me function (see Model Inspection).
Environmental filtering
elements can also be used to filter species pools based on a given set
of predictor values using the function elements::env_filter. Two sets of
methods are available: 1) “svm”, which generates predictions using
elements::predict_occ and uses the resultant probability values; and 2)
“mean” and “median”, which calculate the normalised Euclidean distance
between the values supplied in the ‘predictors’ argument and the mean or
median niche positions in elements::NicheWidths.
NOTE: The mean and median methods are included for demonstrative purposes only and should not be used in practice, as they do not consider the joint distribution of variables as expressed through the SVM model hypervolumes.
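To make the distance-based methods concrete, the sketch below computes a normalised Euclidean distance between one set of predictor values and the mean niche position of two taxa. The normalisation used by the package is not described here, so scaling each variable by a per-taxon standard deviation is an assumption, and all values and taxon names are hypothetical.

```r
# Hypothetical niche positions (means) and widths (standard deviations).
niche_mean <- rbind(taxon_a = c(N = 5, R = 6), taxon_b = c(N = 2, R = 4))
niche_sd   <- rbind(taxon_a = c(N = 1, R = 2), taxon_b = c(N = 1, R = 1))

site <- c(N = 5, R = 8)   # predictor values for a single site

# Subtract the site value from each niche position, scale the difference by
# the taxon's niche width (assumed normalisation), then take the Euclidean norm.
z <- sweep(niche_mean, 2, site[colnames(niche_mean)]) / niche_sd
distance <- sqrt(rowSums(z^2))

# Smaller distances indicate a closer match to the niche position.
ranking <- names(sort(distance))
```

Each variable contributes independently to this distance, which illustrates why such methods cannot capture the joint distribution of variables.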
The ‘screen’ argument controls whether the elements::envelope_filter
function is applied first. This function screens the supplied taxa to
check whether the predictor values are within the range given by the
‘limit’ argument (“min_max”, “q01_q99”, “q05_q95”, “q10_q90”,
“q25_q75”); elements::env_filter then runs the more computationally
expensive elements::predict_occ or elements::calc_distance functions
only for the taxa within the limits. Screening is highly recommended, as
it usually results in a 95% reduction in the number of taxa supplied to
elements::predict_occ or elements::calc_distance, greatly improving
performance.
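The benefit of screening can be sketched as a cheap range check followed by a costly step that runs only for the surviving taxa. The envelope table, the expensive_predict stand-in and the choice to give screened-out taxa a value of zero are all assumptions made for illustration, not a description of the package internals.

```r
# Hypothetical min/max envelope per taxon for a single predictor, N.
envelope <- data.frame(
  taxon_code = c("taxon_a", "taxon_b", "taxon_c", "taxon_d"),
  N_min = c(1, 4, 6, 0),
  N_max = c(3, 8, 9, 10),
  stringsAsFactors = FALSE
)
site_N <- 5

calls <- 0
expensive_predict <- function(taxon) {   # stand-in for a costly model call
  calls <<- calls + 1                    # count how often it is invoked
  0.5                                    # placeholder probability
}

# Cheap screen first: keep only taxa whose envelope contains the site value.
keep <- site_N >= envelope$N_min & site_N <= envelope$N_max
candidates <- envelope$taxon_code[keep]

# Run the costly step only for the survivors; screened-out taxa stay at zero.
prob <- setNames(numeric(nrow(envelope)), envelope$taxon_code)
prob[candidates] <- vapply(candidates, expensive_predict, numeric(1))
```

Here the costly function is called for two of the four taxa; with a large species pool the saving grows with the share of taxa screened out.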
For example, below elements::env_filter is applied to all taxa in
elements::TaxonomicBackbone using the svm method, with the predictors
derived from elements::ExampleScenarios.
filter_results <- elements::env_filter(predictors = elements::ExampleScenarios[1,],
taxa = elements::TaxonomicBackbone$taxon_code,
screen = TRUE, method = "svm", limit = "min_max",
exclude = NULL, threshold = NULL,
append = "ids")
#> scenario timeslice scenario_code taxon_code
#> 1 Baseline 2007 a abies_alba
#> 2 Baseline 2007 a abietinella_abietina
#> 3 Baseline 2007 a acer_campestre
#> 4 Baseline 2007 a acer_negundo
#> 5 Baseline 2007 a acer_platanoides
#> 6 Baseline 2007 a acer_pseudoplatanus
#> 7 Baseline 2007 a achillea_millefolium
#> 8 Baseline 2007 a achillea_millefolium_subsp._millefolium
#> 9 Baseline 2007 a achillea_ptarmica
#> 10 Baseline 2007 a aconitum_napellus
#> Present
#> 1 0.004
#> 2 0.000
#> 3 0.001
#> 4 0.000
#> 5 0.000
#> 6 0.010
#> 7 0.006
#> 8 0.010
#> 9 0.611
#> 10 0.001
In practice, given the limited size of regional species pools and/or an
interest in only a particular group of taxa, it is often more practical
to supply a reduced list of taxa to elements::env_filter.
Shutting down
At the end of the analysis, run elements::shutdown to close the
connection to the filehash database.
elements::shutdown()