R/env_filter.R
env_filter.Rd
Retrieve the most suitable taxa for a given set of environmental variable values supplied in the 'predictors' argument. Two sets of methods are available:
"svm", which generates predictions using elements::predict_occ and uses the resultant probability values;
"mean" and "median", which calculate the scaled Euclidean distance between the values supplied in the 'predictors' argument and the mean or median niche positions as present in elements::NicheWidths.
env_filter(
predictors = elements::ExampleScenarios[1, ],
taxa = elements::TaxonomicBackbone$taxon_code,
method = "svm",
limit = NULL,
exclude = NULL,
threshold = NULL,
toptaxa = NULL
)
A data frame of predictors. Must include at least one of the following columns: L, M, N, R, S, SD, GP, bio05, bio06, bio16, bio17. Any of these columns that are not included must be listed in the 'exclude' argument.
A vector of strings containing one or more taxa to generate predictions for.
One of "svm", "mean", or "median".
A string representing the niche width quantiles, one of "min_max", "q01_q99", "q05_q95", "q25_q75". If set, a probability of 0 is assigned to a set of predictors when one or more of those predictors fall outside the stipulated quantile range. Only applied if pa = "Present". Optional.
Model variables to exclude. If 'method' is "svm", they are passed to the 'holdopt' argument of 'elements::predict_occ'; if 'method' is "mean" or "median", they are removed from the distance calculation.
A probability threshold to use as a cut off in the environmental filter. Only applicable when 'method' = "svm".
The number of top taxa to return, ranked by probability when 'method' = "svm" or by Euclidean distance when 'method' = "mean" or "median".
A data frame containing three columns: taxon_code, rank, and Present (if 'method' = "svm") or Distance (if 'method' = "mean" or "median").
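To make the shape of the returned data frame concrete, the following sketch applies a probability cut-off and a top-taxa limit to a small table of probabilities. The taxon codes and probabilities are made up, and the filtering rules (keeping values at or above the threshold, applying the threshold before the top-taxa limit, ranking from highest probability) are assumptions for illustration rather than a description of how env_filter is implemented.

```r
# Hypothetical probabilities for four taxa (not real model output).
probs <- data.frame(
  taxon_code = c("taxon_a", "taxon_b", "taxon_c", "taxon_d"),
  Present = c(0.82, 0.35, 0.64, 0.51)
)

threshold <- 0.5  # assumed rule: keep taxa at or above this probability
toptaxa <- 2      # keep this many of the highest-ranked taxa

# Rank taxa from most to least probable.
ranked <- probs[order(probs$Present, decreasing = TRUE), ]
ranked$rank <- seq_len(nrow(ranked))

# Apply the probability cut-off, then keep the top-ranked taxa.
filtered <- ranked[ranked$Present >= threshold, ]
filtered <- utils::head(filtered, toptaxa)
filtered <- filtered[, c("taxon_code", "rank", "Present")]
print(filtered)
```

In this toy case the result holds taxon_a and taxon_c, in that order, with ranks 1 and 2.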
The "svm" method will produce more accurate results, as it considers the position of the environmental variable values in the 11-dimensional hypervolume; however, when there is a large number of taxa-predictor combinations, the "mean" and "median" methods offer a faster alternative.
NOTE: The "mean" and "median" methods do not produce realistic results and so are currently included for demonstrative purposes only.
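As a rough illustration of the distance-based methods, the sketch below computes a scaled Euclidean distance between one row of predictor values and the mean niche position of each taxon. The toy data, the three-variable layout, and the choice of scaling (dividing each variable by its standard deviation across taxa) are assumptions made for illustration; the structure of elements::NicheWidths and the scaling used by env_filter may differ.

```r
# Hypothetical mean niche positions for three taxa (not real data).
niche_means <- data.frame(
  taxon_code = c("taxon_a", "taxon_b", "taxon_c"),
  L = c(7, 5, 3),
  M = c(4, 6, 8),
  N = c(2, 5, 7)
)

# One row of predictor values, as in the 'predictors' argument.
predictors <- data.frame(L = 6, M = 5, N = 3)

vars <- c("L", "M", "N")

# Assumed scaling: each variable's standard deviation across taxa.
scales <- vapply(niche_means[vars], stats::sd, numeric(1))

# Difference between each taxon's niche position and the predictor values.
diffs <- sweep(as.matrix(niche_means[vars]), 2, unlist(predictors[vars]), "-")
scaled_diffs <- sweep(diffs, 2, scales, "/")

result <- data.frame(
  taxon_code = niche_means$taxon_code,
  Distance = sqrt(rowSums(scaled_diffs^2))
)

# Smaller distances indicate a closer match; rank 1 is the closest taxon.
result <- result[order(result$Distance), ]
result$rank <- seq_len(nrow(result))
result <- result[, c("taxon_code", "rank", "Distance")]
print(result)
```

Variables listed in 'exclude' would simply be dropped from `vars` before the distance is computed.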
elements::startup()
elements::env_filter(
  predictors = elements::ExampleScenarios[1, ],
  taxa = elements::TaxonomicBackbone$taxon_code,
  method = "svm",
  threshold = 0.5
)
#> Error in elements::predict_occ(taxa_codes = taxa, predictors = predictors, limit = limit, holdopt = exclude, append_predictors = FALSE): All model variables (L, M, N, R, S, SD, GP, bio05, bio06, bio16, bio17) must either be present in the predictors data frame or passed to holdopt.
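The error message above states that every model variable must either be present in the predictors data frame or be excluded. One way to satisfy this, sketched here with a hypothetical predictors data frame, is to work out which model variables are missing and pass them to the 'exclude' argument. The env_filter call is left commented out because it has not been tested here and requires the elements package and its models.

```r
# Model variables as listed in the error message and the 'predictors' argument.
model_vars <- c("L", "M", "N", "R", "S", "SD", "GP",
                "bio05", "bio06", "bio16", "bio17")

# Hypothetical predictors containing only a subset of the model variables.
predictors <- data.frame(L = 6, M = 5, N = 3, R = 6, S = 0)

# Variables absent from 'predictors' need to be excluded explicitly.
missing_vars <- setdiff(model_vars, names(predictors))
print(missing_vars)

# Untested sketch of the corresponding call (requires the elements package):
# elements::startup()
# elements::env_filter(
#   predictors = predictors,
#   taxa = elements::TaxonomicBackbone$taxon_code,
#   method = "svm",
#   exclude = missing_vars
# )
```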