Fit jsdm models in Stan
stan_jsdm.Rd
This function fits joint Species Distribution models in Stan, using either a
generalised linear latent variable model approach (method = "gllvm"
), or a
multivariate generalised linear mixed model approach (method = "mglmm"
).
Usage
stan_jsdm(X, ...)
# Default S3 method
stan_jsdm(
X = NULL,
Y = NULL,
species_intercept = TRUE,
method,
dat_list = NULL,
family,
site_intercept = "none",
D = NULL,
prior = jsdm_prior(),
site_groups = NULL,
beta_param = "unstruct",
Ntrials = NULL,
zi_param = "constant",
zi_X = NULL,
save_data = TRUE,
iter = 4000,
log_lik = TRUE,
...
)
# S3 method for class 'formula'
stan_jsdm(formula, data = list(), ...)
Arguments
- X
The covariates matrix, with rows being site and columns being covariates. Ignored in favour of data when formula approach is used to specify model.
- ...
Arguments passed to
rstan::sampling()
- Y
Matrix of species by sites. Rows are assumed to be sites, columns are assumed to be species
- species_intercept
Whether the model should be fit with an intercept by species, by default
TRUE
- method
Whether to fit a GLLVM or MGLMM model, details in description
- dat_list
Alternatively, data can be given to the model as a list containing Y, X, N, S, K, and site_intercept. See output of
jsdm_sim_data()
for an example of how this can be formatted.- family
is the response family, must be one of
"gaussian"
,"neg_binomial"
,"poisson"
,"binomial"
,"bernoulli"
, or"zi_poisson"
. Regular expression matching is supported.- site_intercept
Whether a site intercept should be included, potential values
"none"
(no site intercept),"grouped"
(a site intercept with hierarchical grouping) or"ungrouped"
(site intercept with no grouping)- D
The number of latent variables within a GLLVM model
- prior
Set of prior specifications from call to
jsdm_prior()
- site_groups
If the site intercept is grouped, a vector of group identities per site
- beta_param
The parameterisation of the environmental covariate effects, by default
"unstruct"
. See details for further information.- Ntrials
For the binomial distribution the number of trials, given as either a single integer which is assumed to be constant across sites or as a site-length vector of integers.
- zi_param
For the zero-inflated families, whether the zero-inflation parameter is a species-specific constant (default,
"constant"
), or varies by environmental covariates ("covariate"
).- zi_X
If
zi = "covariate"
, the matrix of environmental predictors that the zero-inflation is modelled in response to. If there is not already an intercept column (identified by all values being equal to one), one will be added to the front of the matrix.- save_data
If the data used to fit the model should be saved in the model object, by default TRUE.
- iter
A positive integer specifying the number of iterations for each chain, default 4000.
- log_lik
Whether the log likelihood should be calculated in the generated quantities (by default TRUE), required for loo
- formula
The formula of covariates that the species means are modelled from
- data
Dataframe or list of covariates.
Value
A jsdmStanFit
object, comprising a list including the StanFit
object, the data used to fit the model plus a few other bits of information. See
jsdmStanFit for details.
Details
Environmental covariate effects ("betas"
) can be parameterised in two
ways. With the "cor"
parameterisation all covariate effects are assumed
to be constrained by a correlation matrix between the covariates. With the
"unstruct"
parameterisation all covariate effects are assumed to draw
from a simple distribution with no correlation structure. Both parameterisations
can be modified using the prior object.
Families supported are the Gaussian family, the negative binomial family,
the Poisson family, the binomial family (with number of trials specificied
using the Ntrials
parameter), the Bernoulli family (the special case
of the binomial family where number of trials is equal to one), the
zero-inflated Poisson and the zero-inflated negative binomial. For both
zero-inflated families the zero-inflation is assumed to be a species-specific
constant.
Methods (by class)
stan_jsdm(default)
: this is the default way of doing thingsstan_jsdm(formula)
: Formula interface
Examples
if (FALSE) { # \dontrun{
# MGLMM - specified by using the mglmm aliases and with direct reference to Y and
# X matrices:
mglmm_data <- mglmm_sim_data(
N = 100, S = 10, K = 3,
family = "gaussian"
)
mglmm_fit <- stan_mglmm(
Y = mglmm_data$Y, X = mglmm_data$X,
family = "gaussian"
)
mglmm_fit
# You can also run a model by supplying the data as a list:
gllvm_data <- jsdm_sim_data(
method = "gllvm", N = 100, S = 6, D = 2,
family = "bernoulli"
)
gllvm_fit <- stan_jsdm(
dat_list = gllvm_data, method = "gllvm",
family = "bernoulli"
)
gllvm_fit
} # }