Reference-Based Multiple Imputation for Ordinal/Binary Response
remiod(formula, data, trtvar, refcats = NULL, family = NULL,
method = "MAR", delta = 0, algorithm = c("tang_seq", "jags"),
rinv = 1e-04, scheme = 2, model_order = NULL, models = NULL,
ord_cov_dummy = TRUE, n.chains = 2, n.adapt = 100, n.iter = 1000,
thin = 2, start = NULL, end = NULL, seed = 1234,
exclude_chains = NULL, subset = NULL, include = FALSE, mess = TRUE,
warn = FALSE, progress.bar = TRUE, ...)
formula
a two-sided model formula (see formula) or a list of such formulas
(more details below).
data
a data.frame containing the original data (more details below).
trtvar
the name of the treatment variable. When necessary, its reference category,
i.e. the control arm, can be set in the refcats argument.
refcats
optional; either one of "first", "last", or "largest" (which sets the
reference category for all categorical variables) or a named list
specifying which category should be used as reference category for each
categorical variable. Options are the category label, the category number,
or one of "first" (the first category), "last" (the last category) or
"largest" (the category with the most observations).
Default is "first". If reference categories are specified for only a subset
of the categorical variables, the default is used for the remaining
variables. (See also set_refcat.)
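The keywords accepted by refcats can be illustrated with a small base R sketch. The helper pick_refcat() below is hypothetical and not part of remiod; it only mimics how "first", "last", "largest", a category number, or a category label might be resolved to a factor level. The variable names in the final named list are illustrative as well.

```r
# Hypothetical helper (not a remiod function): resolve a reference-category
# specification to one level of a categorical variable.
pick_refcat <- function(x, ref = "first") {
  x <- as.factor(x)
  lv <- levels(x)
  if (is.numeric(ref)) return(lv[ref])              # category number
  if (ref == "first") return(lv[1])                 # first level
  if (ref == "last") return(lv[length(lv)])         # last level
  if (ref == "largest") return(names(which.max(table(x))))  # most frequent
  stopifnot(ref %in% lv)                            # otherwise: a category label
  ref
}

arm <- factor(c("placebo", "low", "high", "high", "placebo", "high"),
              levels = c("placebo", "low", "high"))

pick_refcat(arm, "first")    # "placebo"
pick_refcat(arm, "last")     # "high"
pick_refcat(arm, "largest")  # "high" (3 of 6 observations)
pick_refcat(arm, 2)          # "low"

# Shape of a per-variable specification (variable names are illustrative):
refcats_spec <- list(tx = "first", site = "largest")
```

Variables not named in such a list would fall back to the default, as described above.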
family
only for glm_imp and glmm_imp/glmer_imp: a description of the distribution
and link function to be used in the model. This can be a character string
naming a family function, a family function, or the result of a call to a
family function. (For more details see below and family.)
method
the method used to obtain the multiply imputed datasets. Options are "MAR"
(missing at random), "J2R" (jump to reference), "CR" (copy reference), and
"delta" (delta adjustment). Default is "MAR".
delta
the value used for the delta adjustment; applies only when method = "delta".
algorithm
either "tang_seq", the algorithm proposed by Tang (2018), or "jags", the
original method inherited from JAGS (Plummer 2003).
rinv
a small number used to adjust the Fisher information matrix.
scheme
the distribution used to propose coefficients of the imputation models.
scheme = 1: beta ~ N(mean + inv(I) * score, inv(I));
scheme = 2: beta ~ N(mean, inv(I)).
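The two proposal schemes differ only in where the normal distribution is centred. The base R sketch below is not the package's implementation: the function propose_beta() and its argument names are hypothetical, and it assumes that rinv is added to the diagonal of the information matrix before inversion, which the description above does not confirm.

```r
# Hypothetical sketch of the two proposal schemes (not remiod internals).
propose_beta <- function(mu, score, info, scheme = 2, rinv = 1e-04) {
  p <- length(mu)
  # Assumption: rinv acts as a small ridge on the information matrix.
  V <- solve(info + rinv * diag(p))
  # scheme 1 shifts the centre by inv(I) %*% score; scheme 2 uses mu as is.
  centre <- if (scheme == 1) mu + drop(V %*% score) else mu
  # Draw from N(centre, V): chol() returns R with t(R) %*% R = V.
  R <- chol(V)
  centre + drop(t(R) %*% rnorm(p))
}

mu    <- c(0, 0)
score <- c(1, -1)
info  <- matrix(c(4, 1, 1, 3), nrow = 2)

set.seed(1); b1 <- propose_beta(mu, score, info, scheme = 1)
set.seed(1); b2 <- propose_beta(mu, score, info, scheme = 2)

# With the same random draw, the two proposals differ by inv(I) %*% score.
b1 - b2
```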
model_order
optional; manually specifies the order of the imputation models.
models
optional; named vector specifying the types of models for (incomplete)
covariates. This argument replaces the argument meth used in earlier
versions. If NULL (default), models are determined automatically based on
the class of the respective columns of data.
ord_cov_dummy
optional; specifies whether ordinal variables are treated as categorical
or as continuous variables when they are included as covariates in the
sequential imputation models. Default is TRUE, in which case dummy
variables are created accordingly.
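The difference between the two codings can be seen with base R alone. The sketch below does not show how remiod builds its design matrices internally; it only contrasts dummy coding of an ordinal covariate with treating it as a numeric score. The variable sev is made up for illustration.

```r
# Illustrative ordinal covariate (made-up data).
sev <- factor(c("mild", "moderate", "severe", "moderate", "mild"),
              levels = c("mild", "moderate", "severe"), ordered = TRUE)

# Categorical treatment: one dummy column per non-reference category.
X_dummy <- model.matrix(~ factor(sev, ordered = FALSE))

# Continuous treatment: a single column holding the category number.
X_num <- model.matrix(~ as.integer(sev))

dim(X_dummy)  # 5 rows, 3 columns (intercept + 2 dummies)
dim(X_num)    # 5 rows, 2 columns (intercept + 1 score)
```

Dummy coding spends one coefficient per additional category, while the numeric coding assumes equal spacing between adjacent categories.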
n.chains
number of MCMC chains
n.adapt
number of iterations for adaptation of the MCMC samplers (see adapt)
n.iter
number of iterations of the MCMC chain (after adaptation; see coda.samples)
thin
thinning interval (integer; see window.mcmc). For example, thin = 1 keeps
the MCMC samples from all iterations, while thin = 5 keeps only every 5th
iteration. Default is 2.
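The effect of thin on the number of retained samples per chain can be sketched in base R. The helper kept_iterations() is hypothetical; the exact iteration numbers that JAGS retains are not reproduced here, only the count.

```r
# Hypothetical helper: indices retained when every `thin`-th iteration is kept.
kept_iterations <- function(n.iter, thin) seq(from = 1, to = n.iter, by = thin)

length(kept_iterations(1000, thin = 1))  # 1000: every iteration
length(kept_iterations(1000, thin = 2))  # 500: every 2nd iteration
length(kept_iterations(1000, thin = 5))  # 200: every 5th iteration
```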
start
the first iteration of interest (see window.mcmc)
end
the last iteration of interest (see window.mcmc)
seed
optional; seed value (for reproducibility)
exclude_chains
optional; vector of the index numbers of chains that should be excluded
subset
subset of parameters/variables/nodes (columns in the MCMC sample). Follows
the same principle as the arguments monitor_params and selected_parms.
include
logical; if TRUE, the raw data are included in the imputed datasets with
imputation ID = 0. Default is FALSE.
mess
logical; should messages be given? Default is TRUE.
warn
logical; should warnings be given? Default is FALSE.
progress.bar
character string specifying the type of progress bar. Possible values are
"text" (default), "gui", and "none" (see update). Note: when sampling is
performed in parallel it is not possible to display a progress bar.
...
additional, optional arguments:
trunc
named list specifying limits of truncation for the distribution of the
named incomplete variables (see the vignette ModelSpecification)
hyperpars
list of hyper-parameters, as obtained by default_hyperpars()
scale_vars
named vector of (continuous) variables that will be centred and scaled
(such that mean = 0 and sd = 1) when they enter a linear predictor to
improve convergence of the MCMC sampling. Default is that all numeric
variables and integer variables with >20 different values will be scaled.
If set to FALSE no scaling will be done.
custom
named list of JAGS model chunks (character strings) that replace the model
for the given variable.
append_data_list
list that will be appended to the list containing the data that is passed
to rjags (data_list). This may be necessary if additional data / variables
are needed for custom (covariate) models.
quiet
logical; if TRUE then messages generated by rjags during compilation as
well as the progress bar for the adaptive phase will be suppressed
(see jags.model)
keep_scaled_mcmc
should the "original" MCMC sample (i.e., the scaled version returned by
coda.samples()) be kept? (The MCMC sample that is re-scaled to the scale of
the data is always kept.)
modelname
character string specifying the name of the model file (including the ending, either .R or .txt). If unspecified a random name will be generated.
modeldir
directory containing the model file or directory in which the model file should be written. If unspecified a temporary directory will be created.
overwrite
logical; whether an existing model file with the specified
<modeldir>/<modelname> should be overwritten. If set to FALSE and a model
already exists, that model will be used. If unspecified (NULL) and a file
exists, the user is asked for input on how to proceed.
keep_model
logical; whether the created JAGS model file should be saved (TRUE) or
removed (FALSE; default) when the sampling has finished.
A list containing (1) information from the JAGS model and the MCMC samples
and (2) a data.frame in which the original data (if include = TRUE) and the
imputed datasets are stacked onto each other.
The variable Imputation_ indexes the imputation, while .rownr links the
rows to the rows of the original data. In cross-sectional datasets the
variable .id is added as subject identifier.
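The stacked layout described above can be mocked up in base R to show how the indexing columns might be used. The data frame below is made up and is not output from remiod(); which element of the returned list holds the data.frame is not stated here, so the sketch starts from the data frame itself.

```r
# Mock stacked data (made-up values): Imputation_ = 0 holds the raw data,
# as it would when include = TRUE.
stacked <- rbind(
  data.frame(Imputation_ = 0, .rownr = 1:3, y = c(1, NA, 3)),
  data.frame(Imputation_ = 1, .rownr = 1:3, y = c(1, 2, 3)),
  data.frame(Imputation_ = 2, .rownr = 1:3, y = c(1, 3, 3))
)

# Drop the raw data and split into one data frame per imputation.
imputed <- stacked[stacked$Imputation_ > 0, ]
by_imp  <- split(imputed, imputed$Imputation_)

length(by_imp)                             # number of imputed datasets
sapply(by_imp, function(d) mean(d$y))      # per-imputation summary

# .rownr maps each imputed row back to its row in the original data.
imputed[imputed$.rownr == 2, ]
```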
# \donttest{
library(remiod)
data(schizow)
# Deliberately short MCMC run to keep the example fast.
test <- remiod(formula = y6 ~ tx + y0 + y1 + y3, data = schizow,
               trtvar = "tx", algorithm = "jags", method = "MAR",
               ord_cov_dummy = FALSE, n.adapt = 10, n.chains = 1,
               n.iter = 10, thin = 2, warn = FALSE, seed = 1234)
#> NOTE: Stopping adaptation
#>
#>
# }