Reference-Based Multiple Imputation for Ordinal/Binary Response

remiod(formula, data, trtvar, refcats = NULL, family = NULL,
  method = "MAR", delta = 0, algorithm = c("tang_seq", "jags"),
  rinv = 1e-04, scheme = 2, model_order = NULL, models = NULL,
  ord_cov_dummy = TRUE, n.chains = 2, n.adapt = 100, n.iter = 1000,
  thin = 2, start = NULL, end = NULL, seed = 1234,
  exclude_chains = NULL, subset = NULL, include = FALSE, mess = TRUE,
  warn = FALSE, progress.bar = TRUE, ...)

Arguments

formula

a two-sided model formula (see formula) or a list of such formulas (more details below).

data

a data.frame containing the original data (more details below)

trtvar

the name of the treatment variable. When necessary, its reference category (i.e. the control arm) can be set with the refcats argument.

refcats

optional; either one of "first", "last", "largest" (which sets the category for all categorical variables) or a named list specifying which category should be used as reference category per categorical variable. Options are the category label, the category number, or one of "first" (the first category), "last" (the last category) or "largest" (chooses the category with the most observations). Default is "first". If reference categories are specified for a subset of the categorical variables the default will be used for the remaining variables. (See also set_refcat)
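The two accepted forms of refcats can be sketched as follows. This is a minimal illustration, not a recommendation: the choice of "last" for tx is arbitrary, and the remiod() call is wrapped in if (FALSE) because it needs the remiod package and a JAGS installation.

```r
# Form 1: one rule applied to every categorical variable.
refcats_all <- "largest"

# Form 2: a named list; variables not listed fall back to the default.
# "last" is an arbitrary illustrative choice for the treatment variable.
refcats_some <- list(tx = "last")

# Not run here: requires remiod and JAGS.
if (FALSE) {
  library(remiod)
  data(schizow)
  fit <- remiod(formula = y6 ~ tx + y0 + y1 + y3, data = schizow,
                trtvar = "tx", refcats = refcats_some,
                algorithm = "jags", method = "MAR",
                n.adapt = 10, n.chains = 1, n.iter = 10, thin = 2)
}
```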

family

only for glm_imp and glmm_imp/glmer_imp: a description of the distribution and link function to be used in the model. This can be a character string naming a family function, a family function or the result of a call to a family function. (For more details see below and family.)

method

the method used to obtain multiply imputed datasets. Options include "MAR", "J2R", "CR", and "delta" (delta adjustment). Default is "MAR".

delta

the value used for delta adjustment; applicable only when method = "delta". Default is 0.
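A sketch of how method and delta might be combined, reusing the formula and data from the Examples section. The value delta = 0.5 is arbitrary and purely illustrative, and the calls are wrapped in if (FALSE) because they need the remiod package and a JAGS installation.

```r
# Arbitrary illustrative shift; a real analysis would justify this value.
delta_value <- 0.5

# Not run here: requires remiod and JAGS.
if (FALSE) {
  library(remiod)
  data(schizow)

  # Reference-based imputation with method = "J2R".
  fit_j2r <- remiod(formula = y6 ~ tx + y0 + y1 + y3, data = schizow,
                    trtvar = "tx", algorithm = "jags", method = "J2R",
                    n.adapt = 10, n.chains = 1, n.iter = 10, thin = 2)

  # Delta adjustment: delta is only used when method = "delta".
  fit_delta <- remiod(formula = y6 ~ tx + y0 + y1 + y3, data = schizow,
                      trtvar = "tx", algorithm = "jags", method = "delta",
                      delta = delta_value,
                      n.adapt = 10, n.chains = 1, n.iter = 10, thin = 2)
}
```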

algorithm

either "tang_seq", the algorithm proposed by Tang (2018), or "jags", the original method inherited from JAGS (Plummer 2003).

rinv

a small number used to adjust the Fisher information matrix. Default is 1e-04.

scheme

the distribution used to propose coefficients of the imputation models. scheme=1: beta ~ N( mean + inv(I)*score, inv(I)); scheme=2: beta ~ N( mean , inv(I)).
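The two proposal schemes can be sketched in base R as below. The helper propose_beta() is hypothetical and not part of remiod; in particular, adding rinv to the diagonal of the information matrix is an assumption about how the adjustment works, made only so that the sketch is concrete.

```r
# Hypothetical helper (not a remiod function): draw one coefficient proposal.
propose_beta <- function(mean, score, info, scheme = 2, rinv = 1e-04) {
  # Assumption: rinv is added to the diagonal to keep `info` invertible.
  info_adj <- info + diag(rinv, nrow(info))
  cov <- solve(info_adj)  # inv(I)

  # scheme 1 shifts the centre by inv(I) %*% score; scheme 2 uses the mean as is.
  centre <- if (scheme == 1) mean + drop(cov %*% score) else mean

  # Multivariate normal draw via the Cholesky factor of the covariance.
  z <- rnorm(length(mean))
  drop(centre + t(chol(cov)) %*% z)
}

set.seed(1234)
info <- matrix(c(4, 1, 1, 3), nrow = 2)  # toy information matrix
beta1 <- propose_beta(c(0, 0), score = c(0.5, -0.2), info = info, scheme = 1)
beta2 <- propose_beta(c(0, 0), score = c(0.5, -0.2), info = info, scheme = 2)
```

Both schemes share the covariance inv(I); they differ only in where the proposal is centred.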

model_order

optional; manually specifies the order of the imputation models.

models

optional; named vector specifying the types of models for (incomplete) covariates. This argument replaces the argument meth used in earlier versions. If NULL (default), models will be determined automatically based on the class of the respective columns of data.

ord_cov_dummy

optional; specifies whether ordinal variables should be treated as categorical or as continuous variables when they are included as covariates in the sequential imputation models. Default is TRUE, in which case dummy variables are created for them.
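The difference between the two treatments of an ordinal covariate can be illustrated in base R. This is only a sketch of the general idea with a made-up variable; how remiod builds its design matrices internally is not shown here and may differ.

```r
# Made-up ordinal covariate with three ordered levels.
severity <- factor(c("mild", "moderate", "severe", "moderate"),
                   levels = c("mild", "moderate", "severe"), ordered = TRUE)

# Categorical treatment: intercept plus one dummy column per non-reference level.
# The factor is made unordered so that treatment (dummy) contrasts are used.
dummy <- model.matrix(~ factor(severity, ordered = FALSE))

# Continuous treatment: a single numeric column holding the level codes.
numeric_code <- as.integer(severity)
```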

n.chains

number of MCMC chains

n.adapt

number of iterations for adaptation of the MCMC samplers (see adapt)

n.iter

number of iterations of the MCMC chain (after adaptation; see coda.samples)

thin

thinning interval (integer; see window.mcmc). For example, thin = 1 keeps the MCMC samples from all iterations; thin = 5 keeps only every 5th iteration. Default is 2.
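As a rough guide, the number of retained draws can be worked out from n.iter, thin and n.chains. The arithmetic below assumes that every thin-th post-adaptation iteration is kept in each chain and uses the default values from the usage above.

```r
n.iter   <- 1000
thin     <- 2
n.chains <- 2

# Draws kept per chain, assuming every thin-th iteration is retained.
kept_per_chain <- n.iter %/% thin

# Draws kept across all chains.
kept_total <- kept_per_chain * n.chains
```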

start

the first iteration of interest (see window.mcmc)

end

the last iteration of interest (see window.mcmc)

seed

optional; seed value (for reproducibility)

exclude_chains

optional vector of the index numbers of chains that should be excluded

subset

subset of parameters/variables/nodes (columns in the MCMC sample). Follows the same principle as the argument monitor_params and selected_parms.

include

logical, if TRUE, raw data will be included in imputed data sets with imputation ID = 0.

mess

logical; should messages be given? Default is TRUE.

warn

logical; should warnings be given? Default is FALSE.

progress.bar

character string specifying the type of progress bar. Possible values are "text" (default), "gui", and "none" (see update). Note: when sampling is performed in parallel it is not possible to display a progress bar.

...

additional, optional arguments

trunc: named list specifying limits of truncation for the distribution of the named incomplete variables (see the vignette ModelSpecification)
hyperpars: list of hyper-parameters, as obtained by default_hyperpars()
scale_vars: named vector of (continuous) variables that will be centred and scaled (such that mean = 0 and sd = 1) when they enter a linear predictor to improve convergence of the MCMC sampling. Default is that all numeric variables and integer variables with >20 different values will be scaled. If set to FALSE no scaling will be done.
custom: named list of JAGS model chunks (character strings) that replace the model for the given variable.
append_data_list: list that will be appended to the list containing the data that is passed to rjags (data_list). This may be necessary if additional data / variables are needed for custom (covariate) models.
quiet: logical; if TRUE then messages generated by rjags during compilation as well as the progress bar for the adaptive phase will be suppressed, (see jags.model)
keep_scaled_mcmc: should the "original" MCMC sample (i.e., the scaled version returned by coda.samples()) be kept? (The MCMC sample that is re-scaled to the scale of the data is always kept.)
modelname: character string specifying the name of the model file (including the ending, either .R or .txt). If unspecified a random name will be generated.
modeldir: directory containing the model file or directory in which the model file should be written. If unspecified a temporary directory will be created.
overwrite: logical; whether an existing model file with the specified <modeldir>/<modelname> should be overwritten. If set to FALSE and a model already exists, that model will be used. If unspecified (NULL) and a file exists, the user is asked for input on how to proceed.
keep_model: logical; whether the created JAGS model file should be saved (TRUE) or removed from disk (FALSE; default) when the sampling has finished.

Value

A list that includes (1) information from the JAGS model and the MCMC samples and (2) a data.frame in which the original data (if include = TRUE) and the imputed datasets are stacked onto each other.
The variable Imputation_ indexes the imputation, while .rownr links the rows to the rows of the original data. In cross-sectional datasets the variable .id is added as subject identifier.
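The stacked layout can be worked with using base R. The data.frame below is a made-up stand-in with the documented index columns; the values and the y6 column are illustrative only, and how the data.frame is extracted from the returned list is not shown.

```r
# Toy stand-in for the stacked data.frame; values are made up.
stacked <- data.frame(
  Imputation_ = rep(0:2, each = 3),   # 0 = original data (include = TRUE)
  .rownr      = rep(1:3, times = 3),  # row index in the original data
  y6          = c(NA, 2, 3,  1, 2, 3,  2, 2, 3)
)

# Drop the original data and split into one data.frame per imputation.
imputed <- stacked[stacked$Imputation_ > 0, ]
by_imp  <- split(imputed, imputed$Imputation_)

# Example per-imputation summary, to be pooled afterwards.
means <- sapply(by_imp, function(d) mean(d$y6))
```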

Examples


# \donttest{
data(schizow)

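# The very small n.adapt and n.iter only keep the example fast;
# they are far too small for a real analysis.
test = remiod(formula = y6 ~ tx + y0 + y1 + y3, data = schizow,
              trtvar = 'tx', algorithm = 'jags', method="MAR",
              ord_cov_dummy = FALSE, n.adapt = 10, n.chains = 1,
              n.iter = 10, thin = 2, warn = FALSE, seed = 1234)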
#> NOTE: Stopping adaptation
#> 
#> 
# }