Theoretical background
Tang [2018] proposed a sequential regression approach based on a factorization of the joint distribution of the longitudinal outcomes: \[\label{eq:factorize} f(y_{i1},...,y_{iJ}) \quad = \quad \prod_{j=1}^{s_i} f(y_{ij}|\mathbf{z}_{ij},\mathbf{\theta}_j) \prod_{j=s_i+1}^{J} g(y_{ij}|\mathbf{z}_{ij},\mathbf{\theta}_j, \tilde{\mathbf{\theta}}_j),\] where \(\mathbf{y}_i = (y_{i1},..., y_{iJ})'\) denotes the outcomes of participant \(i\) \((i=1,...,n)\) at \(J\) post-baseline visits \((j=1,...,J)\), \(\mathbf{z}_{ij} = (\mathbf{x}'_i, y_{i1}, ..., y_{i,j-1})'\) collects the covariates and outcome history, and \(\mathbf{z}_{i1} = \mathbf{x}_i = (x_{i1},..., x_{iP})'\) are the covariates for participant \(i\). We set \(x_{i1}= 1\) for the intercept and \(x_{iP} = g_i\) for the treatment status (\(g_i=1\) for treated and \(g_i=0\) for control/placebo in a two-arm trial). In general, \(\mathbf{y}_i\) is partially observed. Let \(s_i\) denote the dropout pattern, defined by the time of the last observation: \(s_i = 0\) for participants with no post-baseline assessment and \(s_i = J\) for completers. Thus, \(f(y_{ij}|\mathbf{z}_{ij},\mathbf{\theta}_j)\) and \(g(y_{ij}|\mathbf{z}_{ij},\mathbf{\theta}_j, \tilde{\mathbf{\theta}}_j)\) are the distributions of the observed and missing outcomes, respectively, conditional on the history.
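As a minimal illustration of the factorization above (a Python sketch, not the package's R/JAGS code; the function names are ours), the joint probability of a fully observed binary sequence can be computed as the product of visit-wise conditional probabilities, each conditioning on the covariates and the outcome history:

```python
import math

def expit(eta):
    """Inverse logit: exp(eta) / (1 + exp(eta))."""
    return 1.0 / (1.0 + math.exp(-eta))

def joint_prob(y, x, alphas, betas):
    """Joint probability of a fully observed binary sequence y = (y_1,...,y_J)
    as the product of visit-wise conditionals f(y_j | x, y_1,...,y_{j-1}),
    mirroring the sequential factorization.
    alphas[j]: covariate coefficients for visit j+1;
    betas[j]:  coefficients of the outcome history for visit j+1."""
    prob = 1.0
    for j, yj in enumerate(y):
        eta = sum(a * xi for a, xi in zip(alphas[j], x))
        eta += sum(b * yt for b, yt in zip(betas[j], y[:j]))
        p1 = expit(eta)  # Pr(y_j = 1 | covariates, history)
        prob *= p1 if yj == 1 else (1.0 - p1)
    return prob
```

Summing `joint_prob` over all possible sequences returns 1, confirming the factorization defines a proper joint distribution.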
In the sequential regression approach, the proportional odds model is used for ordinal outcomes with \(K\) levels: \[\label{eq:clm} Pr(y_{ij} \leq k|\mathbf{z}_{ij}, \mathbf{\theta}_j) \quad = \quad expit\left(c_{j_k} + \mathbf{\alpha}'_j\mathbf{x}_i + \sum_{t=1}^{j-1}\beta_{jt}y_{it} \right),\] for \(k=1,...,K-1\), where \(\mathbf{\alpha}_j = (\alpha_{j1},...,\alpha_{jP})'\), \(c_{j_1} < ... <c_{j_{K-1}}\), and \(expit(\eta) = \exp(\eta)/[1+\exp(\eta)]\). To ensure \(c_{j_k} > c_{j_{k-1}}\), the reparameterization \(d_{j_k} = \log(c_{j_k}-c_{j_{k-1}})\), \(k=2,...,K-1\), is applied, and normal priors are assigned to \(c_{j_1}\) and the \(d_{j_k}\): \[c_{j_1}, d_{j_2}, ...,d_{j_{K-1}} \sim N(\mu_c, \sigma_c^2),\]
\[c_{j_k} = c_{j_{k-1}} + \exp(d_{j_k}), \quad k=2,...,K-1.\]
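The reparameterization can be sketched as follows (illustrative Python, with our own function name): the ordered cutpoints are rebuilt from the first cutpoint by accumulating exponentiated increments, so any real-valued draws of the \(d\) parameters yield a strictly increasing sequence:

```python
import math

def cutpoints(c1, d):
    """Reconstruct ordered cutpoints c_1 < c_2 < ... < c_{K-1} from the
    unconstrained parameters: c1 is the first cutpoint, d is the list of
    log-increments. Each step adds exp(d) > 0, guaranteeing monotonicity."""
    c = [c1]
    for dk in d:
        c.append(c[-1] + math.exp(dk))
    return c
```

Because each increment is strictly positive, normal priors can be placed on the unconstrained parameters without risking a violation of the ordering constraint.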
For a binary endpoint, the model reduces to a logistic regression: \[\label{eq:bin} Pr(y_{ij} = 1|\mathbf{z}_{ij}, \mathbf{\theta}_j) \quad = \quad expit\left(\mathbf{\alpha}'_j\mathbf{x}_i + \sum_{t=1}^{j-1}\beta_{jt}y_{it} \right).\]
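Under the proportional odds model, individual category probabilities follow by differencing the cumulative probabilities. A minimal sketch (illustrative Python; names are ours):

```python
import math

def expit(eta):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-eta))

def category_probs(cutpoints, eta):
    """Category probabilities Pr(y = k), k = 1..K, implied by the
    proportional odds model Pr(y <= k) = expit(c_k + eta), where eta
    collects the covariate and outcome-history terms and cutpoints
    holds the ordered c_1 < ... < c_{K-1}."""
    cum = [expit(c + eta) for c in cutpoints] + [1.0]  # Pr(y <= K) = 1
    return [cum[0]] + [cum[k] - cum[k - 1] for k in range(1, len(cum))]
```

The ordering of the cutpoints guarantees the cumulative probabilities are increasing, so every category probability is positive and they sum to one.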
Multiple imputation (MI) under the missing-at-random (MAR) assumption
Imputation under the MAR assumption is implemented through two algorithmic backends:
JAGS: a Gibbs sampler within a fully Bayesian framework. Under MAR, \(g(y_{ij}|\mathbf{z}_{ij},\mathbf{\theta}_j, \tilde{\mathbf{\theta}}_j)\) is identical to \(f(y_{ij}|\mathbf{z}_{ij},\mathbf{\theta}_j)\). Markov chain Monte Carlo (MCMC) with data augmentation (DA) for missing values is realized through JAGS. Prior distributions must be specified for all (hyper)parameters. A common prior choice for the regression coefficients is a normal distribution with mean zero and large variance. In remiod, variance parameters are, by default, assigned inverse-gamma priors.
Algorithm proposed by Tang [2018]: The algorithm distinguishes intermittent missing data from missingness after dropout. Intermittent missing data are imputed first so that the missingness pattern becomes monotone. Tang [2018] proposed a Metropolis-Hastings sampler for the monotone data augmentation (MDA) step. Once the MDA algorithm converges, the missing data after dropout can be imputed sequentially given the draw of the model parameters and the imputed intermittent missing data.
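The sequential post-dropout imputation step can be sketched as follows (a simplified Python illustration for a binary outcome, assuming a single parameter draw; it is not the package's MDA implementation). Each imputed value joins the history used by the model for the next visit:

```python
import math
import random

def expit(eta):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-eta))

def impute_after_dropout(y_obs, s, x, alphas, betas, rng):
    """Sequentially impute binary outcomes at visits j = s+1,...,J given one
    draw of the model parameters. Each imputed value enters the history used
    for the next visit, as in the sequential regression factorization.
    y_obs: observed outcomes up to dropout; s: dropout pattern;
    alphas[j], betas[j]: coefficients for visit j+1."""
    y = list(y_obs[:s])
    J = len(alphas)
    for j in range(s, J):
        eta = sum(a * xi for a, xi in zip(alphas[j], x))
        eta += sum(b * yt for b, yt in zip(betas[j], y))
        y.append(1 if rng.random() < expit(eta) else 0)  # draw y_j | history
    return y
```

In the full algorithm this step is repeated for each retained MCMC draw, producing one completed dataset per imputation.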
Controlled MI under missing-not-at-random (MNAR)
(1) Delta-adjusted pattern mixture model (PMM)
Let \(\delta\) be the sensitivity parameter. The adjustment is placed on the parameter corresponding to the treatment variable \(g_i\): \[\label{eq:clmdelta} Pr(y_{ij} \leq k|\mathbf{z}_{ij}, \mathbf{\theta}_j) \quad = \quad expit\left[c_{j_k} + \sum_{p=1}^{P-1} \alpha_{jp} x_{ip} + (\alpha_{jP} + \delta) g_i + \sum_{t=1}^{j-1}\beta_{jt}y_{it} \right], \forall j > s_i.\] The equation implies MAR in the control group, while for treated dropouts the log odds of being in a better (or worse) health state after dropout are reduced relative to those who remain in the trial (\(\delta < 0\) if lower scores on the \(y_{ij}\) indicate better health; otherwise \(\delta > 0\)). The sensitivity parameter \(\delta\) cannot be estimated from the observed data. Thus, a tipping point analysis is often used: we assume \(\delta\) is known and repeat the MI inference over a sequence of increasingly severe \(\delta\) values to find the tipping point, i.e., the \(\delta\) at which the treatment effect becomes insignificant. The MAR-based analysis is said to be robust if the tipping point is large and deemed clinically implausible.
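The effect of the adjustment on the imputation distribution can be illustrated as follows (a Python sketch with our own function name; it computes one cumulative probability, not the full tipping point analysis). Note that \(\delta\) only shifts the linear predictor for treated participants, since it multiplies \(g_i\):

```python
import math

def expit(eta):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-eta))

def cum_prob_delta(c_k, x, alpha, beta, y_hist, delta):
    """Pr(y_j <= k) for a post-dropout visit under the delta-adjusted model:
    delta is added to the treatment coefficient (last entry of alpha), so the
    imputation distribution shifts away from MAR for treated dropouts only.
    x[-1] is the treatment indicator g_i."""
    eta = c_k + sum(a * xi for a, xi in zip(alpha, x))
    eta += delta * x[-1]  # adjustment (alpha_P + delta) * g_i
    eta += sum(b * yt for b, yt in zip(beta, y_hist))
    return expit(eta)
```

Repeating the MI analysis with this adjusted model over a grid of \(\delta\) values, and recording where significance is lost, yields the tipping point.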
(2) Copy reference PMM
The copy reference (CR) procedure can be presented as follows: \[\label{eq:clmcr} Pr(y_{ij} \leq k|\mathbf{z}_{ij}, \mathbf{\theta}_j) \quad = \quad expit\left[c_{j_k} + \sum_{p=1}^{P-1} \alpha_{jp} x_{ip} + \sum_{t=1}^{j-1}\beta_{jt}y_{it} \right], \forall j > s_i.\] Clearly, under CR, treated participants are assumed to gain no benefit even though the treatment was taken, and thus have the same mean response profiles as the reference (i.e., control) participants both before and after dropout.
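A minimal sketch of the CR imputation model (illustrative Python, our own function name): for visits after dropout the treatment term \(\alpha_{jP} g_i\) is dropped, so a treated dropout is imputed exactly as a control participant with the same covariates and history would be:

```python
import math

def expit(eta):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-eta))

def cum_prob_cr(c_k, x, alpha, beta, y_hist, after_dropout):
    """Pr(y_j <= k) under copy reference: after dropout the treatment term
    alpha_P * g_i is omitted, so treated dropouts are imputed from the
    reference (control) model; before dropout the full MAR model applies.
    x[-1] is the treatment indicator g_i."""
    coef = alpha[:-1] if after_dropout else alpha
    covs = x[:-1] if after_dropout else x
    eta = c_k + sum(a * xi for a, xi in zip(coef, covs))
    eta += sum(b * yt for b, yt in zip(beta, y_hist))
    return expit(eta)
```

For a treated dropout, this yields the same cumulative probability as the MAR model applied to a control participant with identical covariates and history, which is exactly the CR assumption.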
(3) Jump to reference PMM
Jump to reference (J2R) assumes that all treatment benefits disappear immediately after participants discontinue the experimental treatment. To impute missing values after dropout, the unconditional treatment effect at visit \(j\) \((j>s_i)\), unadjusted for \((y_{i1}, ..., y_{i,j-1})\), needs to be computed [Tang 2018].