A Bayesian multivariate factor analysis model for evaluating an intervention by using observational time series data on multiple outcomes

Summary A problem that is frequently encountered in many areas of scientific research is that of estimating the effect of a non-randomized binary intervention on an outcome of interest by using time series data on units that received the intervention (‘treated’) and units that did not (‘controls’). One popular estimation method in this setting is based on the factor analysis (FA) model. The FA model is fitted to the preintervention outcome data on treated units and all the outcome data on control units, and the counterfactual treatment-free post-intervention outcomes of the former are predicted from the fitted model. Intervention effects are estimated as the observed outcomes minus these predicted counterfactual outcomes. We propose a model that extends the FA model for estimating intervention effects by jointly modelling the multiple outcomes to exploit shared variability, and assuming an auto-regressive structure on factors to account for temporal correlations in the outcome. Using simulation studies, we show that the method proposed can improve the precision of the intervention effect estimates and achieve better control of the type I error rate (compared with the FA model), especially when either the number of preintervention measurements or the number of control units is small. We apply our method to estimate the effect of stricter alcohol licensing policies on alcohol-related harms.

Algorithm 1: Hybrid Gibbs sampler for the multivariate factor analysis model with an AR(1) prior on the factors and a multiplicative Gamma process shrinkage prior on the loadings.

A.1 Update of regression coefficients
Let P be the total number of covariates. For each outcome k and time t, let X tk denote the n × P matrix with rows x it . We have that: π (β k | rest) ∝ N β k ; 0, 10 3 I T t=1 N (y tk ; Γ k s tk + Λf tk + X tk β k , Ψ k ) whereỹ tk = y tk − Γ k s tk − Λf tk . Hence, we draw β k from a multivariate normal distribution with variance-covariance J k = T t=1 X tk Ψ −1 k X tk + 10 −3 I −1 and mean h k = J k T t=1 X tk Ψ −1 kỹ tk . For the remainder of this section, we will omit the covariates in order to ease the notation. However, the updates that we describe can be used for the MVFA model with covariates. In particular, they can be used as shown by replacing y itk with y itk − x it β k .

A.2 Update of factors
For each outcome k, we define F k as the T × (p 1 + p 2 ) matrix with elements (1) Let f r tk be the t-th row of F k (t = 1, . . . , T ) and f c jk be the j-th column of F k (j = 1, . . . , p). For each j, we have assumed that f c jk arises from an AR(1) process with persistent parameter ρ jk . Therefore, for each j we a priori have that f c jk | ρ jk ∼ N T 0, Q −1 jk where 0 is a T -vector of zeroes and see e.g. Kastner and Frühwirth-Schnatter (2014).

A.3 Update of persistent parameters
Let f tjk be the j-th factor at time t and outcome k, where j = 1, . . . , p = p 1 + p 2 and f tjk = s tjk for j > p 1 . The initial values f 0jk can be drawn from their full conditional N (ρ jk f 1jk , 1) distribution 2 Rue (2001). (Kastner and Frühwirth-Schnatter, 2014). As noted by Kastner and Frühwirth-Schnatter (2014), the full conditional of the persistent parameter ρ jk is and accept this value with probability min 1, I ρ * jk ∈ (−1, 1)

A.4 Update of loadings
For each unit i we defineλ ik = γ ik , λ i , where γ ik and λ i are the i-th rows of Γ k and Λ, have used superscripts 0 and 1 to denote the MGPS shrinkage parameters of the outcome-specific and shared loadings, respectively. Note that in Φ ik we have divided the MGPS parameters of the shared loadings by K because their prior contributes to all K outcomes. For each outcome k, we have that where F k was defined is Section A.2 and Σ ik = ψ 2 ik I. We first write w ik = w 1 ik , w 2 ik and a p 1 × p 2 matrix and G 3 ik is a p 2 × p 2 matrix. Letλ i = (γ i1 , . . . , γ iK , λ i ). Equation 5 implies that that is,λ i a posteriori has a normal distribution with variance-covariance matrix G −1 i and mean We use Algorithm 2 to simulate from π(λ i | rest) and update the loadings.

A.5 Update of shrinkage parameters
As shown by Bhattacharya and Dunson (2011), the full conditional distributions of the MGPS prior parameters are available in closed form. Here, we list them for outcome-specific loadings. In particular, for each unit i, factor j ≤ p 1 and outcome k, we have that Moreover, for each outcome k we have that for factor j = 1 and for factors j, 2 ≤ j ≤ p 1 , that where ω j = t=1,t =j δ t . Updates for shared loadings are analogous and therefore are not shown.

A.6 Update of variances
The inverse Gamma prior is conjugate for the variance terms ψ 2 ik . In particular, for each unit i and outcome k, we have that

B Supplementary material for simulation studies
In this section, we provide supplementary results for the simulation study of Section 3. Table 1 presents bias, standard deviation, mean credible interval width and false positive rate for (k, t) = (2, T 1 + 1) and (k, t) = (3, T ). Figure 1 shows power for (k, t) = (2, T 1 + 1). Finally, Figures 1 shows power for (k, T ) = (3, T ).  I  II  III  IV  V  VI  VII  VIII  IX  T 1  40  40  40  20  20  20  10  10  10  n 1  30  15  5  30  15  5  30  15  5 Results for k = 2 and t =     Table 1: Results of the simulation study for (k, t) = (2, T 1 +1) and (k, t) = (3, T ). The table presents the bias of the point estimates of ϑ tk , the standard error of the point estimates, the mean width of the 95% credible intervals and the false positive rate. All results are based on 10000 simulated datasets from the MVFA+AR model. 8