State space models for non-stationary intermittently coupled systems: an application to the North Atlantic Oscillation

We develop Bayesian state space methods for modelling changes to the mean level or temporal correlation structure of an observed time series due to intermittent coupling with an unobserved process. Novel intervention methods are proposed to model the effect of repeated coupling as a single dynamic process. Latent time-varying autoregressive components are developed to model changes in the temporal correlation structure. Efficient filtering and smoothing methods are derived for the resulting class of models. We propose methods for quantifying the component of variance attributable to an unobserved process, the effect during individual coupling events, and the potential for skilful forecasts. The proposed methodology is applied to the study of winter-time variability in the dominant pattern of climate variation in the northern hemisphere, the North Atlantic Oscillation. Around 70% of the inter-annual variance in the winter (Dec-Jan-Feb) mean level is attributable to an unobserved process. Skilful forecasts for the winter (Dec-Jan-Feb) mean are possible from the beginning of December.


Introduction
Intermittently coupled systems can be found in many areas of both the natural and the social sciences. We define an intermittently coupled system as a system which can be modelled by two or more component processes which interact only at certain times. For example, many climate processes are active only during certain times of year; and sea ice and snow cover change the interaction between the surface and the atmosphere (Chapin et al., 2010; Bourassa et al., 2013). Migrating birds and animals mix only at certain times of year, allowing transmission of disease between populations (Olsen et al., 2006; Altizer et al., 2011). Empirical models have been applied to forecasting intermittent demand in production economics and operational research (Croston, 1972; Shenstone and Hyndman, 2005). Interest will often focus on one component of the system, whereas the others may be impossible or impractical to observe or even to identify physically. However, physical reasoning or prior knowledge may support the existence of such components and provide information about their behaviour and their effect on the component of interest. We refer to these secondary processes as intermittently coupled components, and the times at which the processes interact as coupling events. By incorporating this information through careful statistical modelling we can separate the effect of intermittently coupled components from the underlying behaviour of the observed system.
The methodology that is developed in this study was motivated by the problem of diagnosing unusual persistence in the dominant mode of climate variability in the northern hemisphere, known as the North Atlantic oscillation (NAO). Because of its effect on European climate, the ability to forecast the NAO is currently a topic of great interest for the development of new climate prediction services (Siegert et al., 2016). A daily time series of NAO observations is shown in Fig. 1. There is a clear annual cycle in the observations. Fig. 1 also indicates that the NAO exhibits greater interannual variability in the extended winter season (December-March) than the extended summer season (April-November). At the same time, the auto-correlation function indicates greater persistence of day-to-day conditions between December and March than between April and November. Increased persistence implies increased predictability. The seasonal contrast in interannual variability and auto-correlation that is visible in Fig. 1 could be caused by a transient shift in the mean, or a change in auto-correlation structure between December and March. Climate scientists typically fit separate models to different seasons (e.g. Keeley et al. (2009) and Franzke and Woollings (2011)). This approach makes it difficult to diagnose whether the apparent change in auto-correlation is the cause of the increased interannual variability, or a symptom of it.
In this study we propose a flexible class of models that are capable of separating variability due to unobservable intermittent components from long-term variability in the observed process itself, accumulated short-term variability and observation errors. We develop tools for diagnosing whether the intermittent component acts on the mean or the auto-correlation structure of the observed system. If we can learn the state of the intermittent component sufficiently quickly, then it should be possible to make skilful predictions about the remainder of a particular coupled period. Alternatively, the effect of the intermittent component may be similar between coupling events. In that case, it should be possible to make predictions about subsequent coupled periods.
State space models, which are also known as structural time series models, provide a flexible class of models for non-stationary time series (Durbin and Koopman, 2012). By modelling the system in terms of physically meaningful components we can incorporate expert knowledge to help to separate the effects of intermittent components from long-term variability elsewhere in the system. There is an extensive literature on modelling non-stationarity in the mean by state space methods, particularly where the observed process depends linearly on the state parameters and the observation and evolution processes are both normally distributed (Harvey, 1989; West and Harrison, 1997; Durbin and Koopman, 2012). Time varying auto-regressive (TVAR) models generalize classical auto-regressive models to have time varying coefficients, thus capturing changes in the auto-correlation structure (Subba Rao, 1970; Kitagawa and Gersch, 1985; West, 1997, 2010). In Section 3.1, we propose a class of models containing latent TVAR components that capture changes in short-term temporal dependence while maintaining the interpretability of the mean and unobserved intermittent effects.
Smooth changes in the mean or the temporal dependence structure can be captured by simple random-walk priors on their respective state variables. Rapid changes, such as those that might be expected due to intermittent coupling, often require explicit interventions in the model (Box and Tiao, 1975). Intervention methods were extended to state space models by Harvey and Durbin (1986). Standard intervention approaches (e.g. Harvey (1989), chapter 7.6, West and Harrison (1997), chapter 11, and Durbin and Koopman (2012), chapter 3.2.5) require the introduction of separate intervention and effect variables for each event. The effect is usually assumed to be constant throughout a particular event and independent between events. In the case of intermittent coupling, the underlying cause of each event will usually be the same, although the effect may vary. In Section 3.2, we model the effect of intermittent coupling as a single dynamic process, intermittently identifiable through a series of interventions that determine the timing and duration of the coupling events.
The construction of the NAO time series that is shown in Fig. 1 and analysed in Section 6 is described in Section 2. Following the methodological developments that were outlined above, we discuss efficient posterior inference for the resulting class of models in Section 5. Section 6 contains the results of our study of the NAO. Section 7 concludes with a discussion.
The data that are analysed in the paper and the programs that were used to analyse them can be obtained from https://rss.onlinelibrary.wiley.com/hub/journal/14679876/seriesc-datasets

The North Atlantic oscillation
The NAO is the name that is given to the difference in surface pressure between the Azores high and the Icelandic low (Walker, 1924). The NAO is important because it affects the strength of the prevailing westerly winds and the position of the storm track, strongly influencing the winter climate of the UK and Europe (Hurrell, 1995). The NAO varies on timescales from a few days to several decades (Hurrell, 1995; Kushnir et al., 2006). Statistical studies have hinted at the potential to predict the NAO on seasonal timescales (Keeley et al., 2009; Franzke and Woollings, 2011). This potential predictability is often attributed to forcing by slowly varying components of the climate system, including sea surface temperatures, the stratosphere and snow cover (Kushnir et al., 2006). Climate models have recently begun to show significant skill in forecasting the winter NAO one season ahead. However, the physical mechanisms behind the predictability remain unclear, and the size of the predictable signal appears to be underestimated by the models (Eade et al., 2014). Careful statistical modelling may lead to additional insights. If a predictable signal can be extracted from the observations, then it may be possible to identify the source of the forcing effect.
Following Mosedale et al. (2006), we construct a simple NAO index as the area-weighted sea level pressure difference between two boxes, one stretching from 20°N to 55°N, the other from 50°N to 90°N, both spanning from 90°W to 60°E, using pressure data from the ERA-Interim data set (Dee et al., 2011). The resulting daily time series, shown in Fig. 1, spans the period from January 1st, 1979, to December 31st, 2017: a total of T = 14 245 observations.
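As a concrete illustration, the box-averaging step above can be sketched as follows. The variable names (`slp`, `lats`, `lons`) and the cosine-latitude area weighting are assumptions about how a gridded sea level pressure field might be laid out, not details taken from the paper.

```python
import numpy as np

def box_mean(slp, lats, lons, lat_range, lon_range):
    """Area-weighted mean of a (time, lat, lon) field over a lat/lon box.

    Weights are proportional to cos(latitude), a standard approximation
    to grid-cell area on a regular latitude/longitude grid.
    """
    lat_mask = (lats >= lat_range[0]) & (lats <= lat_range[1])
    lon_mask = (lons >= lon_range[0]) & (lons <= lon_range[1])
    box = slp[:, lat_mask][:, :, lon_mask]
    w = np.cos(np.deg2rad(lats[lat_mask]))[:, None] * np.ones(lon_mask.sum())
    return (box * w).sum(axis=(1, 2)) / w.sum()

def nao_index(slp, lats, lons):
    """Daily NAO index: southern-box minus northern-box mean SLP (hPa)."""
    south = box_mean(slp, lats, lons, (20, 55), (-90, 60))
    north = box_mean(slp, lats, lons, (50, 90), (-90, 60))
    return south - north
```

Note that the two boxes deliberately overlap between 50°N and 55°N, following the box definitions given above.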

Modelling intermittently coupled systems
In complex systems such as the Earth system, it is reasonable to consider that all components of the system (e.g. mean, seasonality and temporal dependence) may evolve slowly over time. We begin by outlining a general model to capture gradual changes in the underlying components of the observed process. We then propose explicit intervention models to represent rapid transient changes due to intermittent coupling.

Latent time varying auto-regressive component models
Classical auto-regressive models require that we redefine the mean of the observed process, if the mean is non-zero. This makes it difficult to specify physically meaningful models for the time evolution of the mean and the effect of intermittently coupled components. Latent auto-regressive components remove the need to redefine the mean level of the observed time series (Harvey (1989), chapter 2). To allow for possible changes in the mean, seasonal and auto-correlation structure of an observed process, we propose the following latent TVAR component model with observation equation

Y_t = μ_t + Σ_{k=1}^K ψ_kt + X_t + v_t,   v_t ∼ N(0, V),   (1)

and evolution equations

μ_t = μ_{t−1} + β_{t−1} + w_μt,   (2a)
β_t = β_{t−1} + w_βt,   (2b)
ψ_kt = ψ_{k,t−1} cos kω + ψ*_{k,t−1} sin kω + w_{ψ_k t},   k = 1, …, K,   (2c)
ψ*_kt = −ψ_{k,t−1} sin kω + ψ*_{k,t−1} cos kω + w_{ψ*_k t},   k = 1, …, K,   (2d)
X_t = Σ_{p=1}^P φ_pt X_{t−p} + w_Xt,   (2e)
φ_pt = φ_{p,t−1} + w_{φ_p t},   p = 1, …, P,   (2f)

where ω = 2π/365.25. The observed process Y_t is modelled as the sum of mean, seasonal and auto-regressive components. The variable μ_t represents the mean level of the observed process. Any local-in-time systematic trend is captured by the variable β_t. The harmonic components ψ_kt and ψ*_kt (k = 1, …, K) represent seasonal behaviour. The local trend and seasonal variables are assumed to be time varying, evolving according to independent normal evolution processes w_μt, w_βt, w_{ψ_k t} and w_{ψ*_k t} (k = 1, …, K). The irregular component X_t represents short-term variability in the observed process and is modelled as a latent TVAR process of order P with normal evolution process w_Xt. The auto-regressive coefficients φ_pt are assumed to be time varying, evolving according to independent normal evolution processes w_{φ_p t} (p = 1, …, P). The independent residual v_t represents observation or measurement error.
In the case of the NAO, the variance W_Xt of the evolution process w_Xt is expected to vary systematically with the solar cycle and is modelled as

W_Xt = W_X + a cos ωt + b sin ωt.   (3)

The other evolution and error variances W_μ, W_β, W_ψ, W_φ and V are assumed constant over time. Model (2) is intended to capture gradual changes in the structure of the observed process. Therefore, the evolution variances W_μ, W_β, W_ψ and W_φ are expected to be small, in particular W_μ, W_β, W_ψ, W_φ ≪ W_Xt. The evolution and error variances are assumed unknown and must be inferred from the data. The variance parameters W_X, a and b of the irregular component in model (3) must also be inferred from the data. Expert judgement about the scale of the evolution variances can be incorporated through appropriate prior probability distributions. The model defined by expressions (1) and (2) is quite general and could be applied to a range of climate, economic or environmental time series. Examination of the sample periodogram for our NAO index showed clear evidence of 6- and 12-month cycles, suggesting a model with K = 2 harmonic components. Examination of the sample auto-correlation and partial auto-correlation functions suggests a latent TVAR process with P = 5 coefficients (after removing a linear trend, and constant annual and semiannual cycles estimated by least squares).
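To make the model structure concrete, the following sketch forward-simulates the latent TVAR component model of expressions (1)-(3). All parameter values here are illustrative assumptions, not estimates from the paper; in the paper the variances are inferred from the data.

```python
import numpy as np

def simulate_ltvar(T, K=2, P=5, V=1.0, W_mu=1e-6, W_beta=1e-8,
                   W_psi=1e-6, W_phi=1e-8, W_X=1.0, a=0.3, b=0.0,
                   rng=None):
    """Forward-simulate the latent TVAR component model (illustrative sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    omega = 2 * np.pi / 365.25
    mu, beta = 0.0, 0.0
    psi, psi_star = np.zeros(K), np.zeros(K)
    psi[0] = 5.0                                  # initial annual-cycle amplitude
    phi = np.array([0.5, 0.2, 0.1, 0.05, 0.02])[:P]
    X = np.zeros(P)                               # last P values of the TVAR process
    y = np.empty(T)
    for t in range(T):
        mu = mu + beta + rng.normal(0, np.sqrt(W_mu))       # local level (2a)
        beta = beta + rng.normal(0, np.sqrt(W_beta))        # local trend (2b)
        for k in range(K):                                  # rotating harmonics (2c)-(2d)
            c, s = np.cos((k + 1) * omega), np.sin((k + 1) * omega)
            p_new = c * psi[k] + s * psi_star[k] + rng.normal(0, np.sqrt(W_psi))
            q_new = -s * psi[k] + c * psi_star[k] + rng.normal(0, np.sqrt(W_psi))
            psi[k], psi_star[k] = p_new, q_new
        phi = phi + rng.normal(0, np.sqrt(W_phi), P)        # TVAR coefficients (2f)
        W_Xt = max(W_X + a * np.cos(omega * t) + b * np.sin(omega * t), 1e-12)
        x_new = phi @ X + rng.normal(0, np.sqrt(W_Xt))      # TVAR process (2e), (3)
        X = np.roll(X, 1)
        X[0] = x_new
        y[t] = mu + psi.sum() + x_new + rng.normal(0, np.sqrt(V))  # observation (1)
    return y
```

Simulations of this kind were used in Section 4.2 to calibrate priors for the evolution variances; the default values above are merely plausible orders of magnitude.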

Intervention methods for intermittent coupling
The change in the auto-correlation structure of the NAO index in Fig. 1 appears to involve two distinct states, i.e. coupled or not. We model the change from the uncoupled to the coupled state by introducing an intervention variable

λ_t = 1 if t ∈ τ,   λ_t = 0 otherwise,

where τ is the set of times t where the observed system is believed to be coupled to the unobserved process, e.g. τ = {December, January, February, March}. We assume that the timing and duration of the coupling events is constant between events but not known precisely. We model the intervention λ_t by introducing two hyperparameters α and γ representing the start and duration of the coupled period τ respectively (Fig. 2). In practice, we do not expect an instantaneous change in the behaviour of the system. Therefore, we linearly taper the intervention λ_t from 0 to 1 over a period γ_1 at the start of the coupled period and from 1 to 0 over a period γ_2 at the end of the coupled period. In the absence of stronger beliefs, we assume that the tapering is symmetric (i.e. γ_1 = γ_2) and accounts for a proportion ρ = (γ_1 + γ_2)/γ of the duration γ. The hyperparameters α, γ and ρ are assumed to be unknown and must be inferred from the data. We consider two alternative models for the effect of intermittent coupling: first, coupling may lead to a transient change in the mean of the observed process; second, coupling may lead to a transient change in the temporal dependence structure of the observed process. If coupling is believed to induce a change in the mean, then the forecast equation (1) is modified to include the intervention as follows:

Y_t = μ_t + Σ_{k=1}^K ψ_kt + X_t + λ_t δ_t + v_t.   (4)

The effect δ_t is modelled as

δ_t = ϕ δ_{t−1} + w_δt,   w_δt ∼ N(0, W_δ).   (5)

We interpret the effect δ_t as a change in the mean level, since we expect the effect variance to be small, i.e. W_δ ≪ W_Xt.
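A minimal sketch of the tapered intervention λ_t follows, assuming the coupled period is parameterized by day of year; the wrap-around handling at the year end and the symmetric split γ_1 = γ_2 = ργ/2 follow the description above.

```python
import numpy as np

def intervention(doy, alpha, gamma, rho):
    """Tapered coupling intervention λ_t as a function of day of year.

    alpha: start day of the coupled period, gamma: duration (days),
    rho: fraction of gamma spent tapering, split equally between the
    ramp-up (γ₁) and ramp-down (γ₂). Handles coupled periods that
    wrap around the year end (e.g. November to March).
    """
    g1 = 0.5 * rho * gamma                  # ramp length γ₁ = γ₂ = ργ/2
    s = (doy - alpha) % 365.25              # days since the start of coupling
    lam = np.zeros_like(np.asarray(doy, dtype=float))
    lam = np.where(s < g1, s / g1, lam)                              # ramp up 0 → 1
    lam = np.where((s >= g1) & (s <= gamma - g1), 1.0, lam)          # plateau at 1
    lam = np.where((s > gamma - g1) & (s < gamma), (gamma - s) / g1, lam)  # ramp down 1 → 0
    return lam
```

For example, `intervention(doy, alpha=305, gamma=180, rho=0.5)` gives a coupled period starting around November 1st with total length 180 days, matching the prior expectations described in Section 4.2.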
However, when λ_t > 0, the day-to-day variability of the observed process Y_t will increase slightly, in addition to any systematic change that is captured by W_Xt in expression (3). If coupling is believed to induce a change in the auto-correlation structure, then we modify the forecast equation (1) again,

Y_t = μ_t + Σ_{k=1}^K ψ_kt + Σ_{p=1}^P (φ_pt + λ_t δ_pt) X_{t−p} + w_Xt + v_t,   (6)

and we define P effects δ_pt, modelled as

δ_pt = ϕ δ_{p,t−1} + w_{δ_p t},   w_{δ_p t} ∼ N(0, W_δ),   p = 1, …, P,   (7)

with common hyperparameters ϕ and W_δ. Most of our prior knowledge about coupling events is likely to be about their timing and will be expressed through priors on the hyperparameters α, γ and ρ. Therefore, it is difficult to justify a complex form for the effects δ_t or δ_pt. However, a variety of behaviours can be captured depending on the values of the coefficient ϕ and variance W_δ.
As noted in the previous section, the mean, trend, seasonal and auto-regressive parameters are expected to vary only slowly. Therefore, we can learn their states outside the coupled period and identify the coupling effects δ_t or δ_pt (p = 1, …, P) when λ_t > 0. The form and parameterization of the coupling intervention λ_t in Fig. 2 reflect our physical intuition about the likely influence of an unobserved process on the NAO. For other applications, different forms might be appropriate, e.g. no tapering, non-symmetric tapering or non-linear tapering. We recommend keeping 0 ≤ λ_t ≤ 1, to make the coupling effect easily interpretable. The only other restriction is that the intervention should be transient, not permanent. Permanent changes can be modelled in the same way, but the effects should be fixed to be identifiable, i.e. ϕ = 1 and W_δ = 0.

State space form and prior assessment
The model that was proposed in Section 3.1 can be written in state space form as

Y_t = f(θ_t, v_t),
θ_t = g(θ_{t−1}, w_t).

The forecast function f(θ_t, v_t) is given by equation (1). The evolution function g(θ_{t−1}, w_t) is given by expression (2). The evolution covariance matrix W is diagonal with main diagonal W_t = (W_μ, W_β, W_ψ, W_ψ, …, W_ψ, W_ψ, W_Xt, W_φ, …, W_φ)′. The coupling effect δ_t or effects δ_pt (p = 1, …, P) can be appended to the state vector θ_t. The evolution process vector w_t and covariance matrix W can also be extended to include the coupling effect evolution process w_δt or processes w_{δ_p t} (p = 1, …, P) and variance W_δ respectively.
The prior distribution θ_0 ∼ N(m_0, Σ_0) specifies our beliefs about the state variables at time t = 0. We also need to specify priors for the collection of hyperparameters Φ = (V, W_μ, W_β, W_φ, W_X, a, b, α, γ, ρ, ϕ, W_δ).

Priors for the state variables
Independent normal priors were assigned to each component of the state vector θ at time t = 0. The prior means and variances are listed in Table 1. We use previous studies of the NAO to define informative priors for the mean μ_0 (Hsu and Wallace, 1976), the seasonal components ψ_{1,0}, ψ*_{1,0}, ψ_{2,0} and ψ*_{2,0} (Chen et al., 2012) and the TVAR coefficients φ_{1,0}, …, φ_{5,0} (Masala, 2015). The prior on the local trend β_0 is based on our judgement that the NAO mean is very unlikely to experience a local change equivalent to more than 1 hPa year⁻¹. The daily NAO in Fig. 1 has a range of approximately 40 hPa. Therefore, the TVAR residuals X_{−4}, …, X_0 were assigned independent normal priors with mean 0 hPa and standard deviation 10 hPa, based on a range of four standard deviations.

Priors on the hyperparameters
The prior distributions that were assigned to the hyperparameters V, W_μ, W_β, W_φ, W_X, a and b are listed in Table 2. In the case of the NAO, the variability in the mean and seasonal components will be driven primarily by solar forcing; therefore we assume equal evolution variances, i.e. W_ψ = W_μ. The observation and evolution variances V, W_μ, W_β and W_φ are all expected to be very small, but non-zero. Therefore, boundary-avoiding priors were specified in the form of normal distributions on the logarithm of each variance parameter. Simulation studies of the individual components in expression (2) were used to assign priors that reflect the range of variability that we consider plausible for each component. We expect the annual cycle in the day-to-day variance W_Xt to peak during the winter season (December-January-February) with an amplitude of up to 5 hPa². Corresponding uniform priors were assigned to the amplitude and phase of W_Xt and transformed to approximate normal priors for a and b by simulation. Table 3 lists the priors for the intervention parameters α, γ and ρ and the coupling effect parameters ϕ and W_δ. Our beliefs about the timing of the intervention λ_t are the same regardless of whether coupling affects the mean or the auto-correlation structure. Vague triangular priors are specified for the beginning α and duration γ of the coupled period. These suggest a coupled period with total length around 180 days, beginning around November 1st. A mildly informative prior is specified for the tapering parameter ρ to reflect our physical reasoning that the influence of the unobserved process is unlikely to be constant throughout the coupled period. The coupling coefficients ϕ_μ and ϕ_X are expected to be positive and close to but not exceeding 1. The mean coupling effect variance W_{δ_μ} is expected to be greater than the mean evolution variance W_μ, but still small compared with W_Xt. Similarly, the auto-correlation coupling effect variance W_{δ_X} is expected to be greater than the coefficient evolution variance W_φ.

Posterior inference
We want to evaluate the joint posterior of the model components θ_1, …, θ_T and the hyperparameters Φ:

Pr(θ_1:T, Φ | Y_1:T) = Pr(θ_1:T | Φ, Y_1:T) Pr(Φ | Y_1:T).

If both f(θ_t, v_t) and g(θ_{t−1}, w_t) were linear functions, then, conditionally on Φ, we could sample from the marginal posterior of the state variables Pr(θ_1:T | Φ, Y_1:T) by using the well-known forward filtering-backward sampling algorithm (Frühwirth-Schnatter, 1994). However, the evolution function g(θ_{t−1}, w_t) that is defined by expression (2) is non-linear because of the combination of φ_pt and X_{t−p} in expression (2e). The observation equation (6) also contains a non-linear combination of δ_pt and X_{t−p}. Therefore, we use linear approximations of the observation and state equations. This linearization leads to approximate forward filtering-backward sampling recursions, detailed in Appendix A.
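Conditionally on linearity, the forward filtering step is a standard Kalman filter. The following sketch shows the generic linear-Gaussian recursions that underlie the approximation, including the accumulation of the log marginal likelihood used later for hyperparameter inference. It is a textbook filter for a scalar observation, not the exact linearized recursions of Appendix A.

```python
import numpy as np

def forward_filter(y, F, G, V, W, m0, C0):
    """Kalman forward filtering for y_t = F'θ_t + v_t, θ_t = G θ_{t-1} + w_t.

    y: scalar observations; F: observation vector; G: evolution matrix;
    V: observation variance; W: evolution covariance; m0, C0: prior moments.
    Returns filtered means/covariances and log Pr(y_{1:T} | Φ), accumulated
    from the one-step-ahead forecast distributions.
    """
    m, C = m0, C0
    ms, Cs, loglik = [], [], 0.0
    for yt in y:
        a = G @ m                        # prior state mean at time t
        R = G @ C @ G.T + W              # prior state covariance
        f = F @ a                        # one-step-ahead forecast mean
        Q = F @ R @ F + V                # one-step-ahead forecast variance
        e = yt - f                       # forecast error
        loglik += -0.5 * (np.log(2 * np.pi * Q) + e**2 / Q)
        A = R @ F / Q                    # Kalman gain
        m = a + A * e                    # filtered mean
        C = R - np.outer(A, F @ R)       # filtered covariance
        ms.append(m)
        Cs.append(C)
    return np.array(ms), np.array(Cs), float(loglik)
```

Backward sampling then draws θ_T from its filtered distribution and recurses backwards through time conditioning on the sampled future state.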
In general, we expect the TVAR evolution variance W_φ to be small, so the coefficients φ_1t, …, φ_Pt will be only weakly correlated with the other state variables and our uncertainty about the coefficients will decrease rapidly over time. Since the other components of the evolution function g(θ_{t−1}, w_t) are linear, and the observation errors v_t and joint state evolution process w_t are normal, forward filtering and backward sampling based on the linear approximation are expected to be very accurate. A simulation study showed that the linear approximation provides excellent filtering and smoothing performance, even when all components of the model evolve much more rapidly than expected (see the on-line supplementary material). The linear approximation sometimes struggles to distinguish the TVAR coefficients φ_1t, …, φ_Pt from the intervention effects δ_1t, …, δ_Pt in the auto-correlation intervention model when both sets of coefficients evolve rapidly. In the case of the NAO, we expect only slow evolution of the TVAR coefficients, and little or no change in the intervention effects. In this scenario, the linearized approximation performs very well.
The marginal posterior of the hyperparameters is

Pr(Φ | Y_1:T) ∝ Pr(Y_1:T | Φ) Pr(Φ).

The marginal likelihood Pr(Y_1:T | Φ) can be decomposed as

Pr(Y_1:T | Φ) = Π_{t=1}^T Pr(Y_t | Y_1:t−1, Φ).

The forward filtering recursions in Appendix A include an expression for the one-step-ahead forecast distribution Pr(Y_t | Y_1:t−1, Φ), so the marginal likelihood can be evaluated analytically. Therefore, the joint posterior Pr(θ_1:T, Φ | Y_1:T) can be efficiently sampled by combining forward filtering-backward sampling with a Metropolis-Hastings scheme targeting Pr(Φ | Y_1:T) as follows.
(a) Let j denote a sample index; at j = 1:
(i) sample starting values Φ^(1);
(ii) sample θ^(1)_1:T | Φ^(1), Y_1:T by backward sampling.
(b) For j = 2, …, J:
(i) propose new hyperparameters Φ′ from a proposal distribution q(Φ′ | Φ^(j−1));
(ii) set Φ^(j) = Φ′ with the Metropolis-Hastings acceptance probability, evaluated by using the marginal likelihood Pr(Y_1:T | Φ); otherwise set Φ^(j) = Φ^(j−1);
(iii) sample θ^(j)_1:T | Φ^(j), Y_1:T by backward sampling.
In practice, it is not necessary to perform backward sampling for the state θ_1:T for every sample j. As with any Markov chain Monte Carlo approach, there is likely to be significant auto-correlation between successive samples of the hyperparameters Φ^(j). In the interest of saving storage and computation time, it is sufficient to sample the state θ_1:T for a subset of the Φ^(j).
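The hyperparameter step can be sketched generically as follows. The random-walk proposal is an assumption for illustration (the proposal distributions actually used are described in the supplementary material), and `logpost` stands for log Pr(Y_1:T | Φ) + log Pr(Φ), with the marginal likelihood supplied by the forward filtering recursions.

```python
import numpy as np

def metropolis_hastings(logpost, phi0, step, n_samples, rng=None):
    """Random-walk Metropolis sampler targeting Pr(Φ | Y_{1:T}) (sketch).

    logpost: function returning the unnormalized log posterior of Φ;
    phi0: starting values; step: proposal standard deviation.
    """
    rng = np.random.default_rng() if rng is None else rng
    phi = np.asarray(phi0, dtype=float)
    lp = logpost(phi)
    samples = []
    for _ in range(n_samples):
        prop = phi + rng.normal(0.0, step, size=phi.shape)  # symmetric proposal
        lp_prop = logpost(prop)
        if np.log(rng.uniform()) < lp_prop - lp:            # accept/reject step
            phi, lp = prop, lp_prop
        samples.append(phi.copy())
    return np.array(samples)
```

Backward sampling of θ_1:T would then be run only for a thinned subset of the stored Φ^(j), as recommended above.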

Alternative approaches
Conditionally on the TVAR coefficients φ_1t, …, φ_Pt, the model defined by expressions (1) and (2) is a normal dynamic linear model. We could split the state vector θ_t into two parts, θ*_t = (μ_t, β_t, ψ_1t, ψ*_1t, …, ψ_Kt, ψ*_Kt, X_t, …, X_{t−P+1})′ and φ*_t = (φ_1t, …, φ_Pt)′, and then alternate between forward filtering and backward sampling for each part, conditionally on the other. Gibbs sampling steps could be used to sample the hyperparameters Φ (West and Harrison (1997), chapter 15). This approach provides exact sampling from the required posterior distribution but has two drawbacks compared with the approximate approach that is proposed here. First, backward sampling must be performed at every iteration, making this approach computationally expensive. Second, Gibbs sampling based on the full conditional distributions of the hyperparameters will tend to mix very slowly, especially for long time series where the data completely overwhelm the prior.
Particle filtering methods provide tools for inference in general non-linear and non-normal state space models (Doucet and Johansen, 2011). However, particle filters are computationally expensive and suffer from problems of 'particle degeneracy', i.e. the state θ_t will eventually be represented by a single particle at times t ≪ T. Since we are interested in what happened at all times t = 1, …, T, we also require particle smoothing to overcome the degeneracy problem (Godsill et al., 2004; Briers et al., 2010). Particle smoothing is even more computationally expensive, making an alternative approach highly desirable. The problem of efficient inference for unknown hyperparameters also remains an active topic for research in sequential Monte Carlo methods (Chopin et al., 2013).

Model selection
For some applications, it will be possible to choose between the mean and auto-correlation intervention models on the basis of posterior predictive diagnostics, i.e. whether the model reproduces the observed behaviour. The posterior distributions of the hyperparameters Φ can also be useful for choosing between models; for example, is the coupling effect variance W_δ negligible? More formally, we can compare the two intervention models by evaluating the Bayes factor

B = Pr(Y_1:T | M_μ) / Pr(Y_1:T | M_X),   (8)

where M_μ is the model including an intervention on the mean, and M_X is the model including an intervention on the temporal dependence structure. The Bayes factor is defined as the ratio of the marginal likelihoods of the competing models (Kass and Raftery, 1995).

It is useful to define η_t = μ_t + Σ_k ψ_kt, which we refer to as the systematic component of the observed process. The relative contributions to the variability between coupled periods of the systematic component η_t, the irregular component X_t, the coupling effects δ_t or δ_pt (p = 1, …, P) and the observation error v_t are of particular interest. The means of the systematic and irregular components during period τ in the jth sample are

η̄_τ^(j) = (1/n) Σ_{t∈τ} η_t^(j),   X̄_τ^(j) = (1/n) Σ_{t∈τ} X_t^(j),   (9)

where n is the number of time steps in τ. The means of the coupling effects during period τ in the mean and auto-correlation models respectively are

δ̄_τ^(j) = (1/n) Σ_{t∈τ} λ_t δ_t^(j),   δ̄_τ^(j) = (1/n) Σ_{t∈τ} λ_t Σ_{p=1}^P δ_pt^(j) X_{t−p}^(j).   (10)

The contribution due to observation error is

v̄_τ^(j) = (1/n) Σ_{t∈τ} v_t^(j).
The prior expectations of the irregular component X_t and the coupling effects δ_t or δ_pt (p = 1, …, P) during any period τ are 0 by expressions (2e), (5) and (7), i.e. E(X_t) = 0 and E(δ_t) = E(δ_pt) = 0 for all t. In general E(η_t) ≠ 0, so for the systematic component η_t it is more useful to consider the anomalies over all similar periods:

η̄_τ^(j) = (1/n) Σ_{t∈τ} η_t^(j) − (1/|D|) Σ_{t∈D} η_t^(j),   (11)

where D = {t ∈ 1, …, T : d(t) = d(s) for some s ∈ τ} and d(t) is the day of the year at time t. The sample means in expressions (9)-(11) provide a summary of the posterior expected contribution of each component during the period τ. Quantiles can also be computed over the samples to form credible intervals for the contribution of each component. However, since our model is non-stationary, we require an alternative summary of the variance components. In particular, we are interested in the proportion of the inter-annual variance of the winter (December-January-February) mean of the NAO index explained by each component. Let τ_i be the ith winter period. We propose to perform an analysis of variance of the observed means ȳ_τi = (1/N_i) Σ_{t∈τi} y_t for each sample j, using the component means in expressions (9)-(11) as explanatory variables. The analysis of variance leads to four sums of squares for each sample j, corresponding to the sums of squared deviations explained by the systematic component η_t, the irregular component X_t, the coupling effects δ_t or δ_pt (p = 1, …, P) and the observation errors v_t in each sample trajectory. Posterior summaries over the J samples summarize the overall contributions of each component to the variability between coupled periods.
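A simplified sketch of this variance-components summary follows: given the winter means of each sampled component trajectory, it computes each component's share of the total sum of squares. This version ignores cross-terms between components, so the shares need not sum to exactly 1; it is an illustration of the idea, not the paper's exact analysis-of-variance computation.

```python
import numpy as np

def variance_components(ybar, parts):
    """Share of the variance in winter means attributable to each component.

    ybar:  observed winter means, shape (n_winters,).
    parts: dict mapping component names (systematic, irregular, coupling,
           error) to their winter means, each of shape (n_winters,).
    Returns each component's sum of squared deviations about its own mean,
    as a fraction of the total sum of squares of the observed means.
    """
    ybar = np.asarray(ybar, dtype=float)
    total = np.sum((ybar - ybar.mean()) ** 2)
    return {name: float(np.sum((np.asarray(x) - np.mean(x)) ** 2) / total)
            for name, x in parts.items()}
```

Repeating this for every posterior sample j and summarizing over samples gives the posterior distribution of each component's contribution, as described above.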

Can we make predictions using unobserved components?
Knowledge of the unobserved component through the intervention effect δ_t should provide useful predictability within coupled periods. The model that was proposed in Section 3.2 also allows for dependence between successive coupled periods, so knowledge of the unobserved component during one coupled period may also be useful for predicting the next. The k-step-ahead forecast distribution given data up to time t can be sampled exactly by using the recursions in Appendix A. The correlation between the data and the forecast means provides a simple measure of forecast performance.
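For a linear-Gaussian approximation, the k-step-ahead forecast moments can be obtained by iterating the evolution equation forward from the filtered state, as in the following generic sketch (the paper samples the forecast distribution exactly by using the recursions in Appendix A).

```python
import numpy as np

def k_step_forecast(m, C, G, W, F, V, k):
    """k-step-ahead forecast mean and variance of Y_{t+k}.

    m, C: filtered state mean and covariance at time t;
    G, W: evolution matrix and covariance; F: observation vector;
    V: observation variance.
    """
    a, R = m, C
    for _ in range(k):
        a = G @ a                # propagate the state mean one step
        R = G @ R @ G.T + W      # propagate the state covariance one step
    f = F @ a                    # forecast mean of Y_{t+k}
    Q = F @ R @ F + V            # forecast variance of Y_{t+k}
    return f, Q
```

Correlating these forecast means with the subsequently observed values, over many forecast start dates, gives the simple skill measure described above.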

Results
The Metropolis-Hastings sampler that was outlined in Section 5 was used to draw 1000 samples from each of the joint posteriors Pr(θ_1:T, Φ | Y_1:T, M_μ) and Pr(θ_1:T, Φ | Y_1:T, M_X). Full details of the sampling design, proposal distributions, diagnostic trace plots and posterior density plots are given in the on-line supplementary material. Both models converge to stable distributions and mix efficiently; however, the burn-in period can be very long, depending on the initial values of the hyperparameters Φ.
Despite deliberately vague prior distributions, the posterior distributions of the intervention parameters α and γ are quite sharp for both models. Fig. 3 visualizes the posterior distribution of the intervention λ_t for each model. In the mean intervention model M_μ, an unobserved component acts strongly on the NAO between December and February and into March. There is almost no evidence of coupling between May and October. In the auto-correlation intervention model M_X the situation is reversed; this inverted intervention structure is unexpected, and we return to it below. To assess the identifiability of the various model components, particularly the coupling effects, we computed correlation matrices for the states θ^(j)_1:T | Φ^(j), Y_1:T for each sample j. On average across the 1000 sample covariance matrices, the state variables in both models are all uncorrelated with one another. In particular, the mean intervention effect δ_t is almost completely uncorrelated with the irregular component X_t (90% credible interval for the correlation (−0.02, 0.06)), and only ever weakly correlated with the mean component μ_t (90% credible interval (−0.28, 0.20)). Although the auto-correlation intervention effects δ_1t, …, δ_5t are uncorrelated with the other state variables on average, they may be strongly correlated or anticorrelated with the mean μ_t and the auto-correlation coefficients φ_1t, …, φ_5t. Further investigation showed that these strong associations were the result of the slow rate of change of these parameters, since sampling multiple state trajectories θ_1:T from any single sample of the hyperparameters Φ^(j) produced a similar range of sample correlations.
Posterior predictive diagnostics were used to check the performance of each model in capturing the observed structure of the NAO. In particular, we are interested in whether the models can reproduce the seasonal contrast in the inter-annual variance and auto-correlation structures in Fig. 1. In the auto-correlation intervention model, there is a small difference between the extended summer (April-November) and extended winter (December-March) auto-correlation functions, but much less than observed in the data. The inverted intervention structure in Fig. 3 is an attempt to exploit the extended summer (April-November) period to distinguish the small intervention effects δ_p. Similar checks (not shown) suggest that both models can adequately capture the annual cycle in the NAO, indicating that our choice of K = 2 harmonics was reasonable.
The posterior predictive checks strongly favour the mean intervention model over the auto-correlation intervention model: the mean intervention can reproduce the observed behaviour; the auto-correlation intervention cannot. The Bayes factor of B = 1096 also provides extremely strong evidence in favour of the mean intervention model, i.e. the observed data are over 1000 times more likely to have arisen from the mean intervention model. We conclude that the most likely explanation for the observed behaviour of the NAO index is a transient change in the mean level during the extended winter (December-March) season. The remainder of our analysis focuses on interpreting only the mean intervention model. Surprisingly for such a complex phenomenon, the mean, trend and seasonal components of the NAO index show very limited evidence of non-stationarity. Fig. 5 shows some posterior trajectories θ^(j)_1:T from each component. There is evidence of a fairly constant trend leading to a reduction in the mean level of the NAO of around 0.8 hPa between 1979 and 2017. The posterior distribution of the trend itself suggests that the rate of decrease in the NAO mean peaked around 1993-1994 at around 0.03 hPa year−1 (−0.07, 0.01), since when the trend has gradually weakened. The amplitudes of the annual and semi-annual cycles are almost constant (likewise the phases). The 0.95-quantile of the posterior distribution of the mean evolution standard deviation √W_μ is 0.005 hPa, so changes in excess of 0.2 hPa year−1 in the mean and seasonal components are not ruled out under the random-walk hypothesis. There is no evidence of non-stationarity in the auto-regressive coefficients φ_1, ..., φ_5, which are effectively constant throughout the study period. This suggests that the day-to-day variation in the NAO can be adequately represented by an auto-regressive process rather than a TVAR process.
Nevertheless, this is a useful conclusion given the observed seasonal auto-correlation structure in Fig. 1.

Quantifying the effect of coupling
The posterior mean estimate of the intervention effect standard deviation √W_δμ is 0.43 (0.33-0.53), indicating a very active process that contributes substantial additional interannual variability during the extended winter season (December-March). Table 4 contains the results of the analysis of variance that was proposed in Section 5 for the mean intervention model M_μ. The effect of coupling δ_t explains around 66% of the observed variation in the winter (December-January-February) means. Accumulated short-term variability that is captured by the TVAR residuals X_t explains around 33% of the interannual variability. Despite the trend that is visible in Fig. 5, the contribution of the mean and seasonal components is negligible. Together they account for a maximum of 5% of the interannual variability in winter (December-January-February). In contrast, the mean and seasonal components account for around 15% of interannual variability in summer (June-July-August), when coupling has no effect and the day-to-day variability is reduced. The contribution of measurement error is negligible. Fig. 6 shows the posterior mean contribution of each component to each observed winter (December-January-February) mean level. This is an important and useful advance over existing methods in climate science that estimate only the fraction of total variance explained by each component. The weak negative trend in the mean component μ_t is clearly visible. Both the absolute and the relative contributions of the irregular component X_t and the coupling effect δ_t vary between years, but both components usually have the same sign. This is a product of the limited data that are available to estimate the components during each extended winter (December-March).
If the coupling signal cannot be clearly identified during a particular season, then the contribution to the seasonal mean will be split approximately according to the analysis of variance in Table 4 and the two components will have the same sign. The fact that the relative contribution of each component varies widely in Fig. 6 indicates that the model can separate the coupling effect from the noise of the irregular component.
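The variance attribution underlying Table 4 can be sketched as follows, assuming (hypothetically) that each component has already been averaged over the December-January-February window; attributing variance through covariances with the total guarantees that the fractions sum to 1:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical component contributions to 30 winter (Dec-Jan-Feb) means,
# standing in for the posterior mean trajectories of mu_t, delta_t and
# X_t averaged over each winter.  The scales are illustrative.
n_years = 30
mu = 0.1 * rng.standard_normal(n_years)     # mean + seasonal contribution
delta = 0.8 * rng.standard_normal(n_years)  # coupling effect contribution
X = 0.5 * rng.standard_normal(n_years)      # accumulated short-term variability
winter_mean = mu + delta + X

# Attribute interannual variance to each component through its
# covariance with the total winter mean.
total_var = winter_mean.var(ddof=1)
frac = {name: np.cov(comp, winter_mean)[0, 1] / total_var
        for name, comp in [("mean", mu), ("coupling", delta), ("irregular", X)]}
```

This covariance decomposition is one standard way to apportion variance among additive components; the paper's exact analysis-of-variance construction is given in Section 5.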

Forecasting the winter North Atlantic oscillation
The posterior mean estimate of the coupling effect coefficient ϕ_μ is 0.994 (90% credible interval 0.991-0.997). In terms of interannual variability, this is equivalent to a correlation of around 0.19 (0.05-0.38) between December-January-February means, suggesting limited evidence of persistence, and therefore predictability, between seasons. However, if we can learn about the coupling effect sufficiently quickly during a specific coupled period, then we can use that knowledge to provide more skilful forecasts for the rest of the period. Fig. 3 suggests that the system is at least partially coupled from the beginning of November until around the middle of April. Using the forecasting recursions in Appendix A, we obtained forecasts beginning each day from November 1st to February 1st until the end of the fully coupled period on February 28th for every winter between 1987 and 2016. Fig. 7 shows the correlation between the forecast and observed means. By the beginning of December, the correlation approaches 0.5 for the 92-day forecast of the mean NAO to February 28th. This correlation approaches that achieved by computationally expensive numerical weather prediction models (Siegert et al., 2016). The correlation increases slightly as more observations are assimilated during December. However, as more observations are assimilated, the forecast period decreases and we are essentially predicting weather noise, so the correlation does not increase further. Fig. 7 also compares forecasts of the 92-day December-January-February winter mean, initialized on December 1st each year, with the observed mean NAO index for the same periods. The model predicts the 2010, 2011 and 2012 winter seasons with remarkable accuracy and captures the general pattern during the 1990s. However, it fails to predict the extreme winter of 2009-2010. Fig. 8 plots deseasonalized observations (Y_t − E(η_t | Y_1:T)) for winter 2009-2010.
Deseasonalizing the observations leaves only the contributions from the irregular component X_t and the coupling effect δ_t, which represent processes on different timescales. The irregular component X_t captures high-frequency fluctuations, whereas the coupling effect δ_t captures any overall departure from the seasonal mean. From the middle of December onwards, the mean of the deseasonalized data is clearly negative, which the model attributes to the coupling effect δ_t. Since the seasonal forecasts in Fig. 7 were based on information up to November 30th, it is unsurprising that a fairly normal winter was forecast. In contrast, in winter 2010-2011 (Fig. 8), a strong negative signal is visible in November, which the model can exploit to forecast skilfully the remainder of the December-January-February season. Winter 2010-2011 also illustrates the time-varying nature of the coupling effect δ_t, which starts strongly negative early in the season but weakens from mid-January onwards.
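The skill computation behind Fig. 7 can be sketched as follows, using synthetic forecast-observation pairs in place of the model's seasonal mean forecasts; the correlation level is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical forecast/observed Dec-Jan-Feb mean pairs for 30 winters;
# in the real analysis the forecasts come from the recursions in
# Appendix A, initialized on December 1st each year.
n_winters = 30
observed = rng.standard_normal(n_winters)
rho = 0.5  # illustrative skill level, not an estimate
forecast = rho * observed + np.sqrt(1 - rho**2) * rng.standard_normal(n_winters)

# Forecast skill summarized as the correlation between forecast and
# observed seasonal means, as plotted in Fig. 7.
skill = np.corrcoef(forecast, observed)[0, 1]
```

With only 30 verification winters, such correlations carry wide sampling uncertainty, which is why the credible intervals in the text matter.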

Discussion
In this study we have developed Bayesian state space methods for diagnosing predictability in intermittently coupled systems. Coupling is represented by a transient intervention whose timing and duration are inferred from the data. Interventions to either the mean or temporal dependence structure are considered. The effect of intermittent coupling is modelled as a dynamic process rather than a sequence of constant and independent effects. Latent TVAR components are proposed to capture any inherent non-stationarity in the temporal dependence structure. A linearized approximation is proposed that allows efficient forward filtering and backward sampling for models containing latent TVAR components, without requiring complicated and computationally expensive sequential Monte Carlo methods. In addition, we develop tools for posterior inference in intermittently coupled systems, including evaluating the evidence for a coupling effect, attributing historical variation in the system, and demonstrating potential predictability.
We applied the proposed model and inference methods to diagnose excess winter-time variability in the NAO. Existing methods in climate science cannot distinguish between transient changes in the mean or temporal dependence structure. The model strongly points to transient changes in the mean level of the NAO during a period beginning sometime in November and ending around the middle of April. This is an important conclusion given that the excess winter-time variability in the NAO is usually characterized by increased temporal dependence. The mean level of the NAO also appears to change on decadal timescales, in addition to a fairly stable annual cycle and the transient changes in winter time. The model can also separate the coupling effect from accumulated day-to-day variability in individual seasons. For the NAO, the two effects actually oppose each other in some seasons.
Like latent auto-regressive components, latent TVAR components improve the interpretability of structural time series models by avoiding the need to redefine the mean level of the observed process. In addition, latent TVAR components permit efficient recursive estimation of the auto-regressive parameters and include standard latent auto-regressive components as a special case when the evolution variance is 0. For the NAO, we found little evidence of changes in the auto-regressive structure throughout the study period, so a standard latent auto-regressive component could be used to represent day-to-day variability. However, the fact that we can confirm that auto-regressive structure is constant on decadal timescales is also a useful conclusion.
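A minimal sketch of a TVAR(p) process with random-walk coefficients illustrates that the standard AR(p) case is recovered when the coefficient evolution variance is 0; parameter values here are illustrative, not the fitted ones:

```python
import numpy as np

rng = np.random.default_rng(4)

def simulate_tvar(T, phi0, W_phi, W_x):
    """Simulate a TVAR(p) process whose coefficients follow random walks.

    With W_phi = 0 the coefficients never move, and the process reduces
    to a standard latent AR(p) component.
    """
    p = len(phi0)
    phi = np.array(phi0, dtype=float)
    x = np.zeros(T)
    for t in range(p, T):
        phi = phi + np.sqrt(W_phi) * rng.standard_normal(p)  # coefficient walk
        x[t] = x[t - p:t][::-1] @ phi + np.sqrt(W_x) * rng.standard_normal()
    return x

x_ar = simulate_tvar(500, [0.6, 0.2], W_phi=0.0, W_x=1.0)    # plain AR(2)
x_tvar = simulate_tvar(500, [0.6, 0.2], W_phi=1e-5, W_x=1.0) # slowly varying
```

Note that the random walk on the coefficients does not enforce stationarity of the local AR process, so in practice small evolution variances (or informative priors) are needed to keep trajectories well behaved.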
The model proposed for intermittently coupled systems differs from standard intervention analysis by modelling the effect of repeated coupling events as a dynamic process, rather than a series of independent events. This allows knowledge that is gained during one coupled period to inform inferences for the next. By modelling the coupling effect as a dynamic process, the effect can also vary within individual coupled periods rather than being assumed constant. Climate scientists usually assume that any coupling effect is constant throughout an arbitrarily defined season. We have shown that the coupling effect on the NAO can vary substantially throughout a single season.
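The dynamic intervention idea can be sketched as follows: the coupling effect evolves continuously as a damped AR(1) process but only contributes to the observations during the coupled window; the window and parameter values below are illustrative assumptions, not the fitted ones:

```python
import numpy as np

rng = np.random.default_rng(5)

# Illustrative coupled window: day-of-year 334-90, roughly the extended
# winter (December-March).
T = 3 * 365
day = np.arange(T) % 365
coupled = (day >= 334) | (day < 90)

# The coupling effect evolves continuously as a damped AR(1) process,
# so information carries over from one coupled period to the next ...
phi_delta, W_delta = 0.99, 0.05 ** 2
delta = np.zeros(T)
for t in range(1, T):
    delta[t] = phi_delta * delta[t - 1] + np.sqrt(W_delta) * rng.standard_normal()

# ... but it only contributes to the observations while coupling is
# active, mirroring the intervention indicator in the model.
effect = np.where(coupled, delta, 0.0)
```

Because delta evolves even while decoupled, knowledge gained in one winter informs the next, and the effect is free to vary within a single coupled period.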
Modelling the effect of coupling as a dynamic process also makes the model robust to minor variations in the timing and duration of the coupled period. However, the specification of a fixed coupling period remains a limitation. Hidden Markov and semi-Markov models are widely used in similar seasonal state switching scenarios to allow for changes in onset and duration (e.g. Carey-Smith et al. (2014)). Standard hidden Markov models assume instantaneous switching between states. Although such an assumption may be acceptable for some applications, we do not consider it plausible for the NAO. A completely general alternative would be a reversible jump Markov chain Monte Carlo scheme (Green, 1995). In such a scheme, coupling events could be estimated with varying onset, duration or other parameterized structural changes. However, unless the timing of coupling events varies dramatically, the additional cost and complexity of a reversible jump scheme seems unnecessary. The on-line Bayesian change point methods that were proposed by Fearnhead and Liu (2011) might provide a more efficient approach.
In the methodology that was developed here, we have allowed for non-stationarity in the mean and the temporal dependence structure, but not in the variance. Stochastic volatility models and related auto-regressive conditional heteroscedasticity and generalized auto-regressive conditional heteroscedasticity models have been widely studied and applied, particularly in economics. Masala (2015) applied a generalized auto-regressive conditional heteroscedasticity model to stochastic modelling of the NAO but found that its performance was poor. Efficient filtering and smoothing is possible for time-varying observation error variance (West and Harrison (1997), chapter 10.8). However, fully conjugate models that admit analytic filtering and smoothing for time-varying state evolution variances are not possible, even in the linear normal case. Of particular interest are changes in the residual TVAR evolution variance W_Xt that drives short-term variability in the system. Sequential Monte Carlo methods or further approximations are required to model time-varying evolution variances.

Appendix A

A.1. Forward filtering
The evolution function g is linearized about the current estimates, with Jacobians

G_t = ∂g/∂θ evaluated at (θ̂_{t−1}, ŵ_t),  H_t = ∂g/∂w evaluated at (θ̂_{t−1}, ŵ_t).

After observing Y_t, the posterior distribution of the state vector at time t is θ_t | Y_1:t ∼ N(m_t, C_t), with m_t = a_t + A_t e_t and C_t = R_t − A_t Q_t A_t′, where e_t = Y_t − f_t and A_t = R_t F_t / Q_t.
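The forward-filtering update above can be sketched for the linear Gaussian special case with a scalar observation (for a linear evolution, the Jacobian-based G_t is simply a fixed matrix G); the matrices below are toy values, not the model's:

```python
import numpy as np

def filter_step(m, C, G, W, F, V, y):
    """One forward-filtering (Kalman) step with a scalar observation.

    Prior:       theta_{t-1} | Y_{1:t-1} ~ N(m, C)
    Evolution:   theta_t = G theta_{t-1} + w_t,  w_t ~ N(0, W)
    Observation: Y_t = F' theta_t + v_t,         v_t ~ N(0, V)
    """
    a = G @ m                        # prior mean a_t
    R = G @ C @ G.T + W              # prior variance R_t
    f = F @ a                        # one-step forecast mean f_t
    Q = F @ R @ F + V                # forecast variance Q_t (scalar)
    e = y - f                        # forecast error e_t
    A = R @ F / Q                    # adaptive vector A_t = R_t F_t / Q_t
    m_new = a + A * e                # posterior mean m_t
    C_new = R - np.outer(A, A) * Q   # posterior variance C_t
    return m_new, C_new, a, R, f, Q

# Toy local-trend example.
G = np.array([[1.0, 1.0], [0.0, 1.0]])
W = 0.01 * np.eye(2)
F = np.array([1.0, 0.0])
m, C = np.zeros(2), np.eye(2)
m, C, a, R, f, Q = filter_step(m, C, G, W, F, V=1.0, y=0.5)
```

Iterating this step over t = 1, ..., T yields the filtering distributions used by the backward-sampling and forecasting recursions that follow.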

A.2. Backward sampling
The joint posterior θ_1:T | Y_1:T, Φ can be sampled recursively as follows. First sample θ_T from N(m_T, C_T). Then, for t = T − 1, ..., 1, sample θ_t from θ_t | θ_{t+1}, Y_1:t ∼ N(h_t, H_t), where

h_t = m_t + B_t (θ_{t+1} − a_{t+1}),  H_t = C_t − B_t R_{t+1} B_t′,  B_t = C_t G_{t+1}′ R_{t+1}^{−1}.

The quantities m_t, C_t, a_t, R_t and G_t are obtained from the filtering recursions.
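One step of this backward recursion might be sketched as follows, assuming the relevant filtering quantities are available; the numerical values are toy inputs:

```python
import numpy as np

rng = np.random.default_rng(6)

def backward_sample_step(theta_next, m, C, a_next, R_next, G_next):
    """Draw theta_t given a sampled theta_{t+1} (one backward-sampling step).

    theta_t | theta_{t+1}, Y_{1:t} ~ N(h, H) with
        B = C G_{t+1}' R_{t+1}^{-1},
        h = m_t + B (theta_{t+1} - a_{t+1}),
        H = C_t - B R_{t+1} B'.
    """
    B = C @ G_next.T @ np.linalg.inv(R_next)
    h = m + B @ (theta_next - a_next)
    H = C - B @ R_next @ B.T
    L = np.linalg.cholesky(H + 1e-10 * np.eye(len(h)))  # jitter for stability
    return h + L @ rng.standard_normal(len(h))

# Toy filtering quantities, as if produced by the forward recursions.
m, C = np.zeros(2), 0.5 * np.eye(2)
G_next = np.array([[1.0, 1.0], [0.0, 1.0]])
a_next = G_next @ m
R_next = G_next @ C @ G_next.T + 0.01 * np.eye(2)
theta_next = rng.standard_normal(2)
draw = backward_sample_step(theta_next, m, C, a_next, R_next, G_next)
```

Applying this step from t = T − 1 down to t = 1, seeded with a draw of θ_T from N(m_T, C_T), yields one trajectory from the joint posterior.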

A.3. Forecasting
The k-step-ahead forecast distribution given data up to time t can be sampled sequentially as follows. First sample θ_t from N(m_t, C_t). Then, for j = 1, ..., k, sample w_{t+j} and set θ_{t+j} = g(θ_{t+j−1}, w_{t+j}); forecasts of the observations are obtained by sampling Y_{t+j} ∼ N(F_{t+j}′ θ_{t+j}, V).
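This sequential forecast sampling can be sketched for the linear Gaussian special case; G, W, F and V below are toy values, and the 92-step horizon mirrors the December-February forecasts in the text:

```python
import numpy as np

rng = np.random.default_rng(7)

def forecast_paths(m, C, G, W, F, V, k, n_paths=1000):
    """Sample k-step-ahead forecast paths of Y given theta_t ~ N(m, C).

    Each path propagates a state draw through the evolution equation and
    adds observation noise at every horizon.
    """
    p = len(m)
    theta = m[:, None] + np.linalg.cholesky(C) @ rng.standard_normal((p, n_paths))
    ys = np.empty((k, n_paths))
    for j in range(k):
        theta = G @ theta + np.linalg.cholesky(W) @ rng.standard_normal((p, n_paths))
        ys[j] = F @ theta + np.sqrt(V) * rng.standard_normal(n_paths)
    return ys

G = np.array([[1.0, 1.0], [0.0, 1.0]])
W = 0.01 * np.eye(2)
F = np.array([1.0, 0.0])
paths = forecast_paths(np.zeros(2), np.eye(2), G, W, F, V=1.0, k=92)
seasonal_mean = paths.mean(axis=0)  # e.g. a 92-day Dec-Feb mean per path
```

Averaging each sampled path over the horizon gives a forecast ensemble for the seasonal mean, from which point forecasts and intervals follow directly.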