Do environmental concerns affect commuting choices?: hybrid choice modelling with household survey data

To meet ambitious climate change goals governments must encourage behavioural change alongside technological progress. Designing effective policy requires a thorough understanding of the factors that drive behaviours. In an effort to understand the role of environmental attitudes better we estimate a hybrid choice model (HCM) for commuting mode choice by using a large household survey data set. HCMs combine traditional discrete choice models with a structural equation model to integrate latent variables, such as attitudes, into the choice process. To date HCMs have utilized small bespoke data sets, beset with problems of selection and limited generalizability. To overcome these problems we demonstrate the feasibility of using this valuable modelling approach with nationally representative data. Our results suggest that environmental attitudes have an important influence on commute mode choice, and this can be exploited by governments looking to add to their climate change policy toolbox in an effort to change travel behaviours.


Introduction
Tackling climate change is one of the most important challenges faced by governments around the world. The USA has committed to reducing greenhouse gas emissions by 17% below 2005 levels by 2020, and in the UK the Climate Change Act 2008 commits the government to cut emissions by at least 80% by 2050. Achieving these targets will not be possible via technical progress alone; it will also require a substantial behavioural change on the part of individuals and households. A prerequisite to designing effective policy interventions is a thorough understanding of the factors that drive behaviours and ultimately decisions. One topic that has been the subject of much discussion is the extent to which individual environmental concerns can motivate behaviour change. The majority of the literature is pessimistic in this regard; although the public express concern about climate change, this is rarely sufficiently strong to bring about change towards more sustainable behaviours, especially when these changes require personal sacrifice (Gifford, 2011). Nowhere is this more apparent than in our seemingly unshakeable attachment to the private car.
In this paper we use hybrid choice modelling to explore the effects of environmental concerns on choice of commuting transport mode in England. Hybrid choice models (HCMs) combine traditional discrete choice models (DCMs) with a structural equation model (SEM) to integrate latent variables, such as attitudes and other psychological constructs, into the choice process. Our overall aim is to improve our understanding of the way that people make travel choices; and we make three main contributions to the literature. Firstly, our study is a rare attempt to evaluate the importance of environmental beliefs for travel behaviours. Secondly, a major innovation is the use of a large nationally representative household survey data set for model estimation; we also replicate the modelling with a second such data set, as a robustness check on the results. To date HCMs have been estimated by using relatively small data sets constructed to tackle the question in hand. These bespoke data have limited generalizability, are prone to substantial selection problems and focusing effects, and include little information on individual characteristics with which to control for heterogeneity. Thirdly, HCM studies of mode choice generally devote little or no attention to the theoretical model of decision making that underlies the empirical work; in contrast we explain how the attitude-behaviour-context (ABC) model (Stern, 2000; Stern and Oskamp, 1987) is an appropriate framework for our HCM of commuting mode choice incorporating latent environmentalism.
Domestic transport accounts for 25% of the UK's carbon dioxide emissions, more than half of which are from the private car (Department for Transport, 2008). Thus meeting climate change goals necessitates a shift away from the car and towards more sustainable modes such as public transport, walking and cycling. The regular commuting journey is a key arena in which to study these choices; 57% of all commute trips are by car (Department for Transport, 2012). A recent report for the UK Department for Transport reveals that the implications of climate change are not widely understood and that most people are unaware of their own contribution to the problem (King et al., 2009). However, knowledge alone is not an adequate antecedent to behavioural change; in their systematic review of interventions to reduce car use Graham-Rowe et al. (2011) found no evidence that providing environmental information is effective.
Traditionally transport choice modelling has employed DCMs; these are based in random-utility theory, which is an economic framework in which time and cost are the key variables (see for example Train (1980)). Random-utility theory has been criticized for its fundamental assumption that consumers are a rational 'optimizing black box' (Morikawa et al., 2002). HCMs were first proposed as an extension to random-utility theory in the 1980s, as a way of better understanding consumer behaviour by incorporating latent variables, such as preferences and attitudes, in the choice process (McFadden, 1986; Ben-Akiva and Boccara, 1987). More broadly HCMs can be seen as a reflection of the growing popularity of behavioural economics, which incorporates psychological concepts into economic analysis to improve our understanding of decision making under uncertainty (Tversky and Kahneman, 1974).
Empirical applications of HCMs have developed largely from 2000 onwards, and Golob (2003) provided coverage within his excellent review of SEMs in transport research. Morikawa et al. (2002) found a significant influence of latent variables for comfort and convenience on the decision to use rail or car for intercity travel between cities in the Netherlands. Temme et al. (2008) found that latent preferences for comfort, convenience, flexibility and safety affect the travel mode choices of a market research survey panel in Germany. Yáñez et al. (2010) considered the effects of three latent variables (accessibility, reliability and comfort) on commute mode following the introduction of a new urban transport system in Santiago, Chile. Córdoba and Jaramillo (2012) demonstrated the importance of a 'personality measure' on the commute mode choices of staff at the National University of Colombia. Johansson et al. (2006) is the only HCM study that we know of that has incorporated any measure of environmentalism in a mode choice model. The data were from a postal survey of commuters in Sweden, and the environmentalism variable was inferred from measures of the frequency with which the respondents recycle glass, paper, batteries and metal. This variable was found not to be significant in the choice of car versus bus but had marginal significance in the train-bus choice. This neglect of environmental variables is a serious shortcoming given the key role of personal travel choices in climate change. Environmental attitudes are likely to influence the utility that an individual derives from different travel modes and hence ultimately may affect mode choice. The relative importance of environmental attitudes alongside other influences such as fiscal incentives is of key interest to policy makers.

Decision-making model
Existing empirical applications of HCMs tend not to be based in clear theoretical frameworks and thus it is often difficult to interpret the results, especially in relation to inferring causal relationships. This is particularly problematic for SEMs, because these models do not provide a means of establishing causality but rather can only confirm relationships that the researcher must impose from external knowledge (Sánchez et al., 2005; Bollen and Pearl, 2013). Environmental concerns reflect how we feel about the environment and the way that we are predisposed to behave with regard to it. These are complex phenomena that combine elements of prosocial preferences, risk and time preference, selfish regard for one's own (and one's children's) future, social pressures and norms. 'Environmentalism' is mediated by knowledge and institutions, which influence the immediate costs to individuals; it also involves interaction between attitudes and behaviours.
In our HCM we propose that 'environmentalism' is a latent construct that we cannot observe directly. Instead it is represented by a set of observable indicators that measure both attitudes towards the environment and climate change, as well as certain environmental behaviours; these behaviours are not directly related to commuting but relate to other areas of life such as recycling, and use of carrier bags and home energy. These indicators are used in an SEM, which is combined with a DCM to integrate the latent variable(s) for environmentalism into a model for commuting mode choice.
The psychological literature explains that attitudes and behaviours are related but theoretically distinct. A behaviour is an observable action, e.g. switching a tap off rather than letting it drip, or putting on an extra jumper rather than turning the heating up. Attitudes are the subjective importance that is attached to different issues, e.g. the extent to which a person believes that climate change is a cause for concern, or the extent to which they believe that the environmental crisis has been exaggerated. The attitude-behaviour relationship is a core topic in psychology (Kraus, 1995); in general it is understood that behaviours are driven by intention, and intention is, in turn, a function of attitudes (Ajzen and Fishbein, 1977). For example, people with proenvironmental attitudes might have a strong intention not to use the car for short trips and hence act in a way that is consistent with this, choosing instead to use public transport or to walk or cycle. However, there may also be discrepancies between attitudes and behaviours (Ajzen and Fishbein, 1970) and this has gained empirical support within the environmental context (Oskamp et al., 1991; Gardner and Abraham, 2008). Kline (1988) stressed the importance of contextual factors that can weaken the attitude-behaviour connection, arguing that people will be less willing to act in a proenvironmental way when this is costly or inconvenient, or when they do not feel that their personal contribution can make much difference and when they perceive that others are not behaving that way (Oskamp et al., 1991). Also, many people believe that the government is responsible for solving environmental problems and rely on this to justify their own behaviours (Stern et al., 1985).
Although it is usually thought that attitudes precede behaviour, behaviours can change attitudes; for example, individuals with proenvironmental beliefs who nevertheless use the car for short trips might change their attitudes in an attempt to rationalize their mode choice and to reduce cognitive dissonance (Festinger, 1962).
There are various theoretical models of decision making that support the integration of psychological variables into transport mode choice models. Gardner and Abraham (2010) tested Ajzen's (1991) theory of planned behaviours for local car use in a small UK city (Brighton and Hove). Bamberg and Schmidt (2003) compared the theory of planned behaviours with the theory of interpersonal behaviour (Triandis, 1977) and the norm activation model (Schwartz, 1977) for car use in a sample of students. Neither of these studies found much support for the influence of environmental attitudes; perceived personal benefits such as convenience outweigh environmental opinions and car use is so habitualized that there is little or no moral dimension to the choice.
The ABC model (Stern, 2000; Stern and Oskamp, 1987) was developed specifically to explain environmentally significant behaviours and provides an ideal theoretical framework for our HCM of commuting mode choice. Behaviour in this model is an interactive product of 'internal' attitudes, such as concerns over climate change, and 'external' contextual factors such as the transport costs, and institutional constraints such as the local availability of transport choices. Hence external factors (like time and cost) will moderate the effect of environmental beliefs, and the relative importance of psychological and contextual factors will depend on the behaviour in question. Attitudes have been found to have stronger effects for low constraint behaviours that are cheap or easy to change, such as curbside recycling or the use of low energy light bulbs (Stern and Oskamp, 1987; Guagnano et al., 1995). We would expect them to have less influence on behaviours like car use, which have high personal benefits, are habitualized and are seen as difficult to change (Collins and Chambers, 2005). Nevertheless the relative influence of these different sets of factors is a key issue for designing policies to change behaviours, and this is where our study can make a clear contribution.
A schematic diagram of our HCM is shown in Fig. 1; this illustrates how a traditional DCM is combined with a latent variable model for environmentalism. The unobservable latent variable(s) for environmentalism are identified via observed indicators that reflect environmental attitudes or behaviours; the number of latent variables and the classification of indicators are determined via both exploratory and confirmatory factor analyses, as explained in the next section. Environmentalism is also determined by observed sociodemographic characteristics such as age, sex and household income; these are important context variables; for example whether or not the individual has young children will influence how much personal (in)convenience they might experience from using public transport rather than a car. Latent utility from commuting mode is determined by environmentalism and also directly by sociodemographic variables and the key mode attributes of time and cost, which are again important measures of context. We observe the final mode choice decision as a manifestation of the underlying latent utility. The statistical basis of this model and its estimation are explained in the next section.
The programs that were used to analyse the data can be obtained from http://wileyonlinelibary.com/journal/rss-datasets

Specification and estimation of hybrid choice model
Structural model
$u^*_{ij}$ is the unobserved (latent) conditional indirect utility for mode $j$ ($j = 0, \ldots, J-1$) for individual $i$ ($i = 1, \ldots, n$). In the empirical analysis we assume a linear-in-parameters specification:

$$u^*_j = x'\beta_{jx} + z'\beta_{jz} + \eta'\beta_{j\eta} + \nu_j, \tag{1}$$

where $u^*_j$ is an $n \times 1$ column vector of individual utilities $u^*_{ij}$; $x_i$ is a $K_x \times 1$ vector of $K_x$ individual-specific observed variables, such as age and income, and $x$ is a $K_x \times n$ matrix obtained by horizontal concatenation of $x_i$; $z_i$ is a $K_z \times 1$ vector of observed context variables for individual $i$, relating to mode, such as availability of local public transport, and $z$ is a $K_z \times n$ matrix obtained by horizontal concatenation of $z_i$; $\eta_i$ is a $Q \times 1$ vector of $Q$ individual-specific latent variables, which represent the unobservable environmentalism of the individual (environmental attitudes and behaviours), and $\eta$ is a $Q \times n$ matrix obtained by horizontal concatenation of $\eta_i$; $\beta_{jx}$, $\beta_{jz}$ and $\beta_{j\eta}$ are respectively $K_x \times 1$, $K_z \times 1$ and $Q \times 1$ vectors of parameters to be estimated; $\nu_j$ is an $n \times 1$ column vector of random mean 0 error terms. The latent variables in matrix $\eta$ are also assumed to depend linearly on the vector of individual-specific observed variables:

$$\eta = \gamma x + \xi, \tag{2}$$

where $\gamma$ is a $Q \times K_x$ matrix of parameters to be estimated and $\xi$ is a $Q \times n$ matrix of random mean 0 error terms.

Fig. 1. Schematic diagram of the HCM (key: observed variables; unobservable variables; cause-and-effect relationships; measurement equations)
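As a rough illustration of the structural part of the model, the sketch below draws the latent variables for one individual as a linear function of the observed characteristics plus an error, and then forms the latent utility as a linear function of the observed characteristics, the context variables and the latent variables. The function name, the parameter values and the unit-variance normal errors are assumptions made for illustration only; they are not the estimates or the code used in this paper.

```python
import random


def simulate_structural(x_i, z_i, gamma, beta_x, beta_z, beta_eta, rng):
    """Return (eta_i, u_star_i) for one individual.

    eta_i  = gamma x_i + xi_i                      (latent environmentalism)
    u_star = x_i'beta_x + z_i'beta_z + eta_i'beta_eta + nu_i   (latent utility)
    Errors are drawn as standard normal, which is an illustrative assumption.
    """
    # One latent value per row of gamma (Q rows, K_x columns).
    eta_i = [
        sum(g * x for g, x in zip(row, x_i)) + rng.gauss(0.0, 1.0)
        for row in gamma
    ]
    # Latent utility of car relative to public transport.
    u_star = (
        sum(b * x for b, x in zip(beta_x, x_i))
        + sum(b * z for b, z in zip(beta_z, z_i))
        + sum(b * e for b, e in zip(beta_eta, eta_i))
        + rng.gauss(0.0, 1.0)
    )
    return eta_i, u_star


rng = random.Random(0)
x_i = [1.0, 0.5]                   # hypothetical standardized age and income
z_i = [0.8]                        # hypothetical public transport availability
gamma = [[0.3, 0.2], [0.1, -0.4]]  # Q = 2 latent variables, K_x = 2
eta_i, u_star = simulate_structural(
    x_i, z_i, gamma, [0.2, 0.1], [-0.5], [-0.6, -0.3], rng
)
print(eta_i, u_star)
```

Passing the random number generator as an argument keeps the draw reproducible and makes it easy to switch the errors off when checking the deterministic part of the equations.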

Measurement model
We do not observe $u^*_{ij}$; what we observe is the choice that is made by the decision maker whether to use mode $j$ or an alternative $l$ ($j, l \in J$). The observed decision variable is defined as

$$d_{ij} = \begin{cases} 1 & \text{if } u^*_{ij} \geq u^*_{il} \ \ \forall\, l \neq j, \\ 0 & \text{otherwise.} \end{cases}$$

In our empirical analysis $J = 2$; the individual either commutes by car ($j = 1$) or by public transport ($j = 0$), so without loss of generality we can assume that $u^*_{i0} = 0$. In subsequent discussion we suppress the indexation for the mode; thus $u^*_{i1}$ is denoted as $u^*_i$. The decision of the individual is modelled as

$$P(d_i = 1 \mid x_i, z_i, \eta_i) = P(u^*_i > 0) = F(x_i'\beta_x + z_i'\beta_z + \eta_i'\beta_\eta), \tag{3}$$

where $F(\cdot)$ is the cumulative distribution function for the measurement error $\nu$. We treat $F(\cdot)$ as a normal distribution function, and therefore estimate a probit model. Let $\eta = (\eta_1, \ldots, \eta_Q)$, where $\eta_q$ is an $n \times 1$ vector which contains as components the latent variables for all the individuals. We do not directly observe $\eta_q$ ($q \in \{1, \ldots, Q\}$); what we do observe are different indicators for $\eta_q$. So, for example, we do not observe directly how 'green' people are; what we do observe are their responses to questions reflecting their environmental attitudes and behaviours, and these can be considered indicators of their underlying environmentalism. Let $Y^q_s$ be an $n \times 1$ vector of the indicators for $\eta_q$, where $s = 1, \ldots, m_q$, such that $m_q \geq 2$, i.e. we need a minimum of two indicators for each latent variable. The observed indicators are related to the unobserved latent variable as

$$Y^q_s = \mu^q_s + \alpha^q_s \eta_q + \varepsilon^q_s, \qquad s = 1, \ldots, m_q, \tag{4}$$

where the number $\alpha^q_s$ is the factor loading from the factor analysis, which can be interpreted as the amount of information that the indicator $Y^q_s$ contains about $\eta_q$; $\varepsilon^q_s$ is an $n \times 1$ vector of zero-mean measurement errors, which captures the difference between the observed indicators and the unobserved variable; the intercept $\mu^q_s$ is an $n \times 1$ vector with all elements equal, i.e. no dependence on individual.
We have a set of measurement equations, similar to equation (4), for each of the latent variables in the matrix $\eta$. In matrix notation:

$$Y = \mu + \eta\alpha + \varepsilon,$$

where $Y$ is an $n \times M$ matrix of indicators, such that $M = m_1 + m_2 + \ldots + m_Q$; $\alpha$ is a $Q \times M$ matrix of factor loadings; $\varepsilon$ is an $n \times M$ matrix of measurement errors; $\mu$ is an $n \times M$ matrix containing the $M$ intercepts. For example, if we assume $Q = 2$, $m_1 = m_2 = 2$, then:

$$\begin{pmatrix} Y^1_1 & Y^1_2 & Y^2_1 & Y^2_2 \end{pmatrix} = \begin{pmatrix} \mu^1_1 & \mu^1_2 & \mu^2_1 & \mu^2_2 \end{pmatrix} + \begin{pmatrix} \eta_1 & \eta_2 \end{pmatrix} \begin{pmatrix} \alpha^1_1 & \alpha^1_2 & 0 & 0 \\ 0 & 0 & \alpha^2_1 & \alpha^2_2 \end{pmatrix} + \begin{pmatrix} \varepsilon^1_1 & \varepsilon^1_2 & \varepsilon^2_1 & \varepsilon^2_2 \end{pmatrix},$$

where, as above, $Y^q_s$, $\eta_q$, $\varepsilon^q_s$ and $\mu^q_s$ are themselves $n \times 1$ vectors. The factor loadings in equation (4) can be identified only up to a scale, so we normalize them according to $\alpha^q_1 = 1 \ \forall\, q \in \{1, \ldots, Q\}$. Further we cannot separately identify the mean of the latent variables, $E(\eta)$, and intercepts $\mu$: we need to normalize one of them; we assume that $E(\eta) = 0$ and identify $\mu$.
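The scale normalization can be checked numerically. The sketch below, under hypothetical values for one individual and one latent variable, builds the indicators from the linear measurement equations and shows that multiplying the latent variable by a constant while dividing every loading by the same constant leaves the indicators unchanged, which is why the first loading is fixed at 1. It also evaluates the probit choice probability with the standard normal distribution function. All names and numbers are illustrative assumptions.

```python
from statistics import NormalDist


def choice_probability(v_i):
    """Probit probability of choosing car: F(v_i), F the standard normal CDF."""
    return NormalDist().cdf(v_i)


def indicators(eta_q, loadings, intercepts, errors):
    """Measurement equations Y_s = mu_s + alpha_s * eta_q + eps_s (one person)."""
    return [m + a * eta_q + e for a, m, e in zip(loadings, intercepts, errors)]


eta_q = 0.7               # hypothetical latent score
alpha = [2.0, 1.0, 3.0]   # loadings before normalization
mu = [0.1, -0.2, 0.0]     # intercepts
eps = [0.05, -0.03, 0.02] # measurement errors

y_raw = indicators(eta_q, alpha, mu, eps)

# Rescale so that the first loading equals 1: the latent variable absorbs
# the factor c, and every loading is divided by c.
c = alpha[0]
alpha_norm = [a / c for a in alpha]
y_norm = indicators(eta_q * c, alpha_norm, mu, eps)

print(alpha_norm)
print(y_raw, y_norm)            # identical up to floating-point rounding
print(choice_probability(0.0))  # utility of 0 gives probability one half
```

Because the data cannot distinguish the two parameterizations, any estimation routine needs a restriction of this kind before the loadings have a unique value.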
We employ both exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) to explore and verify the latent variables that are to be used in our HCM. All model estimation and factor analyses are carried out by using Mplus version 7.11 (see below). In the EFA no preconceived structure is imposed, and the indicators are allowed to load freely, thus determining the dimension of matrix $\eta$ (i.e. the value of $Q$). Factor extraction in the EFA is done via varimax rotation (Kaiser, 1958) and model selection criteria and diagnostic statistics are discussed in Section 5. In contrast, in the CFA we constrain the model to comply with prior beliefs based on evidence from the psychological literature that attitudes and behaviours are theoretically distinct both generally (Ajzen and Fishbein, 1977, 2005; Kraus, 1995) and in the context of environmental behaviours (Stern, 2000). Thus the indicators are split into two vectors ($Q = 2$) according to the question wording, with one set forced to load onto an attitudes factor, and the other onto a behaviours factor. The definitions and split are provided in Table 1.

Table 1 (excerpt). Variable definitions†
Variable | Definition
… | Own behaviour contributes to climate change: agree or disagree
Pay more | Prepared to pay more for environmentally sympathetic products: agree or disagree
Disaster | World on course for major environmental disaster: agree or disagree
Exaggerate | The environmental crisis has been exaggerated: agree or disagree
Control | Climate change is beyond our control: agree or disagree
Future | The effects of climate change are too far in the future: agree or disagree
Lifestyle | Changes made have to fit in with current lifestyle: agree or disagree
Others | Not worth doing anything unless others do the same: agree or disagree
Britain | Not worth UK trying to do anything about climate change: agree or disagree
30 years | Climate change will affect UK in next 30 years: agree or disagree
†Source: unless otherwise stated variables are obtained from the UK Household Longitudinal Study, wave 1.
Equations (1) and (3) give the standard DCM, and equations (2) and (4) give the latent variable model; together these equations define the HCM. It is worth pointing out here that an alternative specification to the HCM would be to include the indicator variables Y directly in the DCM instead of including the latent variables η; this is analogous to treating the indicators as direct measures of environmentalism rather than as functions of it. This is inappropriate for three main reasons. Firstly, the indicator variables may be correlated with the errors from the DCM due to omitted (unobservable) effects and this would lead to endogeneity bias. Secondly, the latent variables that the indicators represent are measured with error; thus their direct inclusion in the DCM can lead to inconsistent estimates (Ashok et al., 2002). Thirdly, the HCM specification is a closer representation of the psychological decision-making framework. Attitudes are inherently latent and strong agreement with a proenvironmental statement does not necessarily translate into a causal relationship with choice. Attitudes and related behaviours are therefore not direct antecedents of mode choice but are indirectly related via, inter alia, latent environmental concerns (Daly et al., 2012).
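The measurement error argument can be illustrated with a small simulation. The sketch below is a deliberately simplified linear analogue, not the probit model used in this paper: an outcome depends on a latent variable with slope 1, and the same regression is then run on a noisy indicator of that latent variable. With equal variances for the latent variable and the measurement error, the textbook attenuation result implies a slope of roughly one half when the indicator is used directly. The sample size, variances and seed are arbitrary assumptions.

```python
import random


def ols_slope(xs, ys):
    """Slope from a simple least squares regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


rng = random.Random(42)
n = 20000
beta = 1.0

# Latent variable, an outcome driven by it, and an error-laden indicator.
eta = [rng.gauss(0.0, 1.0) for _ in range(n)]
outcome = [beta * e + rng.gauss(0.0, 1.0) for e in eta]
indicator = [e + rng.gauss(0.0, 1.0) for e in eta]

slope_latent = ols_slope(eta, outcome)           # close to beta
slope_indicator = ols_slope(indicator, outcome)  # attenuated towards zero

print(round(slope_latent, 2), round(slope_indicator, 2))
```

The attenuation does not shrink as the sample grows, which is the sense in which the direct-inclusion estimator is inconsistent rather than merely noisy.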
A further advantage of our approach arises from simultaneous estimation of the DCM and latent variable parts of the model. It is also possible to use an SEM for environmentalism and a separate choice model, which includes latent variables from the SEM, but to estimate these models sequentially (this is the approach that was taken by for example Johansson et al. (2006) and Choo and Mokhtarian (2004)). However, in this case the SEM would not use information on the observed choices to inform the latent variable part of the model, whereas simultaneous estimation of the choice part and latent variable part makes fuller use of the information and hence is more efficient than a sequential approach (Morikawa et al., 2002; Daly et al., 2012).

Identification, estimation and diagnostic statistics
To be able to identify the parameters in the system of equations (1)-(4) we need to make the following assumptions.
Assumption 1. The error term ν is independent of x, z and η, ξ is independent of x and ε is independent of η.
Assumption 2. The error terms $\nu$, $\xi$ and $\varepsilon$ are not correlated with each other.
Assumption 3. The variance-covariance matrix $E(\nu\nu')$ is diagonal, i.e. its off-diagonal elements are 0, and, similarly, there are no correlations between the different matrix elements of $\xi$ and $\varepsilon$.
The system of equations (1)-(4) that comprises the HCM is estimated simultaneously, using the asymptotically distribution-free weighted least squares (WLS) estimator (Browne, 1984; Muthén, 1983, 1984). WLS is chosen over the more commonly used maximum likelihood approach because the latter requires the indicator variables to be continuous and this is not so in our application, as we have a number of dichotomous and ordinal variables (see the next section). The WLS estimator for categorical indicator variables works in three steps. In the first step, a set of probit regressions is run for each categorical indicator (i.e. all the observed indicators that are given by equation (4) and the observed mode choice dummy $d_i$), followed by a set of bivariate probit regressions for each pair of categorical indicators. The thresholds, for the measurement equations (4) and the mode choice equation (3), are obtained from these probit regressions. In the second step the estimated thresholds, conditional mean of the indicators and the conditional variance-covariance matrix are used to form weighting matrix $W$. In the third step $W$ is used to estimate the parameters by using the WLS method. The WLS function is optimized numerically by using an iterative quasi-Newton technique. The estimator is distribution free in the sense that the final estimated parameters of the model (other than the thresholds) do not require normality.
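To give a feel for the first step, the sketch below computes probit thresholds for a single ordinal indicator in the simplest possible case, with no covariates. In that special case the threshold between two adjacent categories is the standard normal quantile of the cumulative sample proportion. This is only a simplified stand-in for the probit regressions described above, and the category counts are hypothetical.

```python
from statistics import NormalDist


def probit_thresholds(counts):
    """Thresholds for an ordinal variable with no covariates.

    counts[k] is the number of responses in category k (ordered).
    Returns len(counts) - 1 thresholds, tau_k = Phi^{-1}(cumulative share).
    Assumes the first and last categories are non-empty, so that every
    cumulative share lies strictly between 0 and 1.
    """
    total = sum(counts)
    standard_normal = NormalDist()
    cumulative = 0
    thresholds = []
    for count in counts[:-1]:
        cumulative += count
        thresholds.append(standard_normal.inv_cdf(cumulative / total))
    return thresholds


# Hypothetical three-category indicator: 50% / 30% / 20% of responses.
taus = probit_thresholds([50, 30, 20])
print([round(t, 3) for t in taus])
```

A dichotomous indicator, such as an agree or disagree item or the mode choice dummy, is the two-category case and yields a single threshold.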
The identification of the model and the asymptotic properties of the WLS estimator are discussed in Muthén and Satorra (1995); and the exact form of the WLS function that is used is given in Muthén and Muthén (2016). Estimation is carried out by using the WLSMV estimator in Mplus version 7.11. As a check of consistency, we also replicated estimates for one of our data sets (the British Household Panel Survey-see below) using the gsem estimator in Stata version 13 (StataCorp., 2013). We cannot rely on the asymptotic properties of WLS when the sample size is small but (as described in the next section) our estimation samples are over 6000, which is unusually large for HCM applications. Given the structure of our data (see Section 3) the estimation takes account of clustering of individuals in households.
As is common in the SEM literature we rely on a number of diagnostic statistics to determine the adequacy of model fit: firstly, the root-mean-square error of approximation RMSEA, which shows the amount of unexplained variance (Steiger and Lind, 1980) and ranges from 0 to 1 with smaller values indicating better fit; secondly, the comparative fit index CFI, which considers the discrepancy between the data and the hypothesized model, while adjusting for sample size (Bentler, 1990); thirdly, the Tucker-Lewis reliability index TLI, which is an adjusted version of the normed fit index of discrepancy between the $\chi^2$-value of the hypothesized model and the $\chi^2$-value of the null model (Tucker and Lewis, 1973). Both CFI and TLI range from 0 to 1 with larger values indicating better model fit. Hu and Bentler (1999) suggested that acceptable model fit requires RMSEA < 0.06 and both CFI and TLI greater than 0.90. We also provide the $\chi^2$-test of model fit for the baseline model, which tests the null hypothesis that all slope parameters in the structural part of the model are 0 and the factor loadings in the measurement part of the model are all 1; for good model fit we would wish to reject this null hypothesis. It is worth noting here that a standard SEM would normally report the $\chi^2$-test for model fit, which tests for differences between the observed and expected covariance matrices. This test is not valid for the WLS estimator, because the distributional assumptions are violated. In addition it is not appropriate for our large sample, as the probability of rejecting the null hypothesis increases with sample size (Jöreskog, 1969). In addition to these formal tests model validity is also judged on the basis of the parameter estimates; specifically whether the estimates pass the 'sense test' in that they accord with expectations from theory and previous empirical findings.
We also replicate the modelling with two different data sets as a further check on the robustness of our results.
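For readers who want to see how the three fit indices relate to the underlying test statistics, the sketch below implements their common textbook definitions from the model and baseline $\chi^2$-values and degrees of freedom. These are assumptions about the general form of the indices; software using a mean- and variance-adjusted WLS estimator may compute them from adjusted statistics, so the numbers need not match program output exactly. The input values are hypothetical.

```python
import math


def rmsea(chi2, df, n):
    """Root-mean-square error of approximation (textbook form, using n - 1)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_m, df_m, chi2_b, df_b):
    """Comparative fit index from model (m) and baseline (b) statistics."""
    d_model = max(chi2_m - df_m, 0.0)
    d_base = max(chi2_b - df_b, d_model)
    return 1.0 if d_base == 0 else 1.0 - d_model / d_base


def tli(chi2_m, df_m, chi2_b, df_b):
    """Tucker-Lewis index from model (m) and baseline (b) statistics."""
    ratio_b = chi2_b / df_b
    ratio_m = chi2_m / df_m
    return (ratio_b - ratio_m) / (ratio_b - 1.0)


def acceptable_fit(rmsea_value, cfi_value, tli_value):
    """Cut-offs quoted in the text: RMSEA < 0.06, CFI and TLI > 0.90."""
    return rmsea_value < 0.06 and cfi_value > 0.90 and tli_value > 0.90


# Hypothetical statistics for a model fitted to 6001 observations.
r = rmsea(150.0, 50, 6001)
c = cfi(150.0, 50, 2000.0, 66)
t = tli(150.0, 50, 2000.0, 66)
print(round(r, 4), round(c, 4), round(t, 4), acceptable_fit(r, c, t))
```

The RMSEA formula makes the sample size dependence explicit: for a fixed excess of $\chi^2$ over its degrees of freedom, the index falls as the sample grows.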

Data
Our main data come from the first wave of the UK Household Longitudinal Study (UKHLS), which is otherwise known as 'Understanding society'; a nationally representative survey of approximately 40 000 households (University of Essex, 2012). Data are obtained from face-to-face interviews with all adults in each household and cover various topics including personal background, economic circumstances, family relationships, health and wellbeing, as well as expectations, aspirations and opinions on a variety of issues. Wave 1 interviews were carried out in 2009-2010 and include a module on environmental attitudes and behaviours. Our analysis sample is restricted to people who commute to work regularly in England. It is necessary to restrict the sample to England because we cannot obtain comparable area level transport data (see below) for Scotland, Wales and Northern Ireland. We also restrict the sample to respondents who live in an urban area (as defined by the Office for National Statistics classification), have access to a car and who commute for up to 120 min each way by car or public transport. These restrictions ensure that it is reasonable to assume that the respondents have some choice over their commuting mode. The resulting sample size is n = 13 139; 6883 women and 6256 men (Table 2). This is contrasted with the relatively small bespoke data sets that have been used in previous HCM studies; for example Johansson et al. (2006) analysed 811 responses to a postal survey on mode choice for one specific route in Sweden and Yáñez et al. (2010) used data for 303 individuals working at university campuses in Santiago, Chile. Several advantages follow from our use of household survey data. Firstly, larger sample sizes give us more statistical power to detect effects; this is particularly valuable for the factor analysis, resulting in more stable estimates (MacCallum et al., 1996).
Secondly, our results are generalizable to the population of commuters in England and not specific to a particular journey setting; thirdly, the data contain a rich set of individual and household characteristics for use as control variables. Finally, both the potential endogeneity between attitudes and choices, and the influence of focusing effects are minimized because the UKHLS is a general household survey, rather than a survey that is focused on commuting or the environment; the questions on mode choice and those on environmental attitudes and behaviours occur in separate sections of the survey with no apparent links between the two. Focusing effects mean that questions can elicit misleading responses; it is highly likely that, when people are asked about their environmental attitudes and commuting choices in a survey that is designed to explore the link between the two, they will overstate the importance of the influence of the environment and offer consistent answers in an effort to rationalize their behaviours. Further, individuals' attitudes can be affected by their mode choices since they may modify their attitudes to reduce the cognitive dissonance arising from inconsistent attitudes and behaviours; attitudes can be altered ex post whereas behaviours cannot be. If this was so then the latent construct for environmentalism would be endogenously determined, but this is unlikely in our data because of the nature of the household survey. It is also worth stressing here that, in one respect, endogeneity is an inherent part of the modelling framework that we use. The HCM framework explicitly recognizes that both behaviour (mode choice) and attitudes (the responses to the indicator questions) are driven by the same underlying latent variable (that we term environmentalism).
These advantages come with one shortcoming; whereas we have extremely rich information about individuals and households, we have only limited information about the journeys in question (for example we have time and mode but not monetary cost) and in particular we do not have information about the characteristics of the mode that is not chosen; so for example if someone chooses to commute by car we do not know how long that specific journey would take by public transport. Given that journey time and cost are key variables in any mode choice model, we overcome this by matching in area level data on local transport context to construct proxies for journey time and cost; specifically we include measures of the availability of local public transport and the amount of local traffic congestion (see below).
A list of all variables with definitions is provided in Table 1. Our choice outcome variable is usual commuting mode for the regular journey to work; a binary variable where 1 represents car and 0 represents public transport. We also have an average one-way travel time for this journey, in minutes, which we include in some of our models. Our indicator variables $Y^q_s$ are a set of responses to questions on environmental behaviours and attitudes. There are 10 questions on behaviours, which ask things like 'do you leave the television on standby overnight?', 'do you wear extra clothes rather than turning the heating up?' and 'do you buy recycled products?'. The responses reflect frequency of engaging in that behaviour and most of the questions have a five-point scale that ranges from 'never' to 'always'; these indicators are all coded to be increasing in environmentalism. These questions were selected for inclusion in the UKHLS because they cover '... several issues which, collectively, influence a considerable proportion of greenhouse gas emissions and other resource use resulting from individual activity' (Lynn and Longhi (2011), page 2). Longhi (2013), for example, used these data and found that women have higher proenvironmental behaviour than men, and having a university degree also has a positive correlation with proenvironmental behaviour. MacPherson and Lange (2013) used one of the indicators, exploring the determinants of green electricity tariff uptake in the UK. Thomas et al. (2016) again used only one indicator to evaluate the effects of the introduction of charges for carrier bags in Wales on own bag use. There are also 12 questions on environmental attitudes. The majority of these ask whether the respondent agrees or disagrees with statements like 'climate change is beyond our control', 'it's not worth me doing things to help the environment if others do not do the same' and 'any changes I make to help the environment need to fit in with my lifestyle'.
Two of the attitudes questions have ordinal responses: 'How would you best describe your current lifestyle?' has a five-point scale, where 1 represents not really doing anything sympathetic to the environment and 5 represents being sympathetic to the environment in everything that they do; 'being green is an alternative lifestyle' has a four-point scale where 1 represents disagree strongly and 4 represents agree strongly. These indicators are all coded to be increasing in environmentalism. The attitudes questions were chosen for inclusion in the UKHLS to replicate, as far as possible, questions that were used in the 2007 and 2009 Department for Environment, Food and Rural Affairs surveys of public attitudes to the environment (Department for Environment, Food and Rural Affairs, 2007; Thornton et al., 2010). The Department for Environment, Food and Rural Affairs largely chose questions from the revised 'New ecological paradigm' battery of statements (Dunlap et al., 2000). The 'New ecological paradigm' was developed to tap into '... primitive beliefs about the nature of the earth and humanity's relationship with it' (Dunlap et al. (2000), page 427) and has become a widely used measure of proenvironmental orientation. The developers have carried out extensive validity and reliability testing, and concluded that the items can be treated as an internally consistent summary rating scale which strongly discriminates between known environmentalists and the general public.
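The statement that all indicators are coded to be increasing in environmentalism implies that some items must be reverse coded before analysis. The following is a minimal sketch of one common way to do this, not the authors' own procedure; the function name, the item and the response values are hypothetical.

```python
def reverse_code(response, scale_min=1, scale_max=5):
    """Flip an ordinal response so that a higher value means 'greener'.

    Assumes an integer scale running from scale_min to scale_max.
    """
    if not scale_min <= response <= scale_max:
        raise ValueError(f"response {response} outside {scale_min}-{scale_max}")
    return scale_max + scale_min - response


# Hypothetical item: how often the television is left on standby overnight,
# recorded as 1 = never ... 5 = always. After reversal, 5 = never.
raw_standby = [1, 5, 3, 2]
green_standby = [reverse_code(r) for r in raw_standby]
print(green_standby)
```

Items that are already worded so that a higher response is more proenvironmental (for example 'do you buy recycled products?') would be left unchanged.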
To capture information on local transport context we use geographical identifiers that are available under the special licence for the 'Understanding society' survey; these show the local authority that each household is in. We use these identifiers to match in information about local transport conditions provided by the Department for Transport. We derive two variables from the information that is available. The first is the average traffic speed during the rush hour, which is a proxy for the amount of traffic congestion in the local area. The second is the journey time to the nearest town centre by car relative to public transport; this is a proxy for the availability of public transport locally. We would expect the utility that is derived from choosing the car for commuting, relative to public transport, to be higher if there is less congestion and lower the better the availability of public transport. Although we do not include monetary cost in our model because of a lack of available data, these relative journey time variables can be considered as proxy variables for the economic concept of opportunity cost, or cost in terms of time.
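The matching step described above can be sketched as a simple lookup by local authority code. This is an illustration only: the field names, codes and values are invented, and the direction of the journey time ratio (public transport time divided by car time) is an assumption.

```python
# Hypothetical local-authority table: rush-hour speed (m.p.h.) and journey
# times (minutes) to the nearest town centre. All values are invented.
local_authority_data = {
    "LA001": {"rush_hour_speed": 24.0, "car_minutes": 10.0, "pt_minutes": 30.0},
    "LA002": {"rush_hour_speed": 12.5, "car_minutes": 12.0, "pt_minutes": 24.0},
}

# Hypothetical survey respondents carrying a local authority identifier.
respondents = [
    {"id": 1, "la_code": "LA001", "commutes_by_car": 1},
    {"id": 2, "la_code": "LA002", "commutes_by_car": 0},
    {"id": 3, "la_code": "LA999", "commutes_by_car": 1},  # no area data
]


def attach_transport_context(person, area_table):
    """Return a copy of the respondent with area-level proxies attached,
    or None when the local authority is not in the area table."""
    area = area_table.get(person["la_code"])
    if area is None:
        return None
    enriched = dict(person)
    enriched["rush_hour_speed"] = area["rush_hour_speed"]
    # Assumed definition: how many times longer the trip takes by public transport.
    enriched["pt_relative_time"] = area["pt_minutes"] / area["car_minutes"]
    return enriched


candidates = (attach_transport_context(p, local_authority_data) for p in respondents)
matched = [m for m in candidates if m is not None]
print(len(matched), [m["pt_relative_time"] for m in matched])
```

Because every respondent in the same local authority receives the same values, these proxies vary only across areas and not across individuals within an area.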
Other control variables include age (in years), household income, highest educational attainment, whether or not the household contains children (in various age groups), self-reported health and marital status. All estimation is carried out for men and women separately given the evidence from previous literature that men and women differ in their commuting behaviour (Roberts et al., 2011) and their environmental beliefs (Anable et al., 2006); thus we expect sex differences to impact significantly on the overall HCM once other explanatory variables have been taken into account.
Given the novelty of using household survey data to estimate an HCM, replication is an important step in the model validation process. All of our modelling is replicated by using the BHPS. The BHPS was an annual longitudinal household panel that ran from 1991 to 2008 and had a very similar design to the UKHLS, with a similar set of interview questions, including, in 2008, a module on environmental attitudes and behaviours. The main difference between the two data sets is that the BHPS is smaller; in total 13 454 face-to-face interviews were carried out in 2008, and our analysis sample (given the selection described above) is 830 women and 900 men.

Results
Descriptive statistics are presented in Table 3. 70% of the women in our sample commute by car, and 73% of men; men's average one-way journey time is slightly longer at 28 min compared with just under 24 min for women. Highest educational achievement is similar for both sexes, as is household income; household incomes are highly skewed and are used in log-equivalized form in the models below. 70% of women are married or living as a couple compared with 76% of men. Average health for both men and women is 3.65 on a scale where 1 is poor and 5 is excellent. Average traffic speed on main roads during rush hour is just over 24 m.p.h.; given that the maximum speed on these roads is between 40 and 70 m.p.h., the rush hour averages are relatively slow and represent high levels of congestion; the range is wide from 9.4 to 39.2 m.p.h. Public transport quality (measured as the time that it takes to travel to the nearest town centre by car relative to public transport) suggests that it is on average three times faster by car, with a range between two and seven times faster; this reflects extremely variable public transport quality across the local authorities.
Men and women appear to be very similar in terms of their environmental behaviours; this may be because many of the behaviours are determined at the household level and the majority of our sample is living as a couple. The biggest difference is in taking own bags for shopping, which women are more likely to do than men. Switching lights off in empty rooms is very common, as is taking own bags for shopping, not leaving the television on standby overnight and separating rubbish for recycling. In contrast not buying goods with excessive packaging, having green energy or a green tariff and taking fewer flights where possible are much less prevalent behaviours. There are more differences between the sexes in attitudes than behaviours. Women are more likely to think that the world is on course for environmental disaster. However, they are also more likely to think that the environmental crisis has been exaggerated, that it is not worth doing anything about climate change unless others do the same, and similarly that it is not worth the UK doing anything. It is not common for men and women to think that they lead an environmentally sympathetic life; the average score is around 2.6, on a scale where 1 represents not really doing anything environmentally sympathetic and 5 represents being environmentally sympathetic in everything that they do. The majority of men and women think that climate change is beyond our control, and its effects are too far in the future to worry about. However, over 60% believe that climate change will affect the UK in the next 30 years. Generally environmental attitudes show a large degree of confusion and inconsistency, which accords with the findings of the review work that was carried out for the Department for Transport (Anable et al., 2006); it also contributes to the observed inconsistencies between attitudes and behaviour.
We have carried out both EFA and CFA on our 22 observed indicator variables to explore and verify the latent structure of the data. In the EFA the indicators are allowed to load freely, and the appropriate number of factors is chosen by looking at several diagnostic statistics, including the eigenvalues for each factor, scree plots (Cattell, 1966) and $\chi^2$-tests. The eigenvalue for a given factor reflects the variance in all the variables, which is accounted for by that factor. The Kaiser-Guttman criteria (Kaiser, 1960; Guttman, 1954) recommend retaining factors with an eigenvalue greater than 1. For both men and women, all these statistics suggest that two or three factors are superior to a one-factor model. In $\chi^2$-tests the null hypothesis of one factor is rejected in favour of the alternative hypothesis of two factors, and a null of two factors is rejected in favour of three. Comparing the factor loadings in the two- and three-factor models, the two-factor model has a 'cleaner' structure where both factors have very distinct loadings, with all except one of the attitude indicators loading on the first factor and all the behaviour indicators loading on the second factor. This can be justified on the basis of psychological theory that treats attitudes and behaviours as separate constructs both generally (Ajzen and Fishbein, 2005) and in the context of environmental decision making (Stern, 2000). Thus, taking account of all this information, a two-factor model is preferred.
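As an illustration of the Kaiser-Guttman rule mentioned above, the sketch below counts the eigenvalues that exceed 1 and reports the share of total variance attributed to each retained factor. The eigenvalues are invented for a four-indicator example and are not taken from the paper; the function name is hypothetical.

```python
def kaiser_guttman(eigenvalues):
    """Return the eigenvalues above 1 and the share of variance each explains.

    For a correlation matrix the eigenvalues sum to the number of
    indicators, so each share is eigenvalue / number of indicators.
    """
    total = sum(eigenvalues)
    retained = [ev for ev in eigenvalues if ev > 1.0]
    shares = [ev / total for ev in retained]
    return retained, shares


# Invented eigenvalues for a four-indicator example (they sum to 4).
example_eigenvalues = [2.1, 1.2, 0.5, 0.2]
retained, shares = kaiser_guttman(example_eigenvalues)
print(len(retained), [round(s, 3) for s in shares])
```

The rule is only one of the diagnostics used in the text; the final choice of two factors also rests on the scree plots, the $\chi^2$-tests and the interpretability of the loadings.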
Given that EFA suggests that it is reasonable to view our indicators for attitudes and behaviours as two separate constructs, CFA is then employed as the first step in the estimation of the measurement model in the SEM and here the indicators are forced to load onto their two respective factors. The results are shown in Table 4; the second and third columns show the factor loadings for environmental behaviours and the last two columns for attitudes (these are the α-coefficients from equation (4)); in both cases the results are presented in descending order of the factor loadings for women. The factor loadings show how each indicator is associated with the underlying latent construct. For behaviours the indicators are normalized so that the loading on 'not leaving the television on standby overnight', TV, is set to 1. For women all other indicators have a higher loading than TV; 'not buying goods with excess packaging', packing, has the largest loading onto the behaviours factor, followed by 'buying recycled products', produce, and 'taking fewer flights where possible', flights; the lowest loadings are for 'having green energy or tariff', energy, and 'separating rubbish for recycling', recycling. The ranking of loadings onto behaviour is very similar for men. For attitudes the loading on belief that they 'lead an environmentally friendly life', own life, is set to 1 to normalize the scale. For both women and men the largest loadings are for 'it's not worth Britain trying to do anything about climate change', Britain, 'it's not worth doing anything unless others do the same', others, and 'the effects of climate change are too far in the future', future, and the lowest loadings are for own life, 'being green is an alternative lifestyle', alternative, and 'any changes I make have to fit in with my current lifestyle', lifestyle.
(Table 4 notes: all estimates are significant at p < 0.0001; ‡ marks a loading fixed at 1 via normalization.)
The results for the latent variable model are presented in Table 5; these show the associations between the two latent constructs and observable individual characteristics (the γ-coefficients from equation (2)). As is common in the SEM literature, standardized coefficients are reported for continuous variables, as these allow a comparison of the relative size of the effects within models. The standardized coefficients are $\beta^{*} = \beta\sigma_x/\sigma_q$, where $\sigma_x$ and $\sigma_q$ are the standard deviations of the continuous explanatory variable x and dependent variable q. Non-standardized coefficients are reported for dichotomous and ordinal variables, and these show the estimated change in the dependent variable for a discrete unit change in the explanatory variable. p-values for coefficient estimates are calculated under the assumption of asymptotic normality, where the asymptotic properties of the WLS estimator are discussed in Muthén and Satorra (1995). For both men and women proenvironmental behaviours and attitudes are non-linearly related to age. Having children does not seem to matter for behaviours for men or women; but for both men and women it seems that having primary-school-age children means that it is less likely that you will have proenvironmental attitudes. Behaviours and attitudes are increasing in education for both men and women. Income has a negative association with proenvironmental behaviours for men and women, but it has a positive association with proenvironmental attitudes. Being married and having better health have a positive association with behaviours for both men and women but have no effect on attitudes.
(Table 5 notes: standardized coefficients ($\gamma^{*} = \gamma\sigma_x/\sigma_q$) are reported for continuous variables and non-standardized for dichotomous and ordinal variables. Standard errors for the standardized coefficients are obtained by the delta method (see Davidson and MacKinnon (2004), section 5.6). ‡ significance at p < 0.001; § significance at p < 0.05; §§ significance at p < 0.1.)
Table 6 shows the results for the mode choice model where the dependent variable is a dichotomous choice between commuting by car ($d_i = 1$) and by public transport ($d_i = 0$) (Table 6 reports the β-coefficients from equation (1)). Columns 1(a) and 1(b) present the results from a standard DCM, where the latent variables for greenness have been omitted. Columns 2(a) and 2(b) include the two latent variables, and columns 3(a) and 3(b) also include commuting time as an additional regressor; again standardized coefficient estimates are reported for continuous variables. Firstly, we see that omitting the latent variables makes virtually no difference to the coefficient estimates on the variables included. However, when included, the behaviours latent variable is significant for both men and women and the attitudes latent variable is significant for men. In addition a Wald test for joint significance of the two latent variables shows them to be jointly significant (p = 0.000) in all models in which they are included. This suggests little collinearity between the latent variables and the other explanatory variables and that the former are independently important in explaining mode choice. The pseudo-$R^2$ statistic (McKelvey and Zavoina, 1975) and predictive ability as shown by the proportion of correctly predicted cases also suggest the superiority of models including the latent variables.
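The proportion of correctly predicted cases can be illustrated with a short sketch. The classification threshold of 0.5 is an assumption (the text does not state the rule that was used), and the probabilities and outcomes below are invented.

```python
def proportion_correct(probabilities, outcomes, threshold=0.5):
    """Share of cases where the predicted mode matches the observed mode.

    probabilities: predicted probability of commuting by car (d_i = 1).
    outcomes: observed choices, 1 = car and 0 = public transport.
    threshold: assumed cut-off for predicting car.
    """
    if len(probabilities) != len(outcomes):
        raise ValueError("probabilities and outcomes must have equal length")
    if not outcomes:
        raise ValueError("at least one case is required")
    hits = sum(
        1
        for p, d in zip(probabilities, outcomes)
        if (1 if p >= threshold else 0) == d
    )
    return hits / len(outcomes)


# Invented example: three of the four cases are classified correctly.
example_probabilities = [0.9, 0.4, 0.7, 0.2]
example_outcomes = [1, 0, 0, 0]
share_correct = proportion_correct(example_probabilities, example_outcomes)
print(share_correct)
```

With around 70% of the sample commuting by car, a naive rule that always predicts car would already be right for roughly 70% of cases, so this measure is most informative when compared across model specifications.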
In general the results are very similar whether or not commute time is included (models 2(a) and 2(b) versus 3(a) and 3(b)). In terms of the latent variables, having a latent tendency to proenvironmental behaviours in other areas of life has a negative effect on the probability of commuting by car. Similarly, latent proenvironmental attitudes have a negative effect on the probability of car commutes for men but no significant effect for women. In terms of the conditioning variables, there is a similar non-linear age effect for men and women. Having preschool children means that women are more likely to commute by car, but the effect is not significant for men. The probability of commuting by car is increasing in education for women but this is not significant for men. Household income has a negative effect for men. Married women are less likely to commute by car but married men are more likely to. Health has no effect; however, it is worth stressing that this is a relatively healthy sample because by definition all respondents are working. The quality of public transport in the local area and the average traffic speed have the expected signs and are significant for both men and women; the better the public transport the less likely people are to commute by car and the higher the average traffic speed during the rush hour (i.e. the less congestion) the more likely.
The quantitative interpretation for the standardized coefficients in Table 6 is that, for any coefficient estimate $\hat{\beta}$, a 1-standard-deviation change in the associated continuous explanatory variable results in a $\hat{\beta}$-standard-deviations change in the underlying latent dependent variable (the utility derived from choosing to commute by car). Hence, the standardized coefficients on the continuous latent explanatory variables can be compared straightforwardly with those for other continuous variables. Here we see that for men the effects of environmental attitudes and behaviours are very similar in size to the effects of local public transport quality and average traffic speeds. For example, increasing proenvironmental attitudes by 1 standard deviation reduces the utility that is derived from car use by 0.109 standard deviations; this is almost identical to the increase in utility that arises from a 1-standard-deviation increase in local traffic speeds. Similarly for women, having a latent tendency to undertaking environmental behaviours in other areas of life has a similar effect on reducing the utility from car use to that from having better local public transport, or more road congestion. It is worth pointing out here that the relative sizes of the standardized coefficients for the latent versus transport variables could be affected by possible errors-in-variables problems for the latter, due to the use of local authority areawide variables to represent the effects for individuals. This would probably bias the coefficients of the area level transport variables downwards, which may mean that the latent variables have a smaller relative effect. However, it is also possible that the exclusion of individual travel cost variables could cause the travel time variables to be biased in the opposite direction.
(Table 6 notes: standardized coefficients ($\beta^{*} = \beta\sigma_x/\sigma_y$) are reported for continuous variables and non-standardized for dichotomous and ordinal variables. Standard errors for the standardized coefficients are obtained by the delta method (see Davidson and MacKinnon (2004), section 5.6). Pseudo-$R^2$ is McKelvey and Zavoina's (1975). The correct predictions variable is the proportion of correctly predicted cases. Chi-sq, H0: all slope parameters in the structural part of the model are 0, and the factor loadings in the measurement part of the model are all 1. CFI, comparative fit index; TLI, Tucker-Lewis reliability index; RMSEA, root-mean-square error of approximation. ‡ significance at p < 0.001; § significance at p < 0.1; §§ not applicable.)
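The standardization used for Table 6, in which a raw coefficient is scaled by the ratio of the standard deviation of the explanatory variable to that of the latent dependent variable, can be sketched as follows. The numbers are invented, the function name is hypothetical, and the standard deviation of the latent utility is taken as given rather than derived from a fitted model.

```python
from statistics import stdev


def standardized_coefficient(beta, x_values, sd_latent_y):
    """Scale a raw coefficient to standard-deviation units.

    beta: raw coefficient on the continuous explanatory variable x.
    x_values: sample values of x, used to estimate its standard deviation.
    sd_latent_y: standard deviation of the latent dependent variable,
        assumed to be supplied by the estimation software.
    """
    if sd_latent_y <= 0:
        raise ValueError("sd_latent_y must be positive")
    return beta * stdev(x_values) / sd_latent_y


# Invented rush-hour speeds (m.p.h.) with a sample standard deviation of 4.
speeds = [20.0, 24.0, 28.0]
beta_star = standardized_coefficient(beta=0.05, x_values=speeds, sd_latent_y=2.0)
print(round(beta_star, 3))
```

In this invented case a 1-standard-deviation rise in traffic speed raises the latent utility of car use by about 0.1 standard deviations, which is the kind of magnitude comparison made in the text between the transport variables and the latent variables.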
In the second pair of models where the respondents' usual commute time is included, this has a negative effect as expected, i.e. the longer your commute the less likely you are to use a car. Including commuting time means that household income is now significant and positive for women (it remains negative for men). This is unsurprising because there is a close positive correlation between household income and commute time for men in particular; this correlation is because the rational decision maker will choose to commute for longer only if they are compensated, and part of this compensation comes from the labour market in the form of higher wages. However, it is also likely that commute time is endogenous in this model, not least because there is a two-way relationship between length of commute and mode, and also because there may be a set of unobserved factors which influence both mode and time. One such factor is 'trip chaining', which arises where individuals make multiple stops on their commute, e.g. to take children to school or to pick up shopping; this information is not available in our data. Nevertheless the fact that inclusion of commute time does not substantively change our estimates of the relative importance of environmental behaviours and attitudes is a strong robustness check on our results.
Model fit statistics for the SEM are reported in the lower part of Table 6, and these are all supportive of our model specification. We can reject the null hypothesis of the $\chi^2$-test that all slope parameters in the structural part of the model are 0, and the factor loadings in the measurement part of the model are all 1. The CFIs are all above (or very close to) the recommended cut-off of 0.9; similarly the TLIs are all very close to (but just below) 0.9. In addition the RMSEAs for all four models are below 0.06.
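The cut-offs quoted above can be applied mechanically, as in the sketch below. The thresholds are those stated in the text; the fit values passed in are invented and are not the figures from Table 6, and the function name is hypothetical.

```python
def check_fit(cfi, tli, rmsea, cfi_cutoff=0.9, tli_cutoff=0.9, rmsea_cutoff=0.06):
    """Compare SEM fit indices with conventional cut-offs.

    CFI and TLI should be at or above their cut-offs; RMSEA should be below its own.
    """
    return {
        "cfi_ok": cfi >= cfi_cutoff,
        "tli_ok": tli >= tli_cutoff,
        "rmsea_ok": rmsea < rmsea_cutoff,
    }


# Invented values that mimic the pattern described in the text:
# CFI just above 0.9, TLI just below 0.9 and RMSEA below 0.06.
fit_summary = check_fit(cfi=0.91, tli=0.89, rmsea=0.04)
print(fit_summary)
```

A strict reading of such rules would flag the TLI here, which is why the text treats values just below 0.9 as close to, rather than meeting, the recommended level.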
For conciseness we do not report the results of estimating these models with our alternative data set (the BHPS) here. In summary the story is essentially the same, although the smaller sample sizes result in larger standard errors. The factor analysis suggests two latent factors, and both of these (environmental attitudes and behaviours) are significant in determining mode choice for both men and women; this is slightly different from the UKHLS results, where only behaviours are significant for mode choice for women. As for the UKHLS, the BHPS estimates suggest that proenvironmental attitudes and behaviours reduce the probability of commuting by car; quantitatively these effects are larger in the BHPS data, and the effects of quality of local public transport and congestion are smaller. This replication is an important check on the robustness of our estimates.

Discussion
Some important findings emerge from the estimation of our HCM for commuting mode choice. Firstly, we have shown that it is possible to use large secondary data sets for HCM estimation. This increases the generalizability and statistical reliability of our results compared with existing studies that rely on relatively small bespoke surveys, which are prone to selection problems and focusing effects, and include little information on individual characteristics with which to control for heterogeneity. Secondly, the factor analysis suggests that the indicator variables are representative of two latent constructs: environmental attitudes and behaviours. These attitudes and behaviours appear to be separable constructs and the latent variable model shows, for example, that whereas higher levels of education are associated with both more proenvironmental attitudes and behaviours, in contrast increased income has a positive association with attitudes but a negative association with behaviours. These results, in the context of the psychological ABC model of environmental decision making, suggest different antecedents for attitudes and behaviours. Behaviours are much more likely to be influenced by personal context and convenience than attitudes; this may explain the diverse income effects and also the fact that marital status and health are significant predictors of environmental behaviours but do not affect attitudes.
Thirdly, environmental attitudes and behaviours are significantly related to the choice of commuting mode. For men, the more proenvironmental their attitudes and other lifestyle behaviours the less likely they are to use a car for the regular commuting journey. This result contrasts with the previous literature that has argued that attitudes will have little effect on high constraint environmental behaviours like car driving (Collins and Chambers, 2005). We cannot completely discount the possibility that mode choice behaviour is driving attitudes here, in that men who use public transport for their daily commute see themselves as environmentally sympathetic and hence change their attitudes to align with this. However, the nature of our household survey data and the fact that the environmental questions are not directly related to commuting, or asked in the same survey module, reduce this possibility compared with the bespoke survey data that are normally used to estimate HCMs. For women other environmental behaviours are again significant, but in contrast attitudes have no significant effect. This may be because women's commuting choices are more constrained than men's, as evidenced by the fact that having preschool-age children significantly increases the probability that women will use a car for commutes but this is not significant for men. Previous literature has shown that women have more complex journeys to work than men and are engaged in more trip chaining resulting in non-direct home-to-work journeys (Hensher and Reyes, 2000).
Finally, our results are supportive of the ABC model of environmental decision making. Attitudes and behaviours influence the utility that an individual derives from different mode choices for the regular commute. Thus the commuting mode choice is an interactive product of 'internal' attitudes and 'external' contextual factors.

Conclusion
Persuading people to ignore their cars and to use alternative, more sustainable forms of travel is essential if governments are to achieve their ambitious climate change goals. However, in the UK, as in many other countries around the world, our attachment to the car persists. This paper has contributed to furthering our understanding of the way that people make travel choices, specifically what determines choice of mode for the regular commuting journey. Traditionally transport economics has focused on time and cost, assuming that these are the main determinants of travel choices for the rational economic agent. HCMs have allowed us to integrate latent variables, reflecting underlying environmental attitudes and behaviours, into a model of mode choice; these variables are shown to be significant and their effects are similar in size to important contextual factors like the availability of public transport.
Integrating these latent variables into the mode choice model has facilitated a more sophisticated understanding of the decision-making process. This is reflective of a more general acceptance of behavioural economics, which diverges from the narrow view of economic rationality, and incorporates psychological factors into models of individual decision making. Unusually, for applications of HCMs, we have used large nationally representative household survey data sets for model estimation, thereby increasing the generalizability and statistical reliability of our results.
The fact that psychological factors influence commuting mode choices can be exploited by policy makers who need to persuade us to make more environmentally sympathetic choices.
Attempts to influence our attitudes towards the environment (e.g. via advertising campaigns or provision of information) or our other environmental behaviours (e.g. by making recycling a convenient activity for households) are not substitutes for fiscal tools and regulation but they can be seen as part of a comprehensive policy toolbox, which is targeted at making our travel choices more sustainable. A similar toolbox has been used successfully in the UK, and other countries, to reduce smoking behaviour substantially (Bauld, 2011). As well as climate change, private car use also contributes to congestion, noise, poor air quality, road traffic accidents and low levels of physical activity; so there are many reasons to try to bring about a change in individual behaviours.