diff --git a/thesis/Appendices/AppendixA.qmd b/thesis/Appendices/AppendixA.qmd index 734d10d..b44e684 100644 --- a/thesis/Appendices/AppendixA.qmd +++ b/thesis/Appendices/AppendixA.qmd @@ -9,56 +9,83 @@ Instead, demographers use period (or "current") life tables, which consider what Life tables can be constructed using discrete age bands starting at age $x$ and ending at age $x+n$. We supply the age-specific death rates, ${}_{n}m_{x}$, and the average person-years lived by those dying in the interval, ${}_{n}a_{x}$, and the life table calculates the mean age at death – the life expectancy, $e_x$. -We start with a hypothetical cohort of size $l_0 = 100,000$ and sequentially apply the probability of dying in each age group, calculated as +The probability of dying, ${}_{n}q_{x}$, is defined as the ratio of the number of people who died in the age interval, ${}_{n}d_{x}$, to the number who survived to age $x$, $l_x$: +$$ +{}_{n}q_{x} = \frac{{}_{n}d_{x}}{l_x}. +$$ {#eq-app-a-prob-dying-deaths} + +The age-specific death rate is defined as the ratio of the number of people who died in the age interval to the total number of person-years lived, ${}_{n}L_{x}$, which is the weighted sum of the number of person-years lived ($n$) by those who survived, which, in turn, is the difference between those who survived to age $x$ and those who died in the interval ($l_x - {}_{n}d_{x}$), and the number of person-years lived on average (${}_{n}a_{x}$) by those who died (${}_{n}d_{x}$): +$$ +{}_{n}m_{x} = \frac{{}_{n}d_{x}}{n \cdot (l_x - {}_{n}d_{x}) + {}_{n}a_{x} \cdot {}_{n}d_{x}}. +$$ {#eq-app-a-death-rate} +We assume the denominator of @eq-app-a-death-rate can be approximated by the mid-year population, ${}_{n}P_{x}$, which leads us to recover the expression for the cross-sectional, empirical death rate in @eq-death-rate. 
+By rearranging the denominator to make the number of survivors the subject, we obtain +$$ +l_x = \frac{1}{n} \left({}_{n}P_{x} + (n - {}_{n}a_{x}) \cdot {}_{n}d_{x}\right). +$$ {#eq-app-a-survivors} +We can substitute this expression into @eq-app-a-prob-dying-deaths and divide by ${}_{n}P_{x}$ to obtain $$ {}_{n}q_{x} = \frac{n \cdot {}_{n}m_{x}}{1 + (n - {}_{n}a_{x}) {}_{n}m_{x}}. $$ {#eq-app-a-prob-dying} -The open interval ${}_{\infty}q_{x} = 1$, as nobody is immortal. -Using the probability of surviving in each age group, ${}_{n}p_{x} = 1 - {}_{n}q_{x}$, the number of survivors is given by +This expression, although unintuitive, allows us to convert from ${}_{n}m_{x}$ to ${}_{n}q_{x}$ with only the parameter ${}_{n}a_{x}$. + +In the period life table, we start with a hypothetical cohort of size $l_0 = 100,000$ and sequentially apply the probability of surviving in each age group, ${}_{n}p_{x} = 1 - {}_{n}q_{x}$, to calculate the number of survivors as $$ l_{x+n} = l_x \cdot {}_{n}p_{x}. $$ {#eq-app-a-life-table-1} -The number of person-years lived is the sum of the number of survivors weighted by the band width and number of people who died weighted by ${}_{n}a_{x}$ +The number of person-years lived is the sum of the number of survivors to the end of the interval ($l_{x+n}$) weighted by the band width and the number of people who died (${}_{n}d_{x} = l_{x} \cdot {}_{n}q_{x}$) weighted by ${}_{n}a_{x}$ $$ -{}_{n}L_{x} = n \cdot l_x + {}_{n}a_{x} \cdot l_{x} \cdot {}_{n}q_{x} \quad {}_{\infty}L_{x} = \frac{l_x}{{}_{\infty}m_{x}}, +{}_{n}L_{x} = n \cdot l_{x+n} + {}_{n}a_{x} \cdot l_{x} \cdot {}_{n}q_{x}. $$ {#eq-app-a-life-table-2} -and the total number of person-years lived above $x$ is +The open interval ${}_{\infty}q_{x} = 1$, as nobody is immortal. +Using @eq-app-a-prob-dying-deaths, it follows that the number of deaths in this interval is equal to the number who survived to the final age group, i.e. ${}_{\infty}d_{x} = l_x$. 
+Since the death rate from @eq-app-a-death-rate can be rewritten using the number of person-years lived, ${}_{n}L_{x}$, as the denominator and we can substitute the number of deaths with the number surviving to the final age group, we can obtain an expression for the number of person-years lived in the open-ended age interval +$$ +{}_{\infty}L_{x} = \frac{{}_{\infty}d_{x}}{{}_{\infty}m_{x}} = \frac{l_x}{{}_{\infty}m_{x}}. +$$ {#eq-app-a-life-table-close} + +The total number of person-years lived above $x$ is $$ T_{x} = \sum^{\infty}_{x = a} {}_{n}L_{x}. -$$ {#eq-app-a-life-table-2} +$$ {#eq-app-a-life-table-3} Then, life expectancy is given by dividing the number of person-years lived by the number of people who will live them $$ e_x = \frac{T_x}{l_x}. -$$ {#eq-app-a-life-table-2} +$$ {#eq-app-a-life-table-4} Throughout the thesis, I only consider life expectancy at birth. ### The very young ages and the very old ages -On average, it is a good approximation to assume deaths occur halfway through the age interval: ${}_{n}a_{x} = n /2$. +On average, it is a good approximation to assume deaths occur halfway through the age interval: ${}_{n}a_{x} = n / 2$. But for younger ages, particularly at lower levels of mortality, the majority of infant deaths lie further towards the earliest stages of infancy. Coale and Demeny used regression on a series of international datasets to recommend suitable values for ${}_{1}a_{0}$ and ${}_{4}a_{1}$ instead of the midpoint [@coaleRegionalModelLife1983]. -The start of the open age group can be many years away from some of the ages at death, particularly in ageing populations. -In order to produce reliable estimates of death rates at high ages, I used the Kannisto-­Thatcher method to expand the terminal age group ($\geq 85$ years) of the life table and adjust ${}_{n}a_{x}$ above 70 years [@thatcherSurvivorRatioMethod2002]. 
+The start of the open-ended age group can be many years away from some of the ages at death, particularly in ageing populations. +In order to produce reliable estimates of death rates at older ages, I used the Kannisto-Thatcher method to expand the terminal age group ($\geq 85$ years) of the life table and adjust ${}_{n}a_{x}$ above 70 years [@thatcherSurvivorRatioMethod2002]. +The Kannisto-Thatcher method assumes the probability of dying is a logistic function of age. +The logit-transformed probability of dying above 70 years is regressed upon age. +The fitted curve is extrapolated to 129 years, and the number of survivors in the cohort under the adjusted probabilities of dying is then used to estimate ${}_{n}a_{x}$ above 70 years. ## Probability of dying The probability of dying from a specific cause of death, $i$, is calculated as in @eq-app-a-prob-dying. -Equally, we can subtract the probability of surviving to that age group, $1 - \prod_x {}_{n}p^i_{x}$. +Equally, we can calculate the probability of dying by subtracting from unity the product of the probabilities of surviving each age group up to that age, i.e. $1 - \prod_x {}_{n}p^i_{x}$. Note, even for the smallest death rates, ${}_{\infty}q^i_{x} = 1$ – if you live to infinity, you'll die of it eventually. ## Cause-specific decomposition of differences in life expectancy -@arriagaMeasuringExplainingChange1984 proposed a method to calculate the age-specific contributions to the difference in life expectancy between two populations as +Using quantities generated from the life tables of two populations as above, @arriagaMeasuringExplainingChange1984 proposed a method to calculate the age-specific contributions to the difference in life expectancy between these populations as $$ {}_{n}\Delta_{x} = \frac{l^1_x}{l^1_0} \left( \frac{{}_{n}L^2_{x}}{l^2_x} - \frac{{}_{n}L^1_{x}}{l^1_x} \right) + \frac{T^2_{x+n}}{l^1_0} \left( \frac{l^1_x}{l^2_x} - \frac{l^1_{x+n}}{l^2_{x+n}} \right). 
$$ {#eq-app-a-arriaga-age} +The first term on the right hand side corresponds to the "direct effect" on the life expectancy difference between the two populations in the average number of person-years lived by the survivors to that age group (${}_{n}L_{x} / l_x$). +The second term represents the "indirect effect" on the number of survivors caused by the mortality changes within an age group. -We then assume the age- and cause-specific contributions are proportional to the difference in cause-specific death rates: +We then assume the age- and cause-specific contributions are proportional to the difference in cause-specific death rates between the two populations: $$ {}_{n}\Delta^i_{x} = {}_{n}\Delta_{x} \cdot \frac{{}_{n}m^i_{x}(2) - {}_{n}m^i_{x}(1)}{{}_{n}m_{x}(2) - {}_{n}m_{x}(1)} $$ {#eq-app-a-arriaga-cause} diff --git a/thesis/Chapters/Chapter2.qmd b/thesis/Chapters/Chapter2.qmd index d9aa059..0ffb020 100644 --- a/thesis/Chapters/Chapter2.qmd +++ b/thesis/Chapters/Chapter2.qmd @@ -22,9 +22,9 @@ To overcome these issues, we can use statistical smoothing techniques to obtain In small-area studies, it is common to smooth data using models with explicit spatial dependence, which are designed to give more weight to nearby areas than those further away. There are three main categories for modelling spatial effects. First, we can treat space as a continuous surface using Gaussian processes or splines. -Second, we can use areal models, which make use of the spatial neighbourhood structure of the units. -Third, we can build models that exploit a nested hierarchy of geographical units, for example between state, county and census tract in the US. -Each of these methods rely on assumptions which may make them more or less appropriate in different applications. +Second, we can use hierarchical models for areal data, which make use of the spatial neighbourhood structure of the units. 
+Third, we can again use hierarchical models for areal data, but ones that instead exploit a nested hierarchy of geographical units, for example state, county and census tract in the US. +Each of these methods, which can be used separately or in combination if the context of the problem allows, relies on assumptions that may make it more or less appropriate in different applications. #### Space as a continuous process {-} @@ -105,6 +105,7 @@ There might be true variability in the data which a smoothing model would concea For example, certain spatial units might contain isolated populations with high mortality over a sustained period, such as counties with Native American reservations in the USA [@dwyer-lindgrenInequalitiesLifeExpectancy2017]. There can also be spatially- and temporally-specific events that cause a spike in mortality such as the Grenfell Tower fire in 2017. Without accounting for these events, the models described above would either attenuate their effect on mortality, or a spike in deaths would cause estimates of mortality in nearby spatial units or years to be erroneously high. +Beyond consulting subject matter experts, posterior predictive checks and plots of modelled death rates against the observed data can help to identify outlier spikes in mortality that are specific to a particular time or place, and that we do not want our model to smooth. ### Applications of disease mapping methods @@ -116,6 +117,8 @@ Directly standardised methods, in contrast, require knowledge of the full age st Age-standardised death rates, however, suffer the same interpretability issue as the standardised mortality ratio, and are only comparable between studies if the same reference population is used. An alternative choice is _life expectancy_. @silcocksLifeExpectancySummary2001 explain that life expectancy is a "more intuitive and immediate measure of the mortality experience of a population, [and] is likely to have greater impact... 
than other measures that are incomprehensible to most people." +However, although the metric appears more interpretable, life expectancy at birth constructed from a period life table is often misinterpreted as the mean length of life of the cohort into which the newborn is born. +In fact, it measures the expectation of life assuming that the newborn will be exposed to age-specific mortality conditions throughout their life that are exactly the same as the current population. The estimation of death rates requires two data sources: deaths counts and populations. Modern death registration systems, such as that of the UK, are almost entirely complete and accurate. @@ -256,7 +259,7 @@ In 2015, the GBD study released its first subnational estimates of mortality, st @steelChangesHealthCountries2018 assessed these data, which divided the UK into 150 regions, finding mortality from all-causes varied twofold across the country, with the highest years of life lost in Blackpool and the lowest in Wokingham. In a study on forecasting subnational life expectancy in England and Wales, @bennettFutureLifeExpectancy2015 estimated a 8.2 year range in life expectancy for men and 7.1 year range for women in 2012 between 375 districts. The lowest life expectancies were seen in urban northern England, and the highest in the south and London's affluent districts. -Within London itself, male and female life expectancy showed 5-6 years of variation. +Within London itself, @cheshireFeaturedGraphicLives2012 visualised the heterogeneity of mortality in London by assigning tube stops the life expectancy of the nearest ward, revealing that 10 years are lost between two consecutive stops, Canary Wharf and North Greenwich, on the Jubilee line. 
#### Deprivation {-} diff --git a/thesis/Chapters/Chapter4.qmd b/thesis/Chapters/Chapter4.qmd index f7cd6aa..2c99739 100644 --- a/thesis/Chapters/Chapter4.qmd +++ b/thesis/Chapters/Chapter4.qmd @@ -80,16 +80,33 @@ For the MSOA analysis, MSOAs were nested in districts, which were, in turn, nest For the LSOA analysis, LSOAs were nested in MSOAs, which were nested in districts. The terms for the largest spatial unit were centred on zero to allow the spatial effects to be identifiable. -All standard deviation parameters of the random effects had $\sigma \sim \mathcal{U}(0, 2)$ priors. +All standard deviation parameters of the random effects had $\sigma \sim \mathcal{U}(0, 2)$ priors, which were used for a previous mortality modelling study by the group [@bennettFutureLifeExpectancy2015]. +I performed a sensitivity analysis using the less informative $\sigma \sim \mathcal{U}(0, 100)$ prior, to which the model was robust (the largest inferred standard deviation parameter was for the age group intercept with a mean around 0.9). For the global intercept and slope, we used the diffuse prior $\mathcal{N}(0, \sigma^2=10^5)$. The overdispersion parameter $r$ had the prior $\mathcal{U}(0, 50)$. @tbl-ap-ch4-model shows all model parameters, their priors and dimensions for the MSOA-level model in @sec-Chapter5. +@tbl-ch-4-checks summarises the model adequacy and consistency checks performed for the analyses. 
+ +| Type of check | Checks performed | +| -------------------------- | ------------------------------------------------ | +| Model adequacy | Check all posterior death rates are between 0 and 1; scatter plots of posterior predictions of death rates against observed data by age group and year; inspection of residuals by age group and year | +| Model bias | Compare aggregated posterior predictions of deaths with uncertainty to national number of deaths from data each year; evaluate model shrinkage by inspecting the range of life expectancy between the top and bottom percentiles (aggregating 67 or 68 MSOAs) in 2002 and 2019 estimated using the model and the data | +| Consistency between models | Aggregate posterior predictions of deaths from MSOA-level model to district-level and compare to posterior predictions of deaths from the same model run at district level (and same checks for LSOA- and national-level where appropriate) | + +: Summary of model posterior checks. {#tbl-ch-4-checks} + +Although a random walk approach has been used here to model the J-shape age-mortality association, there are a number of alternatives. +For example, @gonzagaEstimatingAgeSexspecific2016 use a series of linear splines over the age dimension. +@alexanderFlexibleBayesianModel2017 describe an approach using the first three principal components of standard mortality curves, where the first component represents baseline mortality, and the second and third components allow offsets for higher child mortality and higher adult mortality. +However, both these approaches require the modeller to manually specify either the number of basis splines and position of the knots or the number of principal components required to accurately describe the age-mortality relationship. +This becomes more difficult when modelling several different diseases, which might not follow a J-shape, particularly those with a skew towards older ages such as prostate cancer. 
+Random walks are more flexible and require less tuning in this respect, and are also used here to model age-specific slopes over time, for which we have no such prior demographic knowledge. ## Inference The decision was made early in my PhD research to use Markov chain Monte Carlo (MCMC) sampling methods for inference, as this is the "gold standard" with guarantees that, under mild conditions, the sequence of samples will asymptotically converge to the true posterior distribution [@robertsGeneralStateSpace2004]. -Furthermore, the state-of-the-art approximate inference package for spatial models, `INLA`, scales badly with the number of hyperparameters, and hence would struggle with the high dimensionality of the models in this thesis. +Although sampling approaches are the focus here, the `R-INLA` package, which uses approximate inference for latent Gaussian fields and has implementations of common spatial models, could also have been used. Bayesian models can be specified in a probabilistic programming language. The starting point for this project was the `NIMBLE` package [@devalpineNIMBLEMCMCParticle2022; @devalpineProgrammingModelsWriting2017]. diff --git a/thesis/Chapters/Chapter5.qmd b/thesis/Chapters/Chapter5.qmd index f0dba93..6b32162 100644 --- a/thesis/Chapters/Chapter5.qmd +++ b/thesis/Chapters/Chapter5.qmd @@ -73,22 +73,27 @@ Many of the MSOAs with the highest life expectancy, especially for men, were in ![Map of life expectancy and the distribution of life expectancy in 2019. 
The areas in white have a life expectancy equal to the national life expectancy.](../thesis-analysis/thesis_analysis/england/figures/map_level.pdf){#fig-ch-5-map-level fig-scap="Map of life expectancy and the distribution of life expectancy in 2019."} +![Median estimates of MSOA-level life expectancy in England in 2019 against the 95% credible intervals.](../thesis-analysis/thesis_analysis/england/figures/uncertainty.pdf){#fig-ch-5-uncertainty} + +In general, as life expectancy increased, the credible interval of life expectancy increased (@fig-ch-5-uncertainty). +The widest credible interval in any sex-MSOA combination in 2019 was 10.6 years. + ### Change in life expectancy Female and male life expectancy were correlated across MSOAs with a correlation coefficient of 0.87 (@fig-ch-5-sex-comp). Female life expectancy was higher than male life expectancy in all but 15 MSOAs. The female advantage was more than 5 years in 1498 (22.1%) of 6791 MSOAs and 1–5 years in another 5187 (76.4%). From 2002 to 2019, a decline in life expectancy was more probable than an increase in 124 mostly urban MSOAs of 6791 (1.8% of all MSOAs) for women, with posterior probabilities of greater than 80% in 34 of these. -The largest estimated decline of 3.0 years (0.9–5.3; posterior probability of the estimated decline being a true decline >0.99) occurred in a MSOA in Leeds (@fig-ch-5-map-change, @fig-ch-5-map-pp-change). +The largest estimated decline of 3.0 years (0.9–5.3; posterior probability of a decline >0.99) occurred in a MSOA in Leeds (@fig-ch-5-map-change, @fig-ch-5-map-pp-change). 
![Comparison of female and male life expectancy in 2019 and change from 2002 to 2019.](../thesis-analysis/thesis_analysis/england/figures/sex_comp.pdf){#fig-ch-5-sex-comp} ![Geography of change in life expectancy from 2002 to 2019.](../thesis-analysis/thesis_analysis/england/figures/map_change.pdf){#fig-ch-5-map-change} -![Map of posterior probability that the estimated change represents a true increase or decrease in life expectancy from 2002 to 2019.](../thesis-analysis/thesis_analysis/england/figures/map_change_prob.pdf){#fig-ch-5-map-pp-change} +![Map of the posterior probability that the estimated change represents an increase in life expectancy from 2002 to 2019.](../thesis-analysis/thesis_analysis/england/figures/map_change_prob.pdf){#fig-ch-5-map-pp-change} Elsewhere, median posterior change was positive, ranging from less than 1 year in 408 MSOAs to more than 7 years in 63 MSOAs. -Posterior probability of an increase in male life expectancy was more probable than a decrease in all but one MSOA in Blackpool, in which life expectancy changed by –0.4 years (–2.3 to 1.6; posterior probability of being a true decline 0.64). +For men, an increase in life expectancy was more probable than a decrease in all but one MSOA, located in Blackpool, in which life expectancy changed by –0.4 years (–2.3 to 1.6; posterior probability of a decline 0.64). For the other MSOAs, the increase ranged from less than 1 year in 31 MSOAs to more than 7 years in 114 MSOAs. The largest increases in female and male life expectancies were seen in some MSOAs in and around London (e.g., in the London Borough of Camden). In 5133 (75.6%) MSOAs, male life expectancy increased more than female life expectancy (@fig-ch-5-sex-comp), leading to a closing of the life expectancy gap between female and male sexes. 
@@ -148,6 +153,7 @@ Although MSOAs have small populations and are designed to have some socioeconomi To understand life expectancy inequalities in relation to individual socioeconomic characteristics requires linking health and other data such as census records, education, and taxes, as done in countries like New Zealand and Sweden. The people who live in each MSOA can change due to both within­-country and international migration. +There is some evidence that migrants tend to be healthier than those who do not move [@connollyIncreasingInequalitiesHealth2007]. Regression of the change in life expectancy from 2002 to 2019 in each MSOA against population turnover, the proportion of households in each MSOA in 2019 who were different from those who had lived there in 2002 [@vandijkUsingLinkedConsumer2021], was not able to explain the variation in life expectancy change for women ($R^2 < 0.001$) or for men ($R^2 = 0.01$) at the national level. Studies in both the UK [@connollyIncreasingInequalitiesHealth2007] and USA [@ezzatiReversalFortunesTrends2008] have also shown that migration is not sufficient to explain the trends in health and health inequalities, and that these trends are largely due to real changes in population health. Even if rising inequalities are partly due to health-­selective migration, this phenomenon has social and economic origins that should be addressed through employment opportunities, affordable housing, high-­quality education, and health care. @@ -163,7 +169,9 @@ The extent of this underestimation is modest; however, because a large part of l The life expectancy estimates in specific years are similar to the snapshots presented by the ONS and Public Health England [@officefornationalstatisticsHealthExpectanciesBirth2015; @publichealthenglandLocalHealthSmall2021], with correlation coefficients of 0.92-0.95 and mean differences of –0.004 to 0.19 years. 
However, these reports could not analyse trends because data were aggregated over 5 years (2009–13, 2013–17, or 2015–19). In terms of trends, studies that grouped small ­area units into deciles of deprivation have detected a decline in female life expectancy in the one or two most deprived deciles [@bennettContributionsDiseasesInjuries2018; @marmotMarmotReview102020]. -By analysing trends at the MSOA level, I could identify the communities in which longevity is declining and show that the decline, which began around 2010 in women in some MSOAs, has spread and accelerated since 2014. +@boulieriSpatiotemporalModelEstimate2020 modelled trends in life expectancy by district and decile of deprivation, using a space-time interaction term to detect local changes in the life expectancy trend. +Notably, they detected lower life expectancy than expected for the district of Leeds in 2 years of the study period for women and 4 years for men. +By analysing trends at the MSOA level, I could identify the specific communities in which longevity is declining and show that the decline, which began around 2010 in women in some MSOAs, has spread and accelerated since 2014. @congdonGeographicalPatternsDrugRelated2019 also used spatial models to smooth over MSOA-level data of mortality, but specifically drug-related deaths and suicides between 2012-16. The author singled out the district of Blackpool as containing many of the MSOAs with the most extreme relative risks of death from these causes, including the MSOA I found had the lowest life expectancy for men in 2019. diff --git a/thesis/Chapters/Chapter6.qmd b/thesis/Chapters/Chapter6.qmd index 0c30a5a..58a11e5 100644 --- a/thesis/Chapters/Chapter6.qmd +++ b/thesis/Chapters/Chapter6.qmd @@ -7,7 +7,7 @@ Unlike the original text, I will largely focus on the life expectancy estimates, Although the models in the previous chapter took several days of computing time, runtimes will only decrease as computing power increases. 
And in theory, the models can be scaled to higher and higher spatial resolutions as both hardware and inference algorithms improve. -From a computational perspective, we could potentially estimate mortality for all LSOAs, OAs, or even postcodes in England. +From a computational perspective, we could potentially estimate mortality for all LSOAs, OAs, or even postcodes^[Unlike OAs, postcodes do not align with other administrative boundaries, nor are they designed to contain similar numbers of people, which may lead to small number issues when modelling mortality.] in England. In this chapter, I test this idea by modelling life expectancy at the finer LSOA level for a single region in England, its capital city London, which has a number of LSOAs of the same order of magnitude as the number of MSOAs in England. ## Methods @@ -26,9 +26,10 @@ In these cases, the population was set equal to the number of deaths. The model was largely as outlined in @sec-Chapter4, with a few changes: First, the negative binomial likelihood from @eq-ch-4-likelihood-1 was replaced with a beta-binomial likelihood, $$ -\text{deaths}_{ast} \sim \text{Beta-Binomial}(m_{ast} \rho, (1 - m_{ast}) \rho, \text{Population}_{ast}). +\text{deaths}_{ast} \sim \text{Beta-Binomial}(m_{ast} \rho, (1 - m_{ast}) \rho, \text{N}_{ast}), $$ {#eq-ch-6-likelihood} -where $m_{ast}$ is the death rate and $\rho \geq 0$ is the overdispersion parameter. +where $m_{ast}$ is the death rate, $\text{N}_{ast}$ is the population and $\rho \geq 0$ is the overdispersion parameter. +The beta-binomial distribution has mean $m_{ast}\text{N}_{ast}$ and variance $\text{N}_{ast} \frac{\rho + \text{N}_{ast}}{\rho + 1} m_{ast} (1 - m_{ast})$. I found that the variability of the LSOA-level mortality data was such that death rates did near 1, violating the assumption for @eq-ch-4-likelihood-1 that mortality is low. 
In fact, when I tested a negative binomial or Poisson likelihood, I found the death rates for some age-LSOA-year combinations exceeded 1, which of course is impossible. The beta-binomial likelihood is a generalisation of the binomial distribution that allows for overdispersion. @@ -58,6 +59,11 @@ The corresponding estimates for life expectancy inequality in 2019 calculated us Life expectancy in 2019 was highest in LSOAs in central London districts of Kensington and Chelsea, Westminster, City of London and Camden, in the southwest (Richmond upon Thames and Kingston upon Thames) and parts of the northwest (e.g., parts of Harrow and Barnet), with life expectancy in many LSOAs surpassing 90 years (@fig-ch-6-map-level). Low life expectancy was spread in LSOAs throughout the city but was more common in outer east and southeast London. +![Median estimates of LSOA-level life expectancy in London in 2019 against the 95% credible intervals.](../thesis-analysis/thesis_analysis/london/figures/uncertainty.pdf){#fig-ch-6-uncertainty} + +In general, as life expectancy increased, the credible interval of life expectancy increased (@fig-ch-6-uncertainty). +The credible interval was particularly wide for some LSOAs with the highest female life expectancy estimates, with some credible intervals exceeding 30 years. + ### Change in life expectancy and inequality ![Geography of change in life expectancy in London from 2002 to 2019.](../thesis-analysis/thesis_analysis/london/figures/map_change.pdf){#fig-ch-6-map-change} @@ -87,13 +93,13 @@ The large decline in life expectancy in Kensington and Chelsea in 2017 is due to There was substantial variation in the size of life expectancy increase over short distances. As a result of this spatial heterogeneity, life expectancy inequality increased not only in London as a whole, but also in every district in London alongside increasing average life expectancy (@fig-ch-6-geofacet and @fig-ch-6-ridges). 
-::: {#fig-ch-6-ridges layout-ncol=1 fig-scap="Distribution of estimates of Lower-layer Super Output Area (LSOA) life expectancy at birth for 2002 and 2019, and of the change from 2002 to 2019 in 33 London districts."} +::: {#fig-ch-6-ridges layout-ncol=1 fig-scap="Distribution of median estimates of LSOA life expectancy at birth for 2002 and 2019, and of the change from 2002 to 2019 in 33 London districts."} ![Women](../thesis-analysis/thesis_analysis/london/figures/women_ridges.pdf){#fig-ch-6-ridges-women} ![Men](../thesis-analysis/thesis_analysis/london/figures/men_ridges.pdf){#fig-ch-6-ridges-men} -Distribution of estimates of Lower-layer Super Output Area (LSOA) life expectancy at birth for 2002 and 2019, and of the change from 2002 to 2019 in 33 London districts. +Distribution (density plot) in 33 London districts of median estimates of LSOA life expectancy at birth for 2002 and 2019, and of the change from 2002 to 2019. Districts are ordered by the median life expectancy in 2002. ::: @@ -121,6 +127,7 @@ Carrying out the study at the LSOA level uncovered inequalities to a fuller exte I defined life expectancy inequality as the difference between 2.5$^{\text{th}}$ and 97.5$^{\text{th}}$ percentiles of LSOA life expectancies rather than the difference between the maximum and minimum as in @sec-Chapter5. This was because the LSOAs with extremely low life expectancies tended to contain age­-LSOA-­year combinations in which the number of deaths exceeded the population. +There are also some credible intervals of life expectancy in excess of 30 years for women, which questions the epidemiological plausibility of these estimates. Small-area population, which is the denominator of age-specific death rates, is estimated by the ONS for intercensal years, and may be subject to error. 
This is especially the case in older ages when some people live and die in a long-term care facility, and may be counted towards population (denominator of death rates) in their original LSOA of residence and towards deaths (numerator of death rates) in the LSOA where the care facility is located. There were a higher proportion (0.099% compared to 0.001% in @sec-Chapter5) of spatial units with this issue at the LSOA level. diff --git a/thesis/Chapters/Chapter7.qmd b/thesis/Chapters/Chapter7.qmd index 7101553..dd3bd3b 100644 --- a/thesis/Chapters/Chapter7.qmd +++ b/thesis/Chapters/Chapter7.qmd @@ -16,14 +16,17 @@ There were no age­-district-year combinations in which the number of deaths fro ### Grouping causes of death -Each death record in post-neonatal ages was assigned an ICD-10 code corresponding to the underlying cause of death. +Each death record in the database contains a series of ICD-10 codes from the death certificate, which have been signed off by a medical professional, containing firstly the codes corresponding to the conditions leading directly to death, and then any other codes that contributed to the death. +Based on this list of codes, the ONS assigns each death record in post-neonatal ages an ICD-10 code corresponding to the underlying cause of death (the disease or injury that initiated the train of events directly leading to death) using standardised computer-based selection algorithms [@officefornationalstatisticsUserGuideMortality2022]. For neonates, which are not assigned an underlying cause of death, I used the ICD-10 code in the first position on the death record. -I used ICD-10 codes to assign each death to 136 cause groups of the WHO Global Health Estimates (GHE) study [@worldhealthorganizationWHOMethodsData2020]; these groups encompass causes of death with related aetiology and clinical and public health relevance. 
+ +I used the ICD-10 code for the underlying cause of death to assign each death to 136 cause groups of the WHO Global Health Estimates (GHE) study [@worldhealthorganizationWHOMethodsData2020]; these groups encompass causes of death with related aetiology and clinical and public health relevance. I also grouped diabetes mellitus and nephritis and nephrosis (hereafter referred to as _diabetes_) as these deaths might have overlapping history. I used the top twelve causes of death for each sex according to the total number of deaths from 2002 to 2019 for cause-specific analysis, as well as residual groups comprising deaths from all other cancers, all other non-communicable diseases (NCDs), all other cardiovascular diseases (CVDs), all other infections, maternal, perinatal and nutritional conditions (IMPN), and injuries (external causes). Together, these form a mutually exclusive, collectively exhaustive list of causes of death (@fig-ch-7-treemap). The full list of ICD-10 codes for each cause group can be found in @tbl-ap-ch7-causes. +Any death records with an ill-defined disease (garbage code) as the underlying final cause of death were randomly assigned to one of the residual disease groups (all other cancers, all other CVDs, all other NCDs, all other IMPN) with probability proportional to the existing age-sex-year totals in those disease groups. ![Total number of deaths for the twelve leading causes of death in England from 2002 to 2019, and the residual groups all other cancers, all other NCDs, all other CVDs, all other infections, maternal, perinatal and nutritional conditions (IMPN), and injuries. The boxes are coloured by the wider groups of CVDs; NCDs; cancers; maternal, perinatal, nutritional and infectious causes; injuries. 
See @tbl-ap-ch7-causes for ICD-10 codes for each category.](../thesis-analysis/thesis_analysis/causes/figures/treemap.pdf){#fig-ch-7-treemap fig-scap="Total number of deaths for the twelve leading causes of death in England from 2002 to 2019 and residual groups."} @@ -72,12 +75,21 @@ Death rates for this total mortality group, and for injuries, were corrected in I calculated the contributions of deaths from each cause of death, in each age group, to both the life expectancy inequality between each district and the district with the highest life expectancy, and to the life expectancy change for each district between different time periods. I used Arriaga's method, which is widely used to decompose life expectancy differences between populations or population subgroups [@arriagaMeasuringExplainingChange1984]. -Arriaga's method calculates how much each age group contributes to the life expectancy difference by quantifying how much death rate differences at that age change the years of life lived both at that age and in subsequent ages through changing the number of survivors. -It then partitions the age-specific contributions to the life expectancy gap by cause of death in proportion to the difference in cause-specific death rates between the subgroups. -The cause-specific death rates were scaled such that the sum over all causes was equal to the estimate for total mortality. +Arriaga's method uses quantities generated from the life tables of two populations to calculate how much each age group contributes to the life expectancy difference, ${}_{n}\Delta_{x}$, by looking at differences in the number of survivors in the hypothetical cohort of the life table at different ages. 
+It then partitions the age-specific contributions to the life expectancy gap by cause of death, ${}_{n}\Delta^i_{x}$, in proportion to the difference in cause-specific death rates between the subgroups: +$$ +{}_{n}\Delta^i_{x} = {}_{n}\Delta_{x} \cdot \frac{{}_{n}m^i_{x}(2) - {}_{n}m^i_{x}(1)}{{}_{n}m_{x}(2) - {}_{n}m_{x}(1)}, +$$ {#eq-ch-7-arriaga-cause} +where ${}_{n}m^i_{x}$ are age-specific death rates for cause $i$. +Arriaga showed the sum of the age- and cause-specific contributions is equal to the difference in life expectancy, +$$ +e_0(2) - e_0(1) = \sum_x {}_{n}\Delta_{x} = \sum_i {}_{n}\Delta^i = \sum_x \sum_i {}_{n}\Delta^i_{x}. +$$ {#eq-ch-7-arriaga-sum} + +So that the cause-specific contributions summed exactly to the differences in life expectancy as in @eq-ch-7-arriaga-sum, the cause-specific death rates, which were estimated in separate model runs, were scaled such that the sum over all causes was equal to the estimate for the all-cause death rate, which itself was estimated in a separate run. For this analysis, I used the sample mean death rate in each age-district-year-cause combination. -Details on the calculations for life expectancy, the probability of dying, and Arriaga's method can be found in @sec-appA. +Further details on the calculations for life expectancy, the probability of dying, and Arriaga's method can be found in @sec-appA. ## Results @@ -232,7 +244,10 @@ This study did not look at age- _and_ cause-specific contributions to life expec Although this masks some variation, the groups were selected based on the total number of deaths from 2002 to 2019, and are consequently skewed towards older ages, and the age-specific contributions of (e.g.) dementias are not particularly interesting. Death records are subject to issues in the assignment of ICD-10 codes for the cause of death.
-Although the ONS use selection algorithms to improve consistency between doctors when identifying the underlying cause of death [@officefornationalstatisticsUserGuideMortality2022], the challenge of multimorbidity in older age groups makes the assignment of cause of death increasingly difficult [@mesleCausesDeathVery2021]. +The quality of the cause of death data in the UK has been given the highest rating based on completeness and a low share of deaths assigned to implausible and ill-defined codes [@worldhealthorganizationWHOMethodsData2020]. +There are national guidelines for death certification, and the ONS uses standardised coding algorithms and computer-based assignment of the cause of death, which assign the underlying cause of death uniformly and so improve consistency between certifying doctors [@officefornationalstatisticsUserGuideMortality2022]. +For example, a validation study found that death certificates in the UK accurately identified deaths from prostate cancer [@turnerContemporaryAccuracyDeath2016]. +Despite this, the challenge of multimorbidity in older age groups makes the assignment of cause of death increasingly difficult [@mesleCausesDeathVery2021]. We selected 80 years of age as the upper bound to partially mitigate this effect because it covers a wide age range but does not include the very oldest ages. However, for certain diseases such as dementias where only 96,109 (14.5% of 663,692) of deaths occur in those under 80 years, the probability of dying between birth and 80 years masks variations in the age groups where the majority of deaths occur. There are also specific cases of problematic cause of death assignment in the data. @@ -293,6 +308,7 @@ These contrasting trends in mortality from ischaemic heart disease and dementias The sizeable contribution to the inequality in life expectancy improvement from all other NCDs is more difficult to explain without further stratifying the cause group.
The heterogeneous trends in mortality from both lung cancer, where the probability of dying declined in all districts for men and saw mixed trends for women, and from COPD, where a larger proportion of districts experienced a decrease in mortality for women than for men, reflected that the peak in female smoking rates and smoking-attributable mortality have lagged behind that in men by about 20-30 years [@thunStagesCigaretteEpidemic2012]. +This may also be responsible for the larger contributions of lung cancer and COPD mortality for women to inequality in life expectancy across districts. The geography of the change in mortality from liver cirrhosis – an advanced stage of liver damage – is perhaps indicative of the contrasting dynamics of the two main risk factors for liver cirrhosis: alcohol misuse and hepatitis B/C infection. Alcohol is the main cause of liver disease, and has driven a large proportion of increases in liver cirrhosis throughout Europe [@blachierBurdenLiverDisease2013]. diff --git a/thesis/Chapters/Chapter8.qmd b/thesis/Chapters/Chapter8.qmd index 36f2854..47d2a6e 100644 --- a/thesis/Chapters/Chapter8.qmd +++ b/thesis/Chapters/Chapter8.qmd @@ -1,6 +1,7 @@ # Trends in cancer mortality at the district level {#sec-Chapter8} -The work in this chapter has formed the paper _Inequalities in mortality from leading cancers in districts of England from 2002 to 2019: high-resolution spatiotemporal analysis of vital registration data_, which is under review at _The Lancet Oncology_ and for which I am first author. +The work in this chapter has formed the paper _Mortality from leading cancers in districts of England from 2002 to 2019: a population-based, spatiotemporal study_, published in _The Lancet Oncology_ [@rashidMortalityLeadingCancers2024] and for which I am first author. +The paper was published under the CC-BY license, which permits reproduction provided the original work is properly cited. 
As seen in the previous chapter, mortality from cancers in England has declined more slowly than from other major causes of death such as CVDs, and hence the share of deaths from cancers has steadily increased. In 2019, there were more deaths from cancers (144,306 (28.9% of all deaths in 2019)) than from CVDs 126,105 (25.3%), although this was less than the number of deaths from other NCDs excluding CVDs and cancers (173,323 (34.8%)) which reflects the rise in dementia deaths in recent years. @@ -8,10 +9,15 @@ Compared with CVDs and other NCDs, which were dominated by their major causes (i With over 150 different types of cancer, each with their own anatomical and molecular subtypes, cancer is extremely complex, with a specialist workforce dedicated to each unique cancer. I felt I should go deeper into the cancer story, and pay further attention to a wider array of site-specific cancers. +Subnational data on trends in cancer mortality are currently limited to large areas such as health boards or regions [@arikSocioeconomicDisparitiesCancer2021; @nhsdigitalCancerData2022]. +Small-area data can guide where there is a need for primary prevention strategies to reduce incidence, and for health-care planning and delivery to improve survival. +This evidence is particularly relevant for cancers, because their incidence and survival can be affected by a range of risk factors, screening to detect precancerous lesions and early-stage disease, and effective treatments. + ## Methods The methodology for this section is the same as @sec-Chapter7, but I have stratified cancer groups further. I used the top ten leading cancer causes of death according to the total number of deaths from 2002 to 2019 for cause-specific analysis, as well as a residual group comprising all other cancer deaths. +The residual group also contained deaths from ill-defined diseases, which were assigned as described in @sec-Chapter7. 
There were 2,453,173 deaths from cancers in England from 2002 to 2019; of these, 1,533,703 (62.5%) deaths occurred before 80 years of age (@fig-ch-8-treemap). Of cancer deaths before 80 years of age, 697,953 (45.5%) were deaths in women and 835,750 (54.5%) in men. @@ -103,10 +109,10 @@ The probability of dying from a cancer before 80 years of age declined from 2002 Districts in London achieved the largest declines. Among cause categories, the largest reductions in mortality were for stomach cancer, with the declines in probability of dying before 80 years of age ranging between 39.1% and 57.2% for women across districts, and between 51.5% and 58.8% for men from 2002 to 2019, all with posterior probability >0.99. The probability of dying from oesophageal cancer also decreased in every district for women, but varied for men from a 42.9% (27.7% to 56.0%) decrease in Plymouth to a 7.0% (-19.8% to 42.4%) increase in Gosport (@fig-ch-8-change and @fig-app-e-map-oesophagus). -The posterior probability that the observed decline was a true decline was >0.80 in 242 (77.1%) districts for oesophageal cancer in men. +The posterior probability that the district saw a decline in mortality from oesophageal cancer was >0.80 in 242 (77.1%) districts in men. Lung cancer mortality decreased everywhere for men with posterior probability >0.99, but for women, there were mixed trends. The largest declines were in London, the strongest seen in Newham (decrease of 29.5% (18.5% to 38.8%)), whereas the probability of dying increased in many districts in the East of England, with the largest increase of 27.0% (6.8% to 49.7%) in Tendring (@fig-ch-8-change and @fig-app-e-map-lung). -The posterior probability that the observed decline for lung cancer in women was a true decline was >0.80 in 197 (62.7%) districts. +The posterior probability that the district observed a decline in mortality from lung cancer in women was >0.80 in 197 (62.7%) districts. 
Women in 4 (1.3%) districts experienced an increase in lung cancer mortality with a posterior probability >0.80, and in the remaining 113 (36.0%) districts there was no clear trend at this level of posterior probability. ::: {#fig-ch-8-change layout-ncol=1 fig-scap="Ranked change in probability of dying between birth and 80 years of age in 314 local authority districts in England in 2002 and 2019 for the ten leading cancers."} @@ -151,8 +157,9 @@ Two studies investigated trends in cancer mortality, although with limited resol Both studies similarly found that areas in London were exceptional, with generally lower rates of premature cancer mortality than expected for their level of deprivation [@steelChangesHealthCountries2018], attenuated inequality between the most and least deprived deciles, and lower mortality for prostate cancer [@arikSocioeconomicDisparitiesCancer2021; @arikUnevenOutcomesFindings2022]. The latter study, which only focussed on the four leading cancers, reported that in the worst performing regions inequality between the top and bottom deciles in female lung cancer mortality was similar to the 3.7-fold gap between the top and bottom districts seen in this study. -Elsewhere, cancer atlases have been limited to coarse geographical units and aggregate multiple years of data rather than presenting information on trends. -For smaller areas, a study in the USA at the county level found that lung cancer and stomach cancer were among the cancers with the largest inequalities in mortality across counties in 2014, but there was limited variation for leukaemia, lymphoma and multiple myeloma, which is consistent with the results presented here [@mokdadTrendsPatternsDisparities2017]. +Beyond the UK, cancer atlases have been largely limited to coarse geographical units and aggregate multiple years of data rather than presenting information on trends. 
+There have been examples of cancer atlases for smaller areas for Spain and Portugal [@fernandez-navarroAtlasCancerMortality2021] and Australia [@cancercouncilqueenslandAustralianCancerAtlas2018]. +A study in the USA on trends in cancer mortality at the county level found that lung cancer and stomach cancer were among the cancers with the largest inequalities in mortality across counties in 2014, but there was limited variation for leukaemia, lymphoma and multiple myeloma, which is consistent with the results presented here [@mokdadTrendsPatternsDisparities2017]. The study also found that the strongest increases over time were in liver cancer mortality, with nearly all counties seeing an increase. ### Explaining the variation and implications diff --git a/thesis/Chapters/Chapter9.qmd b/thesis/Chapters/Chapter9.qmd index d2ac5dd..2d265a8 100644 --- a/thesis/Chapters/Chapter9.qmd +++ b/thesis/Chapters/Chapter9.qmd @@ -9,7 +9,9 @@ Although the complexity of policy means it cannot be proved causally, leaders wi The declines in life expectancy were sustained over a long period of time, which serves as another example where death rates for some population subgroups run contrary to the persistent mortality decline of the third stage of the Epidemiologic Transition theory, as discussed in @gaylinRefocusingLensEpidemiologic1997 with the HIV/AIDS pandemic. Even if England is in the hypothesised fourth stage of the transition, the Age of Delayed Degenerative diseases [@olshanskyFourthStageEpidemiologic1986], there are subnational patterns where degenerative diseases are *not* killing at later and later ages. -The difference in progress between districts in the last decade was largely driven by differences in these degenerative diseases - in particular, the rate of improvement for CVDs and all other NCDs, and the strength of the negative forcing effect of Alzheimer's and other dementias. 
+This supports the growing evidence that Omran's theory acts only as a useful heuristic, and that there is a large amount of variability, particularly in the latter stages of the transition, between broad geographical regions [@mackenbachOmranEpidemiologicTransition2022; @sudharsananLargeVariationEpidemiological2022]. + +The difference in progress between districts in the last decade was largely driven by differences in degenerative diseases - in particular, the rate of improvement for CVDs and all other NCDs, and the strength of the negative forcing effect of Alzheimer's and other dementias. Furthermore, female mortality from infectious, maternal, perinatal and nutritional conditions (GBD group 1), which dominate the second stage of the transition, increased in many districts. There is also a worrying shift towards injuries (GBD group 3) contributing negatively towards life expectancy progress, particularly for men. This is possibly driven by a rise in "deaths of despair" [@angusIncreasesDeathsDespair2023; @caseRisingMorbidityMortality2015], although this would require further analysis by separating intentional and unintentional injuries. @@ -31,7 +33,7 @@ Nevertheless, the estimates from this thesis are already being used in the press There are many possible methodological and substantive extensions to the work presented in this thesis. Firstly, on the methodological side, with improvements both to hardware and the rise of approximate inference algorithms to replace computationally costly sampling methods, the models can, in theory, be scaled to higher and higher spatial resolutions and we could potentially estimate mortality for the entire country for LSOAs, OAs, or even postcodes. -However, given the data issues in @sec-Chapter5, perhaps smaller is not better when the quality of the data is lacking. 
+However, given the data issues in @sec-Chapter5 and the very wide credible intervals seen for some LSOAs with high life expectancies, perhaps smaller is not better when the quality of the data is lacking. One of the major strengths of the thesis was the use of Bayesian methods, at one of the highest spatial resolutions in the literature for a model estimating mortality. This was largely thanks to recent developments in probabilistic programming, which allow sampling algorithms to run on GPUs rather than CPUs, and is generally faster for models with over 10,000 parameters [@laoTfpMcmcModern2020]. diff --git a/thesis/_thesis/Appendices/AppendixA.html b/thesis/_thesis/Appendices/AppendixA.html index c421e7a..251ff69 100644 --- a/thesis/_thesis/Appendices/AppendixA.html +++ b/thesis/_thesis/Appendices/AppendixA.html @@ -252,49 +252,61 @@

Appendix

A.1 Period life tables

Calculating life expectancy for a cohort is possible, but you have to wait until every member of the cohort has died. Instead, demographers use period (or “current”) life tables, which consider what would happen to a hypothetical cohort that is subjected to the death rates in each age group in a given period in time. Life tables can be constructed using discrete age bands starting at age \(x\) and ending at age \(x+n\). We supply the age-specific death rates, \({}_{n}m_{x}\), and the average person-years lived by those dying in the interval, \({}_{n}a_{x}\), and the life table calculates the mean age at death – the life expectancy, \(e_x\).

-

We start with a hypothetical cohort of size \(l_0 = 100,000\) and sequentially apply the probability of dying in each age group, calculated as \[ +

The probability of dying, \({}_{n}q_{x}\), is defined as the ratio of the number of people who died in the age interval, \({}_{n}d_{x}\), to the number who survived to age \(x\), \(l_x\): \[ +{}_{n}q_{x} = \frac{{}_{n}d_{x}}{l_x}. +\tag{A.1}\]

+

The age-specific death rate is defined as the ratio of the number of people who died in the age interval to the total number of person-years lived in that interval, \({}_{n}L_{x}\). The person-years lived are the sum of two parts: the \(n\) years lived by each of the \(l_x - {}_{n}d_{x}\) people who survived the interval, and the \({}_{n}a_{x}\) years lived on average by each of the \({}_{n}d_{x}\) people who died in it: \[ +{}_{n}m_{x} = \frac{{}_{n}d_{x}}{n \cdot (l_x - {}_{n}d_{x}) + {}_{n}a_{x} \cdot {}_{n}d_{x}}. +\tag{A.2}\] We assume the denominator of Equation A.2 can be approximated by the mid-year population, \({}_{n}P_{x}\), which leads us to recover the expression for the cross-sectional, empirical death rate in Equation 4.1. By rearranging the denominator to make the number of survivors the subject, we obtain \[ +l_x = \frac{1}{n} \left({}_{n}P_{x} + (n - {}_{n}a_{x}) \cdot {}_{n}d_{x}\right). +\tag{A.3}\] We can substitute this expression into Equation A.1 and divide the numerator and denominator by \({}_{n}P_{x}\) to obtain
+\[ {}_{n}q_{x} = \frac{n \cdot {}_{n}m_{x}}{1 + (n - {}_{n}a_{x}) {}_{n}m_{x}}. -\tag{A.1}\] The open interval \({}_{\infty}q_{x} = 1\), as nobody is immortal. Using the probability of surviving in each age group, \({}_{n}p_{x} = 1 - {}_{n}q_{x}\), the number of survivors is given by \[ +\tag{A.4}\] This expression, although unintuitive, allows us to convert from \({}_{n}m_{x}\) to \({}_{n}q_{x}\) with only the parameter \({}_{n}a_{x}\).
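
As a quick numerical check of the conversion from death rate to probability of dying, consider illustrative values that are not taken from the thesis data: a five-year band (\(n = 5\)), deaths occurring on average halfway through the band (\({}_{n}a_{x} = 2.5\)) and a death rate of \({}_{n}m_{x} = 0.01\) per person-year. Then \[
{}_{5}q_{x} = \frac{5 \times 0.01}{1 + (5 - 2.5) \times 0.01} = \frac{0.05}{1.025} \approx 0.0488,
\] which is slightly below \(n \cdot {}_{n}m_{x} = 0.05\), because those who die in the band contribute fewer than \(n\) person-years to the denominator of the death rate.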

+

In the period life table, we start with a hypothetical cohort of size \(l_0 = 100,000\) and sequentially apply the probability of surviving in each age group, \({}_{n}p_{x} = 1 - {}_{n}q_{x}\), to calculate the number of survivors as \[ l_{x+n} = l_x \cdot {}_{n}p_{x}. -\tag{A.2}\]

-

The number of person-years lived is the sum of the number of survivors weighted by the band width and number of people who died weighted by \({}_{n}a_{x}\) \[ -{}_{n}L_{x} = n \cdot l_x + {}_{n}a_{x} \cdot l_{x} \cdot {}_{n}q_{x} \quad {}_{\infty}L_{x} = \frac{l_x}{{}_{\infty}m_{x}}, -\tag{A.3}\]

-

and the total number of person-years lived above \(x\) is \[ +\tag{A.5}\]

+

The number of person-years lived is the sum of the number of survivors to the end of the interval, \(l_{x+n}\), weighted by the band width and the number of people who died (\({}_{n}d_{x} = l_{x} \cdot {}_{n}q_{x}\)) weighted by \({}_{n}a_{x}\) \[ +{}_{n}L_{x} = n \cdot l_{x+n} + {}_{n}a_{x} \cdot l_{x} \cdot {}_{n}q_{x}. +\tag{A.6}\]

+

The open interval \({}_{\infty}q_{x} = 1\), as nobody is immortal. Using Equation A.1, it follows that the number of deaths in this interval is equal to the number who survived to the final age group, i.e. \({{}_\infty}d_{x} = l_x\). Since the death rate from Equation A.2 can be rewritten using the number of person-years lived, \({}_{n}L_{x}\), as the denominator and we can substitute the number of deaths with the number surviving to the final age group, we can obtain an expression for the number of person-years lived in the open-ended age interval \[ +{}_{\infty}L_{x} = \frac{{}_{\infty}d_{x}}{{}_{\infty}m_{x}} = \frac{l_x}{{}_{\infty}m_{x}}. +\tag{A.7}\]

+

The total number of person-years lived above \(x\) is \[ T_{x} = \sum^{\infty}_{y = x} {}_{n}L_{y}. -\tag{A.4}\]

-

Then, life expectancy is given by dividing the number of person-years lived by the number of people who will live them \[ +\tag{A.8}\]

+

Then, life expectancy is given by dividing the number of person-years lived by the number of people who will live them \[ e_x = \frac{T_x}{l_x}. -\tag{A.5}\]

+\tag{A.9}\]

Throughout the thesis, I only consider life expectancy at birth.
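
The life table steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the code used for the thesis: the function name `period_life_table`, its arguments and the returned dictionary are hypothetical, and person-years are computed as in the denominator of the death rate definition (\(n\) years for each survivor of the band plus \({}_{n}a_{x}\) years for each death).

```python
import numpy as np


def period_life_table(m, a, n, radix=100_000.0):
    """Sketch of a period life table (hypothetical helper, not the thesis code).

    m, a, n are 1-D sequences over age bands: death rates, mean years lived
    in the band by those dying, and band widths. The last band is treated as
    open-ended, so the last entries of a and n are ignored (pass any finite
    placeholder).
    """
    m = np.asarray(m, dtype=float)
    a = np.asarray(a, dtype=float)
    n = np.asarray(n, dtype=float)

    # Convert death rates to probabilities of dying in each band.
    q = n * m / (1.0 + (n - a) * m)
    q[-1] = 1.0  # open-ended band: everyone eventually dies

    # Survivors to the start of each band, starting from the radix l_0.
    p = 1.0 - q
    l = radix * np.concatenate(([1.0], np.cumprod(p[:-1])))
    d = l * q  # deaths in each band

    # Person-years: n years per survivor of the band, a years per death.
    L = n * (l - d) + a * d
    L[-1] = l[-1] / m[-1]  # open-ended band

    # Person-years lived above each age, and life expectancy at each age.
    T = np.cumsum(L[::-1])[::-1]
    e = T / l
    return {"q": q, "l": l, "d": d, "L": L, "T": T, "e": e}
```

Life expectancy at birth is then the first element of `e`. A useful sanity check on any implementation is that dividing `d` by `L` recovers the input death rates.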

A.1.1 The very young ages and the very old ages

-

On average, it is a good approximation to assume deaths occur halfway through the age interval: \({}_{n}a_{x} = n /2\). But for younger ages, particularly at lower levels of mortality, the majority of infant deaths lie further towards the earliest stages of infancy. Coale and Demeny used regression on a series of international datasets to recommend suitable values for \({}_{1}a_{0}\) and \({}_{4}a_{1}\) instead of the midpoint (Coale et al., 1983).

-

The start of the open age group can be many years away from some of the ages at death, particularly in ageing populations. In order to produce reliable estimates of death rates at high ages, I used the Kannisto-­Thatcher method to expand the terminal age group (\(\geq 85\) years) of the life table and adjust \({}_{n}a_{x}\) above 70 years (Thatcher et al., 2002).

+

On average, it is a good approximation to assume deaths occur halfway through the age interval: \({}_{n}a_{x} = n / 2\). But for younger ages, particularly at lower levels of mortality, the majority of infant deaths lie further towards the earliest stages of infancy. Coale and Demeny used regression on a series of international datasets to recommend suitable values for \({}_{1}a_{0}\) and \({}_{4}a_{1}\) instead of the midpoint (Coale et al., 1983).

+

The start of the open-ended age group can be many years away from some of the ages at death, particularly in ageing populations. In order to produce reliable estimates of death rates at older ages, I used the Kannisto-Thatcher method to expand the terminal age group (\(\geq 85\) years) of the life table and adjust \({}_{n}a_{x}\) above 70 years (Thatcher et al., 2002). The Kannisto-Thatcher method assumes the probability of dying is a logistic function of age. The logit-transformed probability of dying above 70 years is regressed upon age. The fitted curve is extrapolated through to 129 years, and the number of survivors in the cohort under these adjusted probabilities of dying is then used to estimate \({}_{n}a_{x}\) above 70 years.
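
The regression and extrapolation step can be illustrated as follows. This is a rough sketch under stated assumptions, not the thesis implementation: it fits an unweighted straight line to the logit of the probability of dying with `numpy.polyfit`, and the function name, the five-year `step` and the choice of ages passed in are all hypothetical. The subsequent re-estimation of \({}_{n}a_{x}\) from the extrapolated survivors is not shown.

```python
import numpy as np


def extrapolate_logit_q(ages, q, max_age=129, step=5):
    """Fit logit(q) = alpha + beta * age and extrapolate (illustrative only).

    ages: start ages of the closed bands used for fitting (e.g. 70, 75, 80).
    q:    probabilities of dying in those bands, strictly between 0 and 1,
          so the open-ended band (q = 1) must be excluded.
    """
    ages = np.asarray(ages, dtype=float)
    q = np.asarray(q, dtype=float)

    # Logit transform, then an ordinary least-squares line in age.
    logit_q = np.log(q / (1.0 - q))
    beta, alpha = np.polyfit(ages, logit_q, deg=1)

    # Evaluate the fitted logistic curve on a grid up to max_age.
    new_ages = np.arange(ages[0], max_age + 1, step)
    new_q = 1.0 / (1.0 + np.exp(-(alpha + beta * new_ages)))
    return new_ages, new_q
```

The extrapolated probabilities could then be passed through the survivorship recursion of the life table to obtain survivors at each extended age.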

A.2 Probability of dying

-

The probability of dying from a specific cause of death, \(i\), is calculated as in Equation A.1. Equally, we can subtract the probability of surviving to that age group, \(1 - \prod_x {}_{n}p^i_{x}\). Note, even for the smallest death rates, \({}_{\infty}q^i_{x} = 1\) – if you live to infinity, you’ll die of it eventually.

+

The probability of dying from a specific cause of death, \(i\), is calculated as in Equation A.4. Equally, we can calculate the probability of dying by subtracting the probability of surviving in each age group through to that age from unity, i.e. \(1 - \prod_x {}_{n}p^i_{x}\). Note, even for the smallest death rates, \({}_{\infty}q^i_{x} = 1\) – if you live to infinity, you’ll die of it eventually.
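
The product form of the probability of dying can be sketched as below. The helper name and arguments are hypothetical. The bands passed in are assumed to be the closed bands up to the age of interest (for example, birth to 80 years), with cause-specific death rates converted to probabilities in the same way as for all-cause mortality.

```python
import numpy as np


def prob_dying_before(m, a, n):
    """Probability of dying across a run of closed age bands (illustrative).

    m: death rates for the cause of interest in each band
    a: mean years lived in the band by those dying
    n: band widths
    """
    m, a, n = (np.asarray(v, dtype=float) for v in (m, a, n))
    q = n * m / (1.0 + (n - a) * m)  # probability of dying in each band
    return 1.0 - np.prod(1.0 - q)    # one minus probability of surviving all bands
```

For example, `prob_dying_before(m, a, n)` with the bands covering birth to 80 years gives the probability of dying from that cause before 80 years of age.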

A.3 Cause-specific decomposition of differences in life expectancy

-

Arriaga (1984) proposed a method to calculate the age-specific contributions to the difference in life expectancy between two populations as \[ +

Using quantities generated from the life tables of two populations as above, Arriaga (1984) proposed a method to calculate the age-specific contributions to the difference in life expectancy between these populations as \[ {}_{n}\Delta_{x} = \frac{l^1_x}{l^1_0} \left( \frac{{}_{n}L^2_{x}}{l^2_x} - \frac{{}_{n}L^1_{x}}{l^1_x} \right) + \frac{T^2_{x+n}}{l^1_0} \left( \frac{l^1_x}{l^2_x} - \frac{l^1_{x+n}}{l^2_{x+n}} \right). -\tag{A.6}\]

-

We then assume the age- and cause-specific contributions are proportional to the difference in cause-specific death rates: \[ +\tag{A.10}\] The first term on the right-hand side is the “direct effect”: the contribution to the life expectancy difference of the gap between the two populations in the average number of person-years lived within the age group by those who survive to it (\({}_{n}L_{x} / l_x\)). The second term is the “indirect effect”: the contribution of the difference in the number of survivors at the end of the age group, caused by the mortality differences within it, who go on to live through the subsequent ages.

+

We then assume the age- and cause-specific contributions are proportional to the difference in cause-specific death rates between the two populations: \[ {}_{n}\Delta^i_{x} = {}_{n}\Delta_{x} \cdot \frac{{}_{n}m^i_{x}(2) - {}_{n}m^i_{x}(1)}{{}_{n}m_{x}(2) - {}_{n}m_{x}(1)} -\tag{A.7}\]

+\tag{A.11}\]

Arriaga showed the sum of the age- and cause-specific contributions is equal to the difference in life expectancy, \[ e_0(2) - e_0(1) = \sum_x {}_{n}\Delta_{x} = \sum_x \sum_i {}_{n}\Delta^i_{x}. -\tag{A.8}\]

+\tag{A.12}\]

So, we can collapse over age groups to get the cause-specific contributions to life expectancy as \(\sum_x {}_{n}\Delta^i_{x}\).
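
The decomposition can be sketched as follows, given the life table columns of the two populations. This is an illustration rather than the thesis code, and it makes two assumptions: for the open-ended band the indirect term is set to zero (no person-years are lived beyond it), and the cause-specific death rates have already been scaled to sum to the all-cause rate, so the all-cause difference is taken as the sum of the cause-specific differences. The function names and array layouts are hypothetical.

```python
import numpy as np


def arriaga_age_contributions(l1, L1, T1, l2, L2, T2):
    """Age-specific contributions to e_0(2) - e_0(1) (illustrative sketch).

    Each argument is a 1-D sequence over age bands taken from the life table
    of population 1 or 2; the last band is open-ended, so L equals T there.
    """
    l1, L1, T1, l2, L2, T2 = (
        np.asarray(v, dtype=float) for v in (l1, L1, T1, l2, L2, T2)
    )
    # Person-years lived above the end of each band in population 2
    # (zero beyond the open-ended band).
    T2_next = np.append(T2[1:], 0.0)
    # Survivor ratios at the start and end of each band; the final end-of-band
    # ratio is a placeholder because it is multiplied by T2_next = 0.
    ratio = l1 / l2
    ratio_next = np.append(ratio[1:], 0.0)

    direct = l1 / l1[0] * (L2 / l2 - L1 / l1)
    indirect = T2_next / l1[0] * (ratio - ratio_next)
    return direct + indirect


def arriaga_cause_contributions(delta, m1_by_cause, m2_by_cause):
    """Split age contributions by cause, in proportion to rate differences.

    m1_by_cause, m2_by_cause: arrays of shape (n_causes, n_ages).
    Bands with no all-cause difference are given zero contributions.
    """
    m1 = np.asarray(m1_by_cause, dtype=float)
    m2 = np.asarray(m2_by_cause, dtype=float)
    diff_cause = m2 - m1
    diff_all = diff_cause.sum(axis=0)
    share = np.divide(
        diff_cause, diff_all, out=np.zeros_like(diff_cause), where=diff_all != 0
    )
    return np.asarray(delta, dtype=float) * share
```

Summing the output of `arriaga_cause_contributions` over the age axis gives the cause-specific contributions, and summing the output of `arriaga_age_contributions` over all bands should recover the difference in life expectancy at birth.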

A.4 Mean age at death

The mean age at death among those who died from a specific cause of death was calculated as \[ \text{mean age at death} = \frac{\sum_x {}_{n}d_{x} \cdot (x + {}_{n}a_{x})}{\sum_x {}_{n}d_{x}}, -\tag{A.9}\] where \({}_{n}d_{x}\) is the number of deaths in an age band, calculated as the product of the death rate and the population.

+\tag{A.13}\] where \({}_{n}d_{x}\) is the number of deaths in an age band, calculated as the product of the death rate and the population.

2.2.2 Applications of disease mapping methods

Small-area analyses of mortality

-

In order to compare the health status between areas, health authorities require a measure of mortality that collapses age-specific information into a single number. Indirectly standardised measures such as the standardised mortality ratio – the ratio between total deaths and expected deaths in an area – are easy to calculate, but are not easily understood by laypeople. Directly standardised methods, in contrast, require knowledge of the full age structure of death rates rather than just the total number of deaths. Age-standardised death rates, however, suffer the same interpretability issue as the standardised mortality ratio, and are only comparable between studies if the same reference population is used. An alternative choice is life expectancy. Silcocks et al. (2001) explain that life expectancy is a “more intuitive and immediate measure of the mortality experience of a population, [and] is likely to have greater impact… than other measures that are incomprehensible to most people.”

+

In order to compare the health status between areas, health authorities require a measure of mortality that collapses age-specific information into a single number. Indirectly standardised measures such as the standardised mortality ratio – the ratio between total deaths and expected deaths in an area – are easy to calculate, but are not easily understood by laypeople. Directly standardised methods, in contrast, require knowledge of the full age structure of death rates rather than just the total number of deaths. Age-standardised death rates, however, suffer the same interpretability issue as the standardised mortality ratio, and are only comparable between studies if the same reference population is used. An alternative choice is life expectancy. Silcocks et al. (2001) explain that life expectancy is a “more intuitive and immediate measure of the mortality experience of a population, [and] is likely to have greater impact… than other measures that are incomprehensible to most people.” However, although the metric appears more interpretable, life expectancy at birth constructed from a period life table is often misinterpreted as the mean length of life of the cohort into which the newborn is born. In fact, it measures the expectation of life assuming that the newborn will be exposed to age-specific mortality conditions throughout their life that are exactly the same as the current population.

The estimation of death rates requires two data sources: death counts and populations. Modern death registration systems, such as that of the UK, are almost entirely complete and accurate. On the other hand, although usually treated as a known quantity, the population denominator is often problematic. Populations for small geographies are only recorded during a decennial census, and estimates are generated for the years in between using limited survey data on births, deaths and migration. And although the census is considered the “gold standard”, it is subject to enumeration errors, particularly for areas with special populations such as students or armed forces (Elliott et al., 2001b).

Beyond the population issue, finer scale studies are restricted by data availability. Where data are available, there is still the need to overcome small number issues before feeding death rates through the life table to calculate life expectancy. Eayres and Williams (2004) recommend a minimum population size of 5000 when using traditional life table methods, below which the calculation of life expectancy is unstable, or the error estimates become so large that any comparison between subgroups becomes meaningless. One approach, often taken by statistical agencies, is to build larger populations by either aggregating multiple years of data (Bahk et al., 2020; Office for National Statistics, 2015; Public Health England, 2021) or combining spatial units (Ezzati et al., 2008). Here, we focus on studies using Bayesian hierarchical models to generate robust estimates of age-specific death rates by recognising the correlations between spatial units and age groups, which produce more accurate estimates for small population studies of life expectancy (Congdon, 2009; Jonker et al., 2012).
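The step of feeding death rates through the life table can be sketched as follows. This is a minimal illustration rather than the code used in the thesis: the function name `period_life_table` and its arguments are hypothetical, the conversion from death rates to probabilities of dying follows the expression in Appendix A, and person-years in each closed interval are taken as $n$ times the survivors to the end of the interval plus ${}_{n}a_{x}$ times the deaths, matching the denominator of the death rate definition.

```python
import numpy as np


def period_life_table(ages, mx, ax, radix=100_000.0):
    """Life expectancy e_x at the start of each age band (illustrative sketch).

    ages : start age of each band; the last band is open-ended
    mx   : age-specific death rates, one per band (the last must be > 0)
    ax   : mean person-years lived in the band by those dying in it
           (the entry for the open band is ignored)
    """
    ages = np.asarray(ages, dtype=float)
    mx = np.asarray(mx, dtype=float)
    ax = np.asarray(ax, dtype=float)
    n = np.diff(ages)  # widths of the closed bands

    # Probability of dying: q = n*m / (1 + (n - a)*m); q = 1 in the open band
    qx = np.ones_like(mx)
    qx[:-1] = n * mx[:-1] / (1.0 + (n - ax[:-1]) * mx[:-1])

    # Survivors: l_{x+n} = l_x * (1 - q_x), starting from the radix l_0
    lx = radix * np.concatenate(([1.0], np.cumprod(1.0 - qx[:-1])))
    dx = lx * qx  # deaths in each band

    # Person-years lived: survivors contribute n years, deaths contribute a_x
    Lx = np.empty_like(mx)
    Lx[:-1] = n * (lx[:-1] - dx[:-1]) + ax[:-1] * dx[:-1]
    Lx[-1] = lx[-1] / mx[-1]  # open band

    # Person-years lived above age x, divided by survivors to age x
    Tx = np.cumsum(Lx[::-1])[::-1]
    return Tx / lx
```

With small populations, the observed death rates in some age bands are zero or highly variable, so life expectancy computed this way from raw data inherits that instability.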

Jonker et al. (2012) demonstrated the advantages of the Bayesian approach for 89 small areas in Rotterdam using a joint model for sex, space and age effects, finding an 8.2 year and a 9.2 year gap between the neighbourhoods with the highest and lowest life expectancies for women and men, respectively. Stephens et al. (2013) employed the same model for 153 administrative areas in New South Wales, Australia.

@@ -358,7 +358,7 @@

Cla

Spatial inequality

-

In 2015, the GBD study released its first subnational estimates of mortality, starting with the UK and Japan. Steel et al. (2018) assessed these data, which divided the UK into 150 regions, finding mortality from all-causes varied twofold across the country, with the highest years of life lost in Blackpool and the lowest in Wokingham. In a study on forecasting subnational life expectancy in England and Wales, Bennett et al. (2015) estimated a 8.2 year range in life expectancy for men and 7.1 year range for women in 2012 between 375 districts. The lowest life expectancies were seen in urban northern England, and the highest in the south and London’s affluent districts. Within London itself, male and female life expectancy showed 5-6 years of variation.

+

In 2015, the GBD study released its first subnational estimates of mortality, starting with the UK and Japan. Steel et al. (2018) assessed these data, which divided the UK into 150 regions, finding that all-cause mortality varied twofold across the country, with the highest years of life lost in Blackpool and the lowest in Wokingham. In a study on forecasting subnational life expectancy in England and Wales, Bennett et al. (2015) estimated an 8.2 year range in life expectancy for men and a 7.1 year range for women in 2012 between 375 districts. The lowest life expectancies were seen in urban northern England, and the highest in the south and London’s affluent districts. Within London itself, Cheshire (2012) visualised the heterogeneity of mortality by assigning tube stops the life expectancy of the nearest ward, revealing that 10 years are lost between two consecutive stops, Canary Wharf and North Greenwich, on the Jubilee line.

Deprivation

@@ -388,7 +388,7 @@

10.1016/j.puhe.2023.02.019
-Asaria P, Fortunato L, Fecht D, Tzoulaki I, Abellan JJ, Hambly P, de Hoogh K, Ezzati M, Elliott P. 2012. Trends and inequalities in cardiovascular disease mortality across 7932 English electoral wards, 1982: Bayesian spatial analysis. International Journal of Epidemiology 41:1737–1749. doi:10.1093/ije/dys151 +Asaria P, Fortunato L, Fecht D, Tzoulaki I, Abellan JJ, Hambly P, de Hoogh K, Ezzati M, Elliott P. 2012. Trends and inequalities in cardiovascular disease mortality across 7932 English electoral wards, 1982–2006: Bayesian spatial analysis. International Journal of Epidemiology 41:1737–1749. doi:10.1093/ije/dys151
Aylin P, Maheswaran R, Wakefield J, Cockings S, Jarup L, Arnold R, Wheeler G, Elliott P. 1999. A national facility for small area disease mapping and rapid initial assessment of apparent disease clusters around a point source: The UK Small Area Health Statistics Unit. Journal of Public Health 21:289–298. doi:10.1093/pubmed/21.3.289 @@ -414,6 +414,9 @@

Bilal U, Alazraqui M, Caiaffa WT, Lopez-Olmedo N, Martinez-Folgar K, Miranda JJ, Rodriguez DA, Vives A, Diez-Roux AV. 2019. Inequalities in life expectancy in six large Latin American cities from the SALURBAL study: An ecological analysis. The Lancet Planetary Health 3:e503–e510. doi:10.1016/S2542-5196(19)30235-9

+
+Cheshire J. 2012. Featured Graphic. Lives on the Line: Mapping Life Expectancy along the London Tube Network. Environment and Planning A: Economy and Space 44:1525–1528. doi:10.1068/a45341 +
Congdon P. 2014. Modelling changes in small area disability free life expectancy: Trends in London wards between 2001 and 2011. Statistics in Medicine 33:5138–5150. doi:10.1002/sim.6298
@@ -430,7 +433,7 @@

10.1001/jamainternmed.2017.0918
-Dwyer-Lindgren L, Stubbs RW, Bertozzi-Villa A, Morozoff C, Callender C, Finegold SB, Shirude S, Flaxman AD, Laurent A, Kern E, Duchin JS, Fleming D, Mokdad AH, Murray CJL. 2017b. Variation in life expectancy and mortality by cause among neighbourhoods in King County, WA, USA, 1990: A census tract-level analysis for the Global Burden of Disease Study 2015. The Lancet Public Health 2:e400–e410. doi:10.1016/S2468-2667(17)30165-2 +Dwyer-Lindgren L, Stubbs RW, Bertozzi-Villa A, Morozoff C, Callender C, Finegold SB, Shirude S, Flaxman AD, Laurent A, Kern E, Duchin JS, Fleming D, Mokdad AH, Murray CJL. 2017b. Variation in life expectancy and mortality by cause among neighbourhoods in King County, WA, USA, 1990–2014: A census tract-level analysis for the Global Burden of Disease Study 2015. The Lancet Public Health 2:e400–e410. doi:10.1016/S2468-2667(17)30165-2
Eayres D, Williams ES. 2004. Evaluation of methodologies for small area life expectancy estimation. Journal of Epidemiology & Community Health 58:243–249. doi:10.1136/jech.2003.009654 @@ -481,7 +484,7 @@

10.1191/0962280205sm389oa

-Hiam L, Dorling D, McKee M. 2020. Things Fall Apart: The British Health Crisis 2010. British Medical Bulletin 133:4–15. doi:10.1093/bmb/ldz041 +Hiam L, Dorling D, McKee M. 2020. Things Fall Apart: The British Health Crisis 2010–2020. British Medical Bulletin 133:4–15. doi:10.1093/bmb/ldz041
Hiam L, Harrison D, McKee M, Dorling D. 2018. Why is life expectancy in England and Wales “stalling”? J Epidemiol Community Health 72:404–408. doi:10.1136/jech-2017-210401 @@ -508,7 +511,7 @@

10.1198/016214502388618438

-Knorr-Held L. 2000. Bayesian modelling of inseparable space-time variation in disease risk. Statistics in Medicine 19:2555–2567. doi:10.1002/1097-0258(20000915/30)19:17/18{$<$}2555::AID-SIM587{$>$}3.0.CO;2-\%23 +Knorr-Held L. 2000. Bayesian modelling of inseparable space-time variation in disease risk. Statistics in Medicine 19:2555–2567. doi:10.1002/1097-0258(20000915/30)19:17/18<2555::AID-SIM587>3.0.CO;2-%23
Knorr-Held L, Best NG. 2001. A Shared Component Model for Detecting Joint and Selective Clustering of Two Diseases. Journal of the Royal Statistical Society Series A (Statistics in Society) 164:73–85. @@ -520,7 +523,7 @@

10.1038/s41591-020-1112-0

-Leon DA, Jdanov DA, Shkolnikov VM. 2019. Trends in life expectancy and age-specific mortality in England and Wales, 1970, in comparison with a set of 22 high-income countries: An analysis of vital statistics data. The Lancet Public Health 4:e575–e582. doi:10.1016/S2468-2667(19)30177-X +Leon DA, Jdanov DA, Shkolnikov VM. 2019. Trends in life expectancy and age-specific mortality in England and Wales, 1970–2016, in comparison with a set of 22 high-income countries: An analysis of vital statistics data. The Lancet Public Health 4:e575–e582. doi:10.1016/S2468-2667(19)30177-X
Mahaki B, Mehrabi Y, Kavousi A, Schmid VJ. 2018. Joint Spatio-temporal Shared Component Model with an Application in Iran Cancer Data. Asian Pacific Journal of Cancer Prevention 19:1553–1560. doi:10.22034/APJCP.2018.19.6.1553 @@ -586,13 +589,13 @@

10.1136/jech.55.1.38

-Steel N, Ford JA, Newton JN, Davis ACJ, Vos T, Naghavi M, Glenn S, Hughes A, Dalton AM, Stockton D, Humphreys C, Dallat M, Schmidt J, Flowers J, Fox S, Abubakar I, Aldridge RW, Baker A, Brayne C, Brugha T, Capewell S, Car J, Cooper C, Ezzati M, Fitzpatrick J, Greaves F, Hay R, Hay S, Kee F, Larson HJ, Lyons RA, Majeed A, McKee M, Rawaf S, Rutter H, Saxena S, Sheikh A, Smeeth L, Viner RM, Vollset SE, Williams HC, Wolfe C, Woolf A, Murray CJL. 2018. Changes in health in the countries of the UK and 150 English Local Authority areas 1990: A systematic analysis for the Global Burden of Disease Study 2016. The Lancet 392:1647–1661. doi:10.1016/S0140-6736(18)32207-4 +Steel N, Ford JA, Newton JN, Davis ACJ, Vos T, Naghavi M, Glenn S, Hughes A, Dalton AM, Stockton D, Humphreys C, Dallat M, Schmidt J, Flowers J, Fox S, Abubakar I, Aldridge RW, Baker A, Brayne C, Brugha T, Capewell S, Car J, Cooper C, Ezzati M, Fitzpatrick J, Greaves F, Hay R, Hay S, Kee F, Larson HJ, Lyons RA, Majeed A, McKee M, Rawaf S, Rutter H, Saxena S, Sheikh A, Smeeth L, Viner RM, Vollset SE, Williams HC, Wolfe C, Woolf A, Murray CJL. 2018. Changes in health in the countries of the UK and 150 English Local Authority areas 1990–2016: A systematic analysis for the Global Burden of Disease Study 2016. The Lancet 392:1647–1661. doi:10.1016/S0140-6736(18)32207-4
Stephens AS, Purdie S, Yang B, Moore H. 2013. Life expectancy estimation in small administrative areas with non-uniform population sizes: Application to Australian New South Wales local government areas. BMJ Open 3:e003710. doi:10.1136/bmjopen-2013-003710
-Taylor-Robinson D, Lai ETC, Wickham S, Rose T, Norman P, Bambra C, Whitehead M, Barr B. 2019. Assessing the impact of rising child poverty on the unprecedented rise in infant mortality in England, 2000: Time trend analysis. BMJ Open 9:e029424. doi:10.1136/bmjopen-2019-029424 +Taylor-Robinson D, Lai ETC, Wickham S, Rose T, Norman P, Bambra C, Whitehead M, Barr B. 2019. Assessing the impact of rising child poverty on the unprecedented rise in infant mortality in England, 2000–2017: Time trend analysis. BMJ Open 9:e029424. doi:10.1136/bmjopen-2019-029424
The Economist. 2023. Britain has endured a decade of early deaths. Why? The Economist 19–21. @@ -601,7 +604,7 @@

10.1054/bjoc.2001.1739

-Wakefield J, Elliott P. 1999. Issues in the statistical analysis of small area health data. Statistics in Medicine 18:2377–2399. doi:10.1002/(SICI)1097-0258(19990915/30)18:17/18{$<$}2377::AID-SIM263{$>$}3.0.CO;2-G +Wakefield J, Elliott P. 1999. Issues in the statistical analysis of small area health data. Statistics in Medicine 18:2377–2399. doi:10.1002/(SICI)1097-0258(19990915/30)18:17/18<2377::AID-SIM263>3.0.CO;2-G
Wilkinson RG. 1992. Income distribution and life expectancy. British Medical Journal 304:165–168. doi:10.1136/bmj.304.6820.165 @@ -610,7 +613,7 @@

10.1080/08898489509525409

-Yu J, Dwyer-Lindgren L, Bennett J, Ezzati M, Gustafson P, Tran M, Brauer M. 2021. A spatiotemporal analysis of inequalities in life expectancy and 20 causes of mortality in sub-neighbourhoods of Metro Vancouver, British Columbia, Canada, 1990. Health & Place 72:102692. doi:10.1016/j.healthplace.2021.102692 +Yu J, Dwyer-Lindgren L, Bennett J, Ezzati M, Gustafson P, Tran M, Brauer M. 2021. A spatiotemporal analysis of inequalities in life expectancy and 20 causes of mortality in sub-neighbourhoods of Metro Vancouver, British Columbia, Canada, 1990–2016. Health & Place 72:102692. doi:10.1016/j.healthplace.2021.102692

diff --git a/thesis/_thesis/Chapters/Chapter4.html b/thesis/_thesis/Chapters/Chapter4.html index 316edec..3bea180 100644 --- a/thesis/_thesis/Chapters/Chapter4.html +++ b/thesis/_thesis/Chapters/Chapter4.html @@ -280,12 +280,42 @@

\(\xi_{as}\) is an age group-spatial unit interaction term, which quantifies space-specific deviations from the overall age group structure given by \(\alpha_{2a}\). This allows different spatial units to have different age-specific mortality patterns, and each age group’s death rate to have a different spatial pattern. This interaction term was modelled as \(\mathcal{N}(0, \sigma_\xi^2)\).

\(\nu_{st}\) and \(\gamma_{at}\) allow space- and age group-specific nonlinearity in the time trends. For each spatial unit and age group, I again used first-order random walk priors with \(\nu_{s1} = \gamma_{a1} = 0\) so that the terms were identifiable.
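The effect of pinning the first term of a first-order random walk to zero can be illustrated by simulating from the prior. The sketch below is not the model code: the helper `rw1_path` is hypothetical and uses plain NumPy rather than a probabilistic programming language.

```python
import numpy as np


def rw1_path(n_steps, sigma, rng):
    """Draw one first-order random walk whose first value is fixed at zero.

    Each later value equals the previous one plus N(0, sigma^2) noise, so
    only the increments are random. Fixing the first value removes the
    free level of the walk, which would otherwise be confounded with the
    intercept terms in the model.
    """
    increments = rng.normal(0.0, sigma, size=n_steps - 1)
    return np.concatenate(([0.0], np.cumsum(increments)))


rng = np.random.default_rng(1)
path = rw1_path(n_steps=18, sigma=0.1, rng=rng)  # e.g. 18 years, 2002 to 2019
```

Every draw starts at zero, so the walk describes only deviations from the linear trend after the first time point.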

The spatial intercepts and slopes, \(\alpha_{1s}\) and \(\beta_{1s}\), were modelled as nested hierarchical random effects. For the MSOA analysis, MSOAs were nested in districts, which were, in turn, nested in regions. For the LSOA analysis, LSOAs were nested in MSOAs, which were nested in districts. The terms for the largest spatial unit were centred on zero to allow the spatial effects to be identifiable.

-

All standard deviation parameters of the random effects had \(\sigma \sim \mathcal{U}(0, 2)\) priors. For the global intercept and slope, we used the diffuse prior \(\mathcal{N}(0, \sigma^2=10^5)\). The overdispersion parameter \(r\) had the prior \(\mathcal{U}(0, 50)\).

-

Table B.1 shows all model parameters, their priors and dimensions for the MSOA-level model in Chapter 5.

+

All standard deviation parameters of the random effects had \(\sigma \sim \mathcal{U}(0, 2)\) priors, which were used in a previous mortality modelling study by the group (Bennett et al., 2015). I performed a sensitivity analysis using the less informative \(\sigma \sim \mathcal{U}(0, 100)\) prior, to which the model was robust (the largest inferred standard deviation parameter was for the age group intercept with a mean around 0.9). For the global intercept and slope, we used the diffuse prior \(\mathcal{N}(0, \sigma^2=10^5)\). The overdispersion parameter \(r\) had the prior \(\mathcal{U}(0, 50)\).

+

Table B.1 shows all model parameters, their priors and dimensions for the MSOA-level model in Chapter 5. Table 4.1 summarises the model adequacy and consistency checks performed for the analyses.

+
Table 4.1: Summary of model posterior checks.

Type of check | Checks performed
Model adequacy | Check all posterior death rates are between 0 and 1; scatter plots of posterior predictions of death rates against observed data by age group and year; inspection of residuals by age group and year
Model bias | Compare aggregated posterior predictions of deaths with uncertainty to national number of deaths from data each year; evaluate model shrinkage by inspecting the range of life expectancy between the top and bottom percentiles (aggregating 67 or 68 MSOAs) in 2002 and 2019 estimated using the model and the data
Consistency between models | Aggregate posterior predictions of deaths from MSOA-level model to district-level and compare to posterior predictions of deaths from the same model run at district level (and same checks for LSOA- and national-level where appropriate)
+
+

Although a random walk approach has been used here to model the J-shape age-mortality association, there are a number of alternatives. For example, Gonzaga and Schmertmann (2016) use a series of linear splines over the age dimension. Alexander et al. (2017) describe an approach using the first three principal components of standard mortality curves, where the first component represents baseline mortality, and the second and third components allow offsets for higher child mortality and higher adult mortality. However, both these approaches require the modeller to manually specify either the number of basis splines and position of the knots or the number of principal components required to accurately describe the age-mortality relationship. This becomes more difficult when modelling several different diseases, which might not follow a J-shape, particularly those with a skew towards older ages such as prostate cancer. Random walks are more flexible and require less tuning in this respect, and are also used here to model age-specific slopes over time, for which we have no such prior demographic knowledge.

4.4 Inference

-

The decision was made early in my PhD research to use Markov chain Monte Carlo (MCMC) sampling methods for inference, as this is the “gold standard” with guarantees that, under mild conditions, the sequence of samples will asymptotically converge to the true posterior distribution (Roberts and Rosenthal, 2004). Furthermore, the state-of-the-art approximate inference package for spatial models, INLA, scales badly with the number of hyperparameters, and hence would struggle with the high dimensionality of the models in this thesis.

+

The decision was made early in my PhD research to use Markov chain Monte Carlo (MCMC) sampling methods for inference, as this is the “gold standard” with guarantees that, under mild conditions, the sequence of samples will asymptotically converge to the true posterior distribution (Roberts and Rosenthal, 2004). Although sampling approaches are the focus here, the R-INLA package, which uses approximate inference for latent Gaussian fields and has implementations of common spatial models, could also have been used.

Bayesian models can be specified in a probabilistic programming language. The starting point for this project was the NIMBLE package (de Valpine et al., 2022, 2017). NIMBLE uses the BUGS (“Bayesian inference Using Gibbs Sampling”) syntax for defining a hierarchical model, which my research group has a lot of experience with as WinBUGS, one of the earliest software packages for Bayesian analysis, was developed largely in the department for use on SAHSU studies. NIMBLE has an R interface but compiles models to C++ for speed and scalability. It also increases the sampling efficiency by automatically finding conjugate relationships between parameters in the model and marginalising over them wherever possible. The group also has a close relationship with the lead developer of NIMBLE.

Nevertheless, Bayesian inference is difficult to scale, and some of the models in this thesis had in excess of \(10^6\) parameters and took NIMBLE between 10 and 14 days to collect enough posterior samples. One of the main issues with NIMBLE was that the vast majority of the parameters in the model could not exploit efficient conjugate samplers, and instead used variants of basic Metropolis-Hastings samplers, which, despite numerous efforts at tuning, were inefficient. Although NIMBLE could execute a reasonable number of samples per second, the MCMC chains were struggling to explore the posterior efficiently so the effective sample size per second was low. This is a common problem in spatial and spatiotemporal models, where the parameters are correlated by design. To overcome these mixing issues, the chains had to be run for longer and thinned (i.e. take every \(n^{\text{th}}\) sample so the Markov chain samples are closer to independent, which is better for computational reasons than storing a large number of correlated samples).
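Thinning can be illustrated on a toy chain. In the sketch below, a strongly autocorrelated AR(1) series stands in for a poorly mixing MCMC chain; the helper names `lag1_autocorr` and `thin`, the autocorrelation of 0.95 and the thinning interval of 50 are all illustrative choices rather than values used in the thesis.

```python
import numpy as np


def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a one-dimensional chain."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))


def thin(chain, every):
    """Keep every `every`-th sample of the chain."""
    return np.asarray(chain)[::every]


# Toy stand-in for a poorly mixing chain: AR(1) with autocorrelation 0.95
rng = np.random.default_rng(0)
rho = 0.95
noise = rng.normal(size=20_000)
chain = np.empty(noise.size)
chain[0] = noise[0]
for t in range(1, chain.size):
    chain[t] = rho * chain[t - 1] + noise[t]

thinned = thin(chain, every=50)  # 400 samples are stored instead of 20,000
```

The thinned chain has far weaker correlation between successive samples, at the cost of discarding most of the draws, which is why thinning reduces storage rather than improving the efficiency of the sampler itself.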

In an effort to increase the sampling efficiency of the models, I tested a number of alternative probabilistic programming languages across R, Python and Julia, which I have detailed in Rashid (2022). In particular, I focussed on packages that have implemented the more efficient No U-Turn Sampler (NUTS) (Hoffman and Gelman, 2014). In the end, I chose to rewrite the models in NumPyro (Phan et al., 2019) because it was the fastest and inference could be performed on a GPU, rather than CPUs, which is more performant for large models (Lao et al., 2020). The major downside was that NumPyro had not been used extensively by the spatial modelling community, so I had to implement the CAR distribution from Equation 2.2 myself, which has since been contributed to the source code (NumPyro documentation, 2023). Rewriting the model in NumPyro and sampling on a GPU cut the runtime down to around a day. NumPyro also has built-in methods for approximate variational inference, such as the Laplace approximation, but these failed to converge to sensible values for these models without heavy customisation of the variational function, so I stuck with sampling methods.

@@ -300,12 +330,21 @@

5.2.2 Change in life expectancy

-

Female and male life expectancy were correlated across MSOAs with a correlation coefficient of 0.87 (Figure 5.3). Female life expectancy was higher than male life expectancy in all but 15 MSOAs. The female advantage was more than 5 years in 1498 (22.1%) of 6791 MSOAs and 1–5 years in another 5187 (76.4%). From 2002 to 2019, a decline in life expectancy was more probable than an increase in 124 mostly urban MSOAs of 6791 (1.8% of all MSOAs) for women, with posterior probabilities of greater than 80% in 34 of these. The largest estimated decline of 3.0 years (0.9–5.3; posterior probability of the estimated decline being a true decline >0.99) occurred in a MSOA in Leeds (Figure 5.4, Figure 5.5).

+

Female and male life expectancy were correlated across MSOAs with a correlation coefficient of 0.87 (Figure 5.4). Female life expectancy was higher than male life expectancy in all but 15 MSOAs. The female advantage was more than 5 years in 1498 (22.1%) of 6791 MSOAs and 1–5 years in another 5187 (76.4%). From 2002 to 2019, a decline in life expectancy was more probable than an increase in 124 mostly urban MSOAs of 6791 (1.8% of all MSOAs) for women, with posterior probabilities of greater than 80% in 34 of these. The largest estimated decline of 3.0 years (0.9–5.3; posterior probability of a decline >0.99) occurred in an MSOA in Leeds (Figure 5.5, Figure 5.6).
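Quantities such as the posterior median change, its interval and the posterior probability of a decline can be computed directly from posterior draws of life expectancy. The sketch below assumes paired draws for the two years in a single MSOA and a 95% interval; the function name `summarise_change` and the toy inputs are hypothetical.

```python
import numpy as np


def summarise_change(le_start, le_end):
    """Summarise the posterior change in life expectancy for one area.

    le_start, le_end : posterior draws of life expectancy in the first
    and last year, paired draw by draw.
    """
    change = np.asarray(le_end, dtype=float) - np.asarray(le_start, dtype=float)
    lower, median, upper = np.percentile(change, [2.5, 50.0, 97.5])
    return {
        "median": float(median),
        "interval": (float(lower), float(upper)),
        # Share of draws in which life expectancy fell
        "p_decline": float(np.mean(change < 0)),
    }


# Toy example with four paired draws (real analyses use thousands)
summary = summarise_change([80.0, 80.0, 80.0, 80.0], [79.0, 79.0, 81.0, 78.0])
```

A decline is "more probable than an increase" for an area when `p_decline` exceeds 0.5.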

-
Figure 5.3: Comparison of female and male life expectancy in 2019 and change from 2002 to 2019.
+
Figure 5.4: Comparison of female and male life expectancy in 2019 and change from 2002 to 2019.

-
Figure 5.4: Geography of change in life expectancy from 2002 to 2019.
+
Figure 5.5: Geography of change in life expectancy from 2002 to 2019.

-
Figure 5.5: Map of posterior probability that the estimated change represents a true increase or decrease in life expectancy from 2002 to 2019.
+
Figure 5.6: Map of posterior probability that the estimated change represents an increase in life expectancy from 2002 to 2019.
-

Elsewhere, median posterior change was positive, ranging from less than 1 year in 408 MSOAs to more than 7 years in 63 MSOAs. Posterior probability of an increase in male life expectancy was more probable than a decrease in all but one MSOA in Blackpool, in which life expectancy changed by –0.4 years (–2.3 to 1.6; posterior probability of being a true decline 0.64). For the other MSOAs, the increase ranged from less than 1 year in 31 MSOAs to more than 7 years in 114 MSOAs. The largest increases in female and male life expectancies were seen in some MSOAs in and around London (e.g., in the London Borough of Camden). In 5133 (75.6%) MSOAs, male life expectancy increased more than female life expectancy (Figure 5.3), leading to a closing of the life expectancy gap between female and male sexes.

+

Elsewhere, median posterior change was positive, ranging from less than 1 year in 408 MSOAs to more than 7 years in 63 MSOAs. For men, an increase in life expectancy was more probable than a decrease in all but one MSOA, in Blackpool, where life expectancy changed by –0.4 years (–2.3 to 1.6; posterior probability of a decline 0.64). For the other MSOAs, the increase ranged from less than 1 year in 31 MSOAs to more than 7 years in 114 MSOAs. The largest increases in female and male life expectancies were seen in some MSOAs in and around London (e.g., in the London Borough of Camden). In 5133 (75.6%) MSOAs, male life expectancy increased more than female life expectancy (Figure 5.4), leading to a narrowing of the life expectancy gap between female and male sexes.

-
Figure 5.6: Distribution of MSOA life expectancies in each year from 2002 to 2019. Each point shows one MSOA. The central line shows national life expectancy.
+
Figure 5.7: Distribution of MSOA life expectancies in each year from 2002 to 2019. Each point shows one MSOA. The central line shows national life expectancy.
-

The life expectancy increase from 2002 to 2019 was smaller in MSOAs where life expectancy had been lower in 2002, and vice versa, especially for women, which led to a larger life expectancy inequality across MSOAs in 2019 than in 2002 (Figure 5.6, Figure 5.7). Specifically, the aforementioned 20.6 year (17.5–24.2) gap for women and 27.0 year (23.4–31.1) gap for men between the lowest and highest MSOA life expectancies in 2019 were larger than those in 2002 by 4.3 years (–1.3 to 9.3) for women and 7.7 years (4.0 to 11.7) for men. Similarly, the gap between the first and 99\(^{\text{th}}\) percentiles of MSOA life expectancy for women increased from 10.7 years (10.4–10.9) in 2002 to reach 14.2 years (13.9–14.5) in 2019, and for men increased from 11.5 years (11.3–11.7) in 2002 to 13.6 years (13.4–13.9) in 2019.

+

The life expectancy increase from 2002 to 2019 was smaller in MSOAs where life expectancy had been lower in 2002, and vice versa, especially for women, which led to a larger life expectancy inequality across MSOAs in 2019 than in 2002 (Figure 5.7, Figure 5.8). Specifically, the aforementioned 20.6 year (17.5–24.2) gap for women and 27.0 year (23.4–31.1) gap for men between the lowest and highest MSOA life expectancies in 2019 were larger than those in 2002 by 4.3 years (–1.3 to 9.3) for women and 7.7 years (4.0 to 11.7) for men. Similarly, the gap between the first and 99\(^{\text{th}}\) percentiles of MSOA life expectancy for women increased from 10.7 years (10.4–10.9) in 2002 to reach 14.2 years (13.9–14.5) in 2019, and for men increased from 11.5 years (11.3–11.7) in 2002 to 13.6 years (13.4–13.9) in 2019.

-
Figure 5.7: Maximum (highest) to minimum (lowest) and 99\(^{\text{th}}\) to first percentile differences in life expectancy across 6791 MSOAs, 2002–19. The large difference in 2017 is due to the low life expectancy in the MSOA where the deaths in the Grenfell Tower (Kensington and Chelsea, London) fire took place.
+
Figure 5.8: Maximum (highest) to minimum (lowest) and 99\(^{\text{th}}\) to first percentile differences in life expectancy across 6791 MSOAs, 2002–19. The large difference in 2017 is due to the low life expectancy in the MSOA where the deaths in the Grenfell Tower (Kensington and Chelsea, London) fire took place.
-

When broken down by time period, the vast majority of MSOAs saw a life expectancy increase in 2002–06 and 2006–10 (Figure 5.8). By contrast, women in 351 (5.2%) MSOAs had a median posterior change in life expectancy in 2010–14 that was negative. By 2014–19, the number of MSOAs with a negative median posterior change had risen to 1270 (18.7%) for women, with men in 784 (11.5%) MSOAs also showing a decline. These MSOAs tended to be places in which life expectancy was already low.

+

When broken down by time period, the vast majority of MSOAs saw a life expectancy increase in 2002–06 and 2006–10 (Figure 5.9). By contrast, women in 351 (5.2%) MSOAs had a median posterior change in life expectancy in 2010–14 that was negative. By 2014–19, the number of MSOAs with a negative median posterior change had risen to 1270 (18.7%) for women, with men in 784 (11.5%) MSOAs also showing a decline. These MSOAs tended to be places in which life expectancy was already low.

-
Figure 5.8: Change in MSOA life expectancy in different time periods, 2002–19. Each point shows the posterior median change in one MSOA. MSOAs are coloured by their life expectancy at the beginning of each period (e.g., for 2014–19, they are coloured by life expectancy in 2014).
+
Figure 5.9: Change in MSOA life expectancy in different time periods, 2002–19. Each point shows the posterior median change in one MSOA. MSOAs are coloured by their life expectancy at the beginning of each period (e.g., for 2014–19, they are coloured by life expectancy in 2014).

5.2.3 Life expectancy and deprivation

-

Life expectancy at birth was inversely associated with the extent of unemployment, poverty, and low education in MSOAs in 2002 and 2019 (Figure 5.9). There was substantial variation in life expectancy across MSOAs at any level of poverty or unemployment seen in the vertical spread of points in Figure 5.9. From 2002 to 2019, there were, on average, smaller gains in life expectancy in the MSOAs with the highest levels of unemployment, poverty, and low education than in those in the lowest levels, especially for women.

+

Life expectancy at birth was inversely associated with the extent of unemployment, poverty, and low education in MSOAs in 2002 and 2019 (Figure 5.10). There was substantial variation in life expectancy across MSOAs at any level of poverty or unemployment seen in the vertical spread of points in Figure 5.10. From 2002 to 2019, there were, on average, smaller gains in life expectancy in the MSOAs with the highest levels of unemployment, poverty, and low education than in those in the lowest levels, especially for women.

-
Figure 5.9: MSOA life expectancy in relation to measures of socioeconomic deprivation in the MSOA in 2002 and 2019. The socioeconomic measures are poverty, unemployment, and education. The lines show the smooth relationship fitted with locally estimated scatterplot smoothing for each year.
+
Figure 5.10: MSOA life expectancy in relation to measures of socioeconomic deprivation in the MSOA in 2002 and 2019. The socioeconomic measures are poverty, unemployment, and education. The lines show the smooth relationship fitted with locally estimated scatterplot smoothing for each year.

5.2.4 Inequalities in probability of survival

-

Similar to life expectancy, there were large inequalities in the probability of surviving from birth to 80 years, which ranged from 0.42 to 0.87 in women and 0.27 to 0.85 in men across MSOAs in 2019. These large survival inequalities were present at every stage of the life­course including childhood and early adolescence (0–14 years), young adulthood (15–29 years), working ages (30–69 years), and older ages (70–79 years) (Figure 5.10). Specifically, the probability of dying at different stages of the life­course in the 99\(^{\text{th}}\) percentile of MSOAs was between 2.6 and 3.1 times that of the first percentile for female and male sexes in 2019. From 2002 to 2019, the relative inequality across MSOAs (ie, ratio of the 99\(^{\text{th}}\) to the first percentile) in the probabilities of dying increased at every stage of the life course; the absolute inequality (ie, difference between the 99\(^{\text{th}}\) and first percentiles) decreased slightly in all combinations except for working age women (30–69 years). Within childhood and adolescence, there were particularly large inequalities in infant mortality (0 to <12 months), with a ratio of the 99\(^{\text{th}}\) to the first percentile of MSOAs being 3.2 for female and male sexes in 2019. Infant mortality increased from 2014 to 2019 in 1378 (20.3%) MSOAs for girls and 888 (13.0%) for boys, many of which experienced a decline in life expectancy.

+

Similar to life expectancy, there were large inequalities in the probability of surviving from birth to 80 years, which ranged from 0.42 to 0.87 in women and 0.27 to 0.85 in men across MSOAs in 2019. These large survival inequalities were present at every stage of the life course including childhood and early adolescence (0–14 years), young adulthood (15–29 years), working ages (30–69 years), and older ages (70–79 years) (Figure 5.11). Specifically, the probability of dying at different stages of the life course in the 99\(^{\text{th}}\) percentile of MSOAs was between 2.6 and 3.1 times that of the first percentile for female and male sexes in 2019. From 2002 to 2019, the relative inequality across MSOAs (i.e., ratio of the 99\(^{\text{th}}\) to the first percentile) in the probabilities of dying increased at every stage of the life course; the absolute inequality (i.e., difference between the 99\(^{\text{th}}\) and first percentiles) decreased slightly in all combinations except for working age women (30–69 years). Within childhood and adolescence, there were particularly large inequalities in infant mortality (0 to <12 months), with a ratio of the 99\(^{\text{th}}\) to the first percentile of MSOAs being 3.2 for female and male sexes in 2019. Infant mortality increased from 2014 to 2019 in 1378 (20.3%) MSOAs for girls and 888 (13.0%) for boys, many of which experienced a decline in life expectancy.
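The two inequality measures used here, the ratio and the difference of the 99th and first percentiles, can be sketched as below. The function name `percentile_inequality` is hypothetical, and NumPy's default linear interpolation between order statistics is an assumption; the thesis may define percentiles differently.

```python
import numpy as np


def percentile_inequality(values, low=1.0, high=99.0):
    """Absolute and relative inequality between two percentiles.

    values : one value per area, e.g. the probability of dying in an
    age range for each MSOA.
    """
    p_low, p_high = np.percentile(np.asarray(values, dtype=float), [low, high])
    return {
        "absolute": float(p_high - p_low),  # difference of percentiles
        "relative": float(p_high / p_low),  # ratio of percentiles
    }


# Toy input: 101 evenly spaced values, so percentiles fall on data points
result = percentile_inequality(np.arange(1, 102))
```

Relative and absolute inequality can move in opposite directions: if the probability of dying falls everywhere but falls proportionally more in the best-off areas, the ratio rises while the difference shrinks.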

-
Figure 5.10: Probability of dying in specific ages in 6791 MSOAs in England in 2002 and 2019. Each point shows one MSOA. The vertical axis uses a log scale so that the large differences in survival across ages can be seen.
+
Figure 5.11: Probability of dying in specific ages in 6791 MSOAs in England in 2002 and 2019. Each point shows one MSOA. The vertical axis uses a log scale so that the large differences in survival across ages can be seen.
@@ -370,12 +377,12 @@

5.3.1 Strengths and limitations

The main strength of the analysis is the presentation of high-resolution data for mortality and longevity across England over a period of substantial change in economic, health, and social care policy. By applying a hierarchical model based on patterns of mortality over age, space, and time, I obtained robust yearly estimates of mortality and life expectancy, together with the uncertainty in these estimates, for small areas. By contrast, studies that had not used a coherent model produced unstable (i.e., with very large uncertainty) or implausible life expectancy estimates in some MSOAs, despite having aggregated deaths over 5 years, nor could they analyse trends at the MSOA level (Office for National Statistics, 2015; Public Health England, 2021). Comparison of estimates at MSOA and district level shows that the estimated MSOA life expectancy range was about 1.8 times the district-level range for women and 2.0 times the district-level range for men in 2019.

A limitation of the work in this chapter is that I did not break down age beyond 85 years, which might mask some differences in old-age mortality and survival patterns. Although MSOAs have small populations and are designed to have some socioeconomic homogeneity, there are inevitable variations in socioeconomic status and health within them. Understanding life expectancy inequalities in relation to individual socioeconomic characteristics requires linking health and other data such as census records, education, and taxes, as done in countries like New Zealand and Sweden.

-

The people who live in each MSOA can change due to both within-country and international migration. Regression of the change in life expectancy from 2002 to 2019 in each MSOA against population turnover, the proportion of households in each MSOA in 2019 who were different from those who had lived there in 2002 (van Dijk et al., 2021), was not able to explain the variation in life expectancy change for women (\(R^2 < 0.001\)) or for men (\(R^2 = 0.01\)) at the national level. Studies in both the UK (Connolly et al., 2007) and USA (Ezzati et al., 2008) have also shown that migration is not sufficient to explain the trends in health and health inequalities, and that these trends are largely due to real changes in population health. Even if rising inequalities are partly due to health-selective migration, this phenomenon has social and economic origins that should be addressed through employment opportunities, affordable housing, high-quality education, and health care.

+

The people who live in each MSOA can change due to both within-country and international migration. There is some evidence that migrants tend to be healthier than those who do not move (Connolly et al., 2007). Regression of the change in life expectancy from 2002 to 2019 in each MSOA against population turnover, the proportion of households in each MSOA in 2019 who were different from those who had lived there in 2002 (van Dijk et al., 2021), was not able to explain the variation in life expectancy change for women (\(R^2 < 0.001\)) or for men (\(R^2 = 0.01\)) at the national level. Studies in both the UK (Connolly et al., 2007) and USA (Ezzati et al., 2008) have also shown that migration is not sufficient to explain the trends in health and health inequalities, and that these trends are largely due to real changes in population health. Even if rising inequalities are partly due to health-selective migration, this phenomenon has social and economic origins that should be addressed through employment opportunities, affordable housing, high-quality education, and health care.
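
To illustrate what an \(R^2\) close to zero means in this setting, the following is a minimal sketch assuming a simple linear regression of life expectancy change on population turnover, fitted by ordinary least squares with NumPy. The exact regression specification is an assumption, and the function name `r_squared` and the simulated data are hypothetical; they are not the data analysed in this chapter.

```python
import numpy as np


def r_squared(x, y):
    """Coefficient of determination for a least-squares line of y on x."""
    slope, intercept = np.polyfit(x, y, 1)
    fitted = slope * x + intercept
    ss_res = np.sum((y - fitted) ** 2)    # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot


# Simulated (hypothetical) values for 500 areas: turnover is the proportion
# of households that changed, and the change in life expectancy (years) is
# drawn independently of turnover, so little variance should be explained.
rng = np.random.default_rng(0)
turnover = rng.uniform(0.2, 0.9, size=500)
le_change = rng.normal(loc=2.0, scale=1.0, size=500)

r2 = r_squared(turnover, le_change)
print(f"R^2 = {r2:.4f}")
```

An \(R^2\) near zero, as in this simulated case, indicates that almost none of the between-area variation in life expectancy change is accounted for by turnover.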

Population and mortality statistics in the UK are generated independently from one another. As a result, I encountered more deaths than population in a small percentage (0.001%) of age-MSOA-year combinations, a phenomenon that was more common in those aged 85 years and older. This finding might be due to errors in population estimates in years between censuses or because some people (e.g., those living in long-term care facilities such as care homes) are counted in one MSOA for the population statistic but have their death registered in another. Furthermore, care home residents might have relocated from other MSOAs, with different socioeconomic characteristics from that in which the care home is located. This factor could attenuate the association between socioeconomic variables and life expectancy. The extent of this underestimation is modest, however, because a large part of life expectancy variation is due to deaths at earlier ages, when people are less likely to live and die in care homes (Bennett et al., 2018).

5.3.2 Comparison with previous literature

-

The life expectancy estimates in specific years are similar to the snapshots presented by the ONS and Public Health England (Office for National Statistics, 2015; Public Health England, 2021), with correlation coefficients of 0.92–0.95 and mean differences of –0.004 to 0.19 years. However, these reports could not analyse trends because data were aggregated over 5 years (2009–13, 2013–17, or 2015–19). In terms of trends, studies that grouped small-area units into deciles of deprivation have detected a decline in female life expectancy in the one or two most deprived deciles (Bennett et al., 2018; Marmot et al., 2020). By analysing trends at the MSOA level, I could identify the communities in which longevity is declining and show that the decline, which began around 2010 in women in some MSOAs, has spread and accelerated since 2014.

+

The life expectancy estimates in specific years are similar to the snapshots presented by the ONS and Public Health England (Office for National Statistics, 2015; Public Health England, 2021), with correlation coefficients of 0.92–0.95 and mean differences of –0.004 to 0.19 years. However, these reports could not analyse trends because data were aggregated over 5 years (2009–13, 2013–17, or 2015–19). In terms of trends, studies that grouped small-area units into deciles of deprivation have detected a decline in female life expectancy in the one or two most deprived deciles (Bennett et al., 2018; Marmot et al., 2020). Boulieri and Blangiardo (2020) modelled trends in life expectancy by district and decile of deprivation, using a space-time interaction term to detect local changes in the life expectancy trend. Notably, they detected lower life expectancy than expected for the district of Leeds in 2 years of the study period for women and 4 years for men. By analysing trends at the MSOA level, I could identify the specific communities in which longevity is declining and show that the decline, which began around 2010 in women in some MSOAs, has spread and accelerated since 2014.

Congdon (2019) also used spatial models to smooth MSOA-level mortality data, but specifically for drug-related deaths and suicides from 2012 to 2016. The author singled out the district of Blackpool as containing many of the MSOAs with the most extreme relative risks of death from these causes, including the MSOA that I found had the lowest life expectancy for men in 2019.

+
+
+
    +
  1. Unlike OAs, postcodes do not align with other administrative boundaries, nor are they designed to contain similar numbers of people, which may lead to small number issues when modelling mortality.

  2. +
+