Clinical Epidemiology, Volume 10

Attributing diseases to multiple pathways: a causal-pie modeling approach

Authors Chen C, Lee WC 

Received 18 December 2017

Accepted for publication 8 March 2018

Published 27 April 2018 Volume 2018:10 Pages 489–497

DOI https://doi.org/10.2147/CLEP.S160205

Checked for plagiarism Yes

Review by Single anonymous peer review


Editor who approved publication: Professor Irene Petersen



Christine Chen,1 Wen-Chung Lee1,2

1Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, Taiwan; 2Research Center for Genes, Environment and Human Health, College of Public Health, National Taiwan University, Taipei, Taiwan

Abstract: Characterizing the relations between exposures and diseases is the central tenet of epidemiology. Researchers may want to evaluate exposure-disease causation by assessing whether the disease under concern is induced by the various exposures – the so-called “attribution”. In this paper, the authors propose a method to attribute diseases to multiple pathways based on the causal-pie model. The method can also be used to evaluate the potential impact of an intervention strategy and to allocate responsibility in tort-law liability issues.

Keywords: epidemiological methods, attribution, disease pathways, causal-pie model

 

Introduction

Characterizing the relations between exposures and diseases is the central tenet of epidemiology. Epidemiologists may be interested in knowing the influence of a single exposure on a disease (using effect measures such as risk difference, risk ratio, and odds ratio) or the total influence of multiple exposures on the disease. They may also be interested in knowing any possible interaction between exposures. Through epidemiological studies, the complex relations between multiple exposures and a disease can be clarified.1

Attention has also been given to the “processes”, “pathways”, or “mechanisms” themselves, through which an exposure brings about the disease. For example, one may want to know whether the causal relationship between an exposure and a disease is mediated by a specific “mediator”. If so, the influence of the exposure on the disease can be decomposed: the “indirect effect” is the effect mediated by the mediator, and the “direct effect” is the one not mediated by it. A statistical method that can decompose the exposure effect is structural equation modeling (SEM).2–4 Effect decomposition in SEM is straightforward; the effect pertaining to a specific pathway is simply the product of the path coefficients of the traveled paths. For the indirect effect, we sum up the effects of those pathways that pass through the mediator, and for the direct effect, those that do not pass through it. However, this only works for a continuous mediator and a continuous disease outcome. The methods proposed by Robins and Greenland,5 Pearl,6 and VanderWeele7 are more general. These methods can accommodate any variable type and can cope with exposure–mediator interactions and nonlinear relations between variables.
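As a minimal illustration of the product-of-coefficients rule, consider a linear SEM with one mediator. The coefficients a (E to M), b (M to D), and c (E to D), as well as the error terms, are hypothetical symbols introduced here for illustration only.

```latex
\begin{aligned}
M &= a\,E + \varepsilon_M, \\
D &= c\,E + b\,M + \varepsilon_D, \\
\text{indirect effect} &= a \times b, \qquad
\text{direct effect} = c, \qquad
\text{total effect} = c + a\,b .
\end{aligned}
```

The indirect effect multiplies the coefficients along the path E to M to D, whereas the direct effect is the single coefficient on the path E to D.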

The aforementioned methods evaluate exposure–disease causation by going from exposures to a disease. Sometimes we may be interested in backward induction by assessing whether the disease under concern is induced by the various exposures – the so-called “attribution”. (Note that the attribution here is based on epidemiologic data,8–14 and should not be confused with attribution in social psychology where the human perception of causations is in focus.)15 For example, when planning intervention strategies, policymakers may want to compare the effectiveness of various intervention programs directed at removing different exposures in the population. In this case, we need to know the proportion of disease that was induced by each exposure. As another example, in some tort litigation, the court is concerned about the contribution of a specific exposure to the disease occurrence of the plaintiff. If probabilistic apportionment of causal responsibility16,17 is adopted, the court needs to know the probability that the occurrence of the disease was induced by this exposure. In situations like these, we can use indices such as the attributable fraction8–12 and the causal-pie weight13,14 for attribution. When there are multiple exposures, a summation of the attributable fractions for all exposures may exceed 100%. Clearly, this makes no sense, and the index needs some rectification.18–23 When there are multiple exposures, one can compute a panel of causal-pie weights (summing up to 100%) for the individual effects of each and every exposure as well as the interactive effects between them. However, neither the attributable fraction nor the causal-pie weight takes disease pathways into account.
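A small numerical sketch may help show why attributable fractions for several exposures can add up to more than 100%. The risks below are hypothetical numbers chosen for illustration (they are not from any study), and the function name is likewise an assumption.

```python
# Hypothetical disease risks over follow-up, indexed by (exposure 1, exposure 2).
# These numbers are made up for illustration only.
risk = {(0, 0): 0.01, (1, 0): 0.02, (0, 1): 0.02, (1, 1): 0.10}


def attributable_fraction(risk_with, risk_without):
    """Share of the risk that would disappear if the exposure were removed."""
    return (risk_with - risk_without) / risk_with


# Among doubly exposed subjects, remove one exposure at a time.
af_e1 = attributable_fraction(risk[(1, 1)], risk[(0, 1)])  # remove exposure 1 only
af_e2 = attributable_fraction(risk[(1, 1)], risk[(1, 0)])  # remove exposure 2 only

print(f"AF(E1) = {af_e1:.0%}, AF(E2) = {af_e2:.0%}, sum = {af_e1 + af_e2:.0%}")
```

Because the two exposures interact, the cases produced by the interaction are counted once for each exposure, so the two fractions (80% each with these numbers) add up to 160%.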

In this paper, we propose a method to attribute diseases to multiple pathways based on the causal-pie model.1,24 The method can also be used to evaluate the potential impact of an intervention strategy and to allocate responsibility in tort-law liability issues.

Methods

Relations between an exposure, a mediator, and a disease

A “directed acyclic graph” (DAG)1,25,26 is used to depict the causal relations between an exposure (E) and a disease (D), which can be mediated by a mediator (M) (Figure 1). Causality (also referred to as causation, or cause and effect) is a process (arrows in Figure 1) that connects one set of variables (the “causes” or “risk factors”) with another set of variables (the “effects” or “outcomes”), where the first is partly responsible for the second, and the second is partly dependent on the first. An effect (outcome) can, in turn, be a cause (risk factor) for many other effects (outcomes). Note that a DAG depicts a simplified biology, ignoring any feedback loop where an effect can feed back to the same cause that leads to the very effect in the first place.

Figure 1 The two paths for M-stage and four paths for D-stage.

Abbreviations: D, disease; E, exposure; M, mediator.

We consider the exposure, the disease, and the mediator as dichotomous variables. We call the occurrence of the mediator the M-stage (paths M1 and M2) and the occurrence of the disease the D-stage (paths D1, D2, D3, and D4) (Figure 1). Note that to indicate “interaction”, we allow two DAG arrows to meet and merge before pointing at the same variable, such as the D4 path in Figure 1.

A causal-pie model for mediator and disease

We follow the causal-pie framework for mediator and disease proposed by Hafeman.27 We invoke the “sufficient-cause positive monotonicity assumption” at the individual level, that is, the effects of the exposure on the mediator and on the disease as well as the effect of the mediator on the disease, if any, can only be harmful and cannot be preventive for any individual.1,13,14,27,28 In Figure 2, under the assumption, there is a total of six classes of causal pies – two for the M-stage and four for the D-stage. The two causal-pie classes for the M-stage are: 1) a causal-pie class not containing E as its component and 2) a causal-pie class containing E as its component (1 and 2 correspond to paths M1 and M2 in Figure 1, respectively). The four causal-pie classes for the D-stage are: 3) a causal-pie class containing neither E nor M as its component, 4) a causal-pie class containing E but not M as its component, 5) a causal-pie class containing M but not E as its component, and 6) a causal-pie class containing both E and M as its components (3, 4, 5, and 6 correspond to paths D1, D2, D3, and D4 in Figure 1, respectively).

Figure 2 The total six causal-pie classes for M-stage and D-stage.

Notes: (A) M-stage; (B) D-stage.

Abbreviations: D, disease; E, exposure; M, mediator; U, unknown components.

Aside from the exposure and the mediator, each causal-pie class contains a distinct constellation of unknown components. We denote these by U – UM1, UM2, UD1, UD2, UD3, and UD4, respectively, in Figure 2. When all components in a causal pie are present, the causal pie is completed, and the corresponding mediator or disease occurs. The arrivals of the unknown components (U) are random events. When the U of a particular causal pie arrives and the other component(s) (E, M, or both), if any, in the causal pie all exist, the causal pie is completed, and as mentioned previously, the corresponding mediator or disease occurs. Otherwise, the U departs, and the completion of this causal pie is contingent on the same U arriving again.

Disease pathways

An individual can follow the paths depicted in Figure 1 to become diseased. A total of six distinct disease pathways can thus be identified (Figure 3):

  1. The exposure causes the disease directly (D2).
  2. The exposure causes the mediator, which in turn causes the disease (M2D3).
  3. The exposure causes the mediator, and then both interact to cause the disease (M2D4).
  4. The exposure and an exogenous mediator interact to cause the disease (M1D4).
  5. An exogenous mediator causes the disease directly (M1D3).
  6. Neither the exposure nor the mediator causes the disease (D1).

Figure 3 The six disease pathways.

Notes: (A) Pathway I; (B) Pathway II; (C) Pathway III; (D) Pathway IV; (E) Pathway V; (F) Pathway VI.

Abbreviations: D, disease; E, exposure; M, mediator.

Note that these pathways are examined at the individual level rather than at the population level, that is, the same person is followed up throughout for the path(s) he/she had actually taken to eventually become diseased. Therefore, we can distinguish the actual “operation” from the mere “presence” of a mediator.29,30 Pathways IV and V (Figure 3) only imply the presence of an “M” in the disease pathway. On the other hand, Pathways II and III (Figure 3) not only imply the presence of the M but also its operation (a person goes from E to M, and then to D), and the M now can truly be called a “mediator”.

Estimation of the causal-pie parameters

We assume that in the follow-up period, the arrival rates of U in the six classes of causal pies, denoted by λM1, λM2, λD1, λD2, λD3, and λD4, respectively, are constant. We also invoke the “no redundancy assumption”,28,31,32 that is, for each and every subject in the population, at most one U can arrive in a sufficiently short time interval.
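The two assumptions can be made concrete with a small simulation sketch. With constant arrival rates, only the causal pies whose other components (E, M) are already present matter, since a U that arrives to an incomplete pie simply departs. The rate values, function names, and seed below are hypothetical choices for illustration, not estimates from the paper.

```python
import random

# Illustrative arrival rates per unit time (hypothetical values).
RATES = {"M1": 0.05, "M2": 0.14, "D1": 0.01, "D2": 0.02, "D3": 0.03, "D4": 0.39}


def active_pies(exposed, has_mediator):
    """Causal-pie classes whose non-U components (E, M) are all present."""
    pies = ["D1"]
    if not has_mediator:          # the mediator can still be acquired
        pies.append("M1")
        if exposed:
            pies.append("M2")
    if exposed:
        pies.append("D2")
    if has_mediator:
        pies.append("D3")
        if exposed:
            pies.append("D4")
    return pies


def simulate_subject(exposed, T, rng):
    """Follow one subject up to time T.

    Returns (mediator path or None, disease path or None).
    """
    t, m_path = 0.0, None
    while True:
        pies = active_pies(exposed, m_path is not None)
        total = sum(RATES[p] for p in pies)
        t += rng.expovariate(total)      # waiting time until the next pie is completed
        if t >= T:
            return m_path, None          # still disease-free at the end of follow-up
        # No redundancy: exactly one pie is completed at this instant.
        pie = rng.choices(pies, weights=[RATES[p] for p in pies])[0]
        if pie.startswith("M"):
            m_path = pie                 # mediator acquired; follow-up continues
        else:
            return m_path, pie           # disease occurs; follow-up ends


rng = random.Random(2018)
print(simulate_subject(exposed=True, T=1.0, rng=rng))
```

Each simulated subject carries the path actually taken, which is exactly the information that attribution later tries to recover from the observed exposure, mediator, and disease status.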

One can conduct a cohort study to estimate the aforementioned six causal-pie parameters – λM1, λM2, λD1, λD2, λD3, and λD4. Suppose that there are n exposed subjects and m unexposed subjects in the cohort. At the start of the follow-up (t=0), all the subjects are mediator- and disease-free. During the follow-up period (from t=0 to t=T), for subjects who contracted the disease, the researcher records their mediator status at the moments they contracted the disease. For subjects who did not contract the disease during the follow-up period, the researcher records their mediator status at the end of the follow-up (t=T). A tally of subjects at the end of the follow-up is shown in Table 1. This dataset has a total of 6 degrees of freedom (2² − 1 = 3 for the exposed subjects and 2² − 1 = 3 for the unexposed), which is equal to the number of the unknown parameters. Therefore, λM1, λM2, λD1, λD2, λD3, and λD4 are just identifiable. See Supplementary materials for details of the estimation procedure.
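Because the model is just identified, the six rates can be sketched as the solution of six moment equations. The code below is one possible way to do this and is not necessarily the procedure in the Supplementary materials. It assumes that the cell indices are 1: M=0, D=0; 2: M=1, D=0; 3: M=0, D=1; 4: M=1, D=1, and it uses the counts from the Example section. All function names are hypothetical.

```python
import math


def solve_bisect(f, lo, hi, tol=1e-12):
    """Root of f on [lo, hi] by bisection; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def stage_rates(c1, c2, c3, c4, T=1.0):
    """Rates for one exposure group from its four cell counts.

    c1: M=0, D=0;  c2: M=1, D=0;  c3: M=0, D=1;  c4: M=1, D=1.
    Returns (mediator rate, disease rate without M, disease rate with M).
    """
    total = c1 + c2 + c3 + c4
    exit_rate = -math.log(c1 / total) / T           # rate of leaving the (M=0, D=0) state
    d_without_m = exit_rate * c3 / (c2 + c3 + c4)   # share of those exits that are disease
    m_rate = exit_rate - d_without_m

    def gap(b):
        # Model probability of (M=1, D=0) at time T minus the observed proportion.
        if abs(exit_rate - b) < 1e-12:
            p = m_rate * T * math.exp(-b * T)
        else:
            p = m_rate * (math.exp(-b * T) - math.exp(-exit_rate * T)) / (exit_rate - b)
        return p - c2 / total

    d_with_m = solve_bisect(gap, 1e-9, 50.0 / T)
    return m_rate, d_without_m, d_with_m


def estimate_lambdas(unexposed, exposed, T=1.0):
    """Moment-matching estimates of the six causal-pie arrival rates."""
    m0, d0, d0m = stage_rates(*unexposed, T=T)   # unexposed: M1; D1; D1 + D3
    m1, d1, d1m = stage_rates(*exposed, T=T)     # exposed: M1 + M2; D1 + D2; D1 + ... + D4
    return {
        "M1": m0,
        "M2": m1 - m0,
        "D1": d0,
        "D2": d1 - d0,
        "D3": d0m - d0,
        "D4": d1m - d1 - (d0m - d0),
    }


# Counts (m1..m4) and (n1..n4) as listed in the Example section, with T = 1.
lam = estimate_lambdas((9900, 490, 100, 10), (4850, 800, 150, 200))
for name, value in lam.items():
    print(f"lambda_{name} = {value:.4f}")
```

Rescaling T rescales all six rates by the same factor, so ratios of rates, and hence the attribution probabilities, do not depend on the time unit.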

Table 1 A tally of subjects at the end of the follow-up of a cohort study

Notes: Inside the box are the diseased subjects. n and m represent numbers of subjects.

Abbreviations: D, disease; E, exposure; M, mediator.

Attribution, a backward induction process

As pointed out earlier, attribution is a backward induction process, assessing whether the outcome under concern is induced by some variables. Thus, we reverse the direction of the usual DAG arrows in Figure 1 to become the “attribution arrows” (Figures 4 and 5). When an attribution arrow points at a variable (exposure or mediator), it means that the indicated variable is one cause of the outcome (disease or mediator, depending on the point from which the arrow originates). When an attribution arrow points at the exposure and the mediator simultaneously, it means that the exposure and the mediator interact to cause the disease. When an attribution arrow points at nothing, it means that neither the exposure nor the mediator is a cause of the disease (or the mediator).

Figure 4 Formulae for attribution.

Abbreviations: D, disease; E, exposure; M, mediator.

Given the six causal-pie parameters, we can compute the probability for any path (Figure 4). Consider the M-stage first (begin with the “M” in Figure 4 and follow the attribution arrows): an unexposed subject who acquires the mediator during the follow-up can only acquire it through path M1 (probability=1) but not path M2 (probability=0). An exposed subject who acquires the mediator can acquire it either through path M1 or M2, but not both (because of the no redundancy assumption). By Bayes theorem (Supplementary materials), the probabilities are $\lambda_{M1}/(\lambda_{M1}+\lambda_{M2})$ (path M1) and $\lambda_{M2}/(\lambda_{M1}+\lambda_{M2})$ (path M2), respectively. Next, consider the D-stage (begin with the “D” in Figure 4 and follow the attribution arrows) and also apply the Bayes theorem. An unexposed subject who acquires the disease but not the mediator during the follow-up can do so only through path D1 (probability=1). An unexposed subject who acquires the disease and the mediator can take either path D1 or D3 (with probabilities $\lambda_{D1}/(\lambda_{D1}+\lambda_{D3})$ and $\lambda_{D3}/(\lambda_{D1}+\lambda_{D3})$, respectively). An exposed subject who acquires the disease but not the mediator can take either path D1 or D2 (with probabilities $\lambda_{D1}/(\lambda_{D1}+\lambda_{D2})$ and $\lambda_{D2}/(\lambda_{D1}+\lambda_{D2})$, respectively). An exposed subject who acquires the disease and the mediator can take path D1, D2, D3, or D4 (with probabilities $\lambda_{D1}/\lambda_{D}$, $\lambda_{D2}/\lambda_{D}$, $\lambda_{D3}/\lambda_{D}$, and $\lambda_{D4}/\lambda_{D}$, respectively, where $\lambda_{D}=\lambda_{D1}+\lambda_{D2}+\lambda_{D3}+\lambda_{D4}$).

Now we can compute the probability for any pathway. First, we note that under the no redundancy assumption, no one can acquire the disease and the mediator at the same time. A subject who acquires both the disease and the mediator during the follow-up must acquire the mediator before acquiring the disease. To calculate the probability for a pathway that straddles an M-stage path and a D-stage path, we simply multiply the two corresponding probabilities for the two paths. Following this multiplication rule, we can attribute the disease to multiple pathways probabilistically for a diseased subject with known exposure and mediator status. For a subject with unknown exposure and/or mediator status or for all the diseased subjects in the population, we can use the cell counts inside the box in Table 1 as the weights (shown underneath each panel in Figure 4) for attribution.
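The multiplication rule can be sketched as follows for a single diseased subject with known exposure and mediator status. Each path probability is taken as the path's arrival rate divided by the sum of the rates of all paths open to that subject. The function names and the rate values are illustrative assumptions, not the published estimates.

```python
# Pathway name -> (M-stage path or None, D-stage path), following Figure 3.
PATHWAYS = {
    "I": (None, "D2"),
    "II": ("M2", "D3"),
    "III": ("M2", "D4"),
    "IV": ("M1", "D4"),
    "V": ("M1", "D3"),
    "VI": (None, "D1"),
}


def path_probabilities(lam, exposed, has_mediator):
    """Attribution probabilities of the M-stage and D-stage paths open to a diseased subject."""
    m_rates = {}
    if has_mediator:
        m_rates["M1"] = lam["M1"]
        if exposed:
            m_rates["M2"] = lam["M2"]
    d_rates = {"D1": lam["D1"]}
    if exposed:
        d_rates["D2"] = lam["D2"]
    if has_mediator:
        d_rates["D3"] = lam["D3"]
        if exposed:
            d_rates["D4"] = lam["D4"]

    def normalise(rates):
        total = sum(rates.values())
        return {path: rate / total for path, rate in rates.items()}

    return normalise(m_rates), normalise(d_rates)


def pathway_probabilities(lam, exposed, has_mediator):
    """Multiply the M-stage and D-stage path probabilities for each pathway."""
    m_paths, d_paths = path_probabilities(lam, exposed, has_mediator)
    probs = {}
    for name, (m_path, d_path) in PATHWAYS.items():
        p_m = 1.0 if m_path is None else m_paths.get(m_path, 0.0)
        probs[name] = p_m * d_paths.get(d_path, 0.0)
    return probs


# Illustrative rates (assumed values, roughly on the scale of the example cohort).
lam = {"M1": 0.049, "M2": 0.136, "D1": 0.0098, "D2": 0.0179, "D3": 0.0303, "D4": 0.389}
for pathway, p in pathway_probabilities(lam, exposed=True, has_mediator=True).items():
    print(f"Pathway {pathway}: {p:.3f}")
```

For every combination of exposure and mediator status, the six pathway probabilities sum to one, so each diseased subject is fully apportioned among the pathways.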

Next, we discuss attribution from three different perspectives: 1) attributing diseases to multiple pathways, 2) evaluating the potential impact of an intervention strategy, and 3) allocating responsibility in tort-law liability issues.

Attributing diseases to multiple pathways

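We can attribute the diseases in the population to the aforementioned six pathways. The population attributable fractions (PAF), which take into account all the diseased subjects in the population, are weighted sums of the pathway probabilities in Figure 4, with the diseased cell counts in Table 1 as the weights. Writing m3 and m4 (n3 and n4) for the numbers of unexposed (exposed) diseased subjects without and with the mediator, respectively, the PAFs are:

$$\mathrm{PAF}_{\mathrm{I}} = \frac{n_3\,\dfrac{\lambda_{D2}}{\lambda_{D1}+\lambda_{D2}} + n_4\,\dfrac{\lambda_{D2}}{\lambda_{D1}+\lambda_{D2}+\lambda_{D3}+\lambda_{D4}}}{m_3+m_4+n_3+n_4} \tag{1}$$

$$\mathrm{PAF}_{\mathrm{II}} = \frac{n_4\,\dfrac{\lambda_{M2}}{\lambda_{M1}+\lambda_{M2}}\cdot\dfrac{\lambda_{D3}}{\lambda_{D1}+\lambda_{D2}+\lambda_{D3}+\lambda_{D4}}}{m_3+m_4+n_3+n_4} \tag{2}$$

$$\mathrm{PAF}_{\mathrm{III}} = \frac{n_4\,\dfrac{\lambda_{M2}}{\lambda_{M1}+\lambda_{M2}}\cdot\dfrac{\lambda_{D4}}{\lambda_{D1}+\lambda_{D2}+\lambda_{D3}+\lambda_{D4}}}{m_3+m_4+n_3+n_4} \tag{3}$$

$$\mathrm{PAF}_{\mathrm{IV}} = \frac{n_4\,\dfrac{\lambda_{M1}}{\lambda_{M1}+\lambda_{M2}}\cdot\dfrac{\lambda_{D4}}{\lambda_{D1}+\lambda_{D2}+\lambda_{D3}+\lambda_{D4}}}{m_3+m_4+n_3+n_4} \tag{4}$$

$$\mathrm{PAF}_{\mathrm{V}} = \frac{m_4\,\dfrac{\lambda_{D3}}{\lambda_{D1}+\lambda_{D3}} + n_4\,\dfrac{\lambda_{M1}}{\lambda_{M1}+\lambda_{M2}}\cdot\dfrac{\lambda_{D3}}{\lambda_{D1}+\lambda_{D2}+\lambda_{D3}+\lambda_{D4}}}{m_3+m_4+n_3+n_4} \tag{5}$$

and

$$\mathrm{PAF}_{\mathrm{VI}} = \frac{m_3 + m_4\,\dfrac{\lambda_{D1}}{\lambda_{D1}+\lambda_{D3}} + n_3\,\dfrac{\lambda_{D1}}{\lambda_{D1}+\lambda_{D2}} + n_4\,\dfrac{\lambda_{D1}}{\lambda_{D1}+\lambda_{D2}+\lambda_{D3}+\lambda_{D4}}}{m_3+m_4+n_3+n_4} \tag{6}$$

for Pathways I, II, …, VI, respectively. It is worth noting that the six PAFs sum to one.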
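The six PAFs can be sketched in code as case-weighted sums of the path probabilities. The rate values below are approximate fits to the example cohort with T = 1; they are assumptions made here for illustration and are not the published estimates. The function name and argument names are also hypothetical.

```python
def pathway_pafs(lam, m3, m4, n3, n4):
    """Population attributable fractions for Pathways I to VI.

    m3, m4: unexposed diseased subjects without / with the mediator.
    n3, n4: exposed diseased subjects without / with the mediator.
    """
    via_m1 = lam["M1"] / (lam["M1"] + lam["M2"])  # exposed subject acquired M through path M1
    via_m2 = 1.0 - via_m1                         # ... or through path M2
    d_e0m1 = lam["D1"] + lam["D3"]                # open D-stage rates: unexposed, with M
    d_e1m0 = lam["D1"] + lam["D2"]                # exposed, without M
    d_e1m1 = lam["D1"] + lam["D2"] + lam["D3"] + lam["D4"]  # exposed, with M
    cases = m3 + m4 + n3 + n4
    return {
        "I": (n3 * lam["D2"] / d_e1m0 + n4 * lam["D2"] / d_e1m1) / cases,
        "II": n4 * via_m2 * lam["D3"] / d_e1m1 / cases,
        "III": n4 * via_m2 * lam["D4"] / d_e1m1 / cases,
        "IV": n4 * via_m1 * lam["D4"] / d_e1m1 / cases,
        "V": (m4 * lam["D3"] / d_e0m1 + n4 * via_m1 * lam["D3"] / d_e1m1) / cases,
        "VI": (m3 + m4 * lam["D1"] / d_e0m1 + n3 * lam["D1"] / d_e1m0
               + n4 * lam["D1"] / d_e1m1) / cases,
    }


# Approximate rates for the example cohort (assumed values, T = 1).
lam = {"M1": 0.0490, "M2": 0.1360, "D1": 0.0098, "D2": 0.0179, "D3": 0.0303, "D4": 0.3890}
pafs = pathway_pafs(lam, m3=100, m4=10, n3=150, n4=200)
for pathway, value in pafs.items():
    print(f"PAF_{pathway} = {value:.1%}")
```

Since every diseased subject is apportioned completely among the six pathways, the returned fractions sum to one regardless of the rate values.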

Evaluating the potential impact of an intervention strategy

We now consider the impact of a specific intervention. We note that if an intervention can block a segment of a pathway (for example, either path M2 or D3, but not necessarily both, of Pathway II), the whole pathway is blocked. To calculate the impact fraction for an intervention, we sum the PAFs for those pathways that are blocked by this intervention.

The impact fractions for a number of interventions are detailed below: 1) A complete removal of the exposure from the population: this would block paths M2, D2, and D4 and therefore Pathways I, II, III, and IV. The impact fraction for this intervention is PAFI + PAFII + PAFIII + PAFIV. 2) A complete obstruction of the exposure effect on the mediator: this would block path M2 and therefore Pathways II and III. The impact fraction for this intervention is PAFII + PAFIII. 3) A complete obstruction of the mediator effect on the disease: this would block paths D3 and D4 and therefore Pathways II, III, IV, and V. The impact fraction for this intervention is PAFII + PAFIII + PAFIV + PAFV.

Allocating responsibility in tort-law liability issues

As pointed out earlier, if probabilistic apportionment of causal responsibility is adopted for tort-law liability issues,16,17 the court needs to know the probability that the occurrence of the disease was induced by the particular exposure. We can follow the attribution arrow(s) of a pathway and examine whether the arrow points at the exposure to decide whether the exposure is involved in the pathway. If the attribution arrow of the disease points at the exposure and the mediator simultaneously, the probability that the exposure is involved is taken to be 0.5 (since there is no further information about which path is more likely to be actually taken). But when the attribution arrow of the mediator points again at the exposure, it is then known for certain that the exposure is involved somewhere in the causal chain. Using these rules, the exposure is allocated the full attributable fractions for Pathways I, II, and III, half of the attributable fraction for Pathway IV, and none of the attributable fractions for Pathways V and VI.

To be precise, for an exposed subject who contracts the disease, the contribution of the exposure to his/her disease – the “attributable fraction among the exposed” (AFE) – is as follows: 1) if the subject does not acquire the mediator during the follow-up (the “E=1, M=0” panel in Figure 4),


$$\mathrm{AFE}_{M=0} = \frac{\lambda_{D2}}{\lambda_{D1}+\lambda_{D2}} \tag{7}$$

2) if the subject acquires the mediator during the follow-up (the “E=1, M=1” panel in Figure 4),


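$$\mathrm{AFE}_{M=1} = \frac{\lambda_{D2} + \dfrac{\lambda_{M2}}{\lambda_{M1}+\lambda_{M2}}\left(\lambda_{D3}+\lambda_{D4}\right) + \dfrac{1}{2}\cdot\dfrac{\lambda_{M1}}{\lambda_{M1}+\lambda_{M2}}\,\lambda_{D4}}{\lambda_{D1}+\lambda_{D2}+\lambda_{D3}+\lambda_{D4}} \tag{8}$$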

and 3) if the mediator status of the subject is unknown,


$$\mathrm{AFE} = \frac{n_3\,\mathrm{AFE}_{M=0} + n_4\,\mathrm{AFE}_{M=1}}{n_3+n_4} \tag{9}$$
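The allocation rules can be sketched as follows: the exposure receives full credit for Pathways I, II, and III, half credit for Pathway IV, and no credit for Pathways V and VI. The rate values are approximate fits to the example cohort with T = 1 and are assumptions made here for illustration; the function name is hypothetical.

```python
def attributable_fraction_exposed(lam, n3, n4):
    """AFE for an exposed diseased subject, by mediator status and overall.

    n3, n4: exposed diseased subjects without / with the mediator (weights).
    """
    via_m1 = lam["M1"] / (lam["M1"] + lam["M2"])  # mediator acquired through path M1
    via_m2 = 1.0 - via_m1                         # mediator acquired through path M2

    # No mediator: only paths D1 and D2 are open; D2 is Pathway I (full credit).
    afe_m0 = lam["D2"] / (lam["D1"] + lam["D2"])

    # With mediator: Pathways I, II, III get full credit, Pathway IV half credit.
    d_total = lam["D1"] + lam["D2"] + lam["D3"] + lam["D4"]
    credited = (
        lam["D2"]                                 # Pathway I
        + via_m2 * (lam["D3"] + lam["D4"])        # Pathways II and III
        + 0.5 * via_m1 * lam["D4"]                # Pathway IV, probability 0.5
    )
    afe_m1 = credited / d_total

    # Unknown mediator status: weight by the exposed diseased counts.
    afe_all = (n3 * afe_m0 + n4 * afe_m1) / (n3 + n4)
    return afe_m0, afe_m1, afe_all


# Approximate rates for the example cohort (assumed values, T = 1).
lam = {"M1": 0.0490, "M2": 0.1360, "D1": 0.0098, "D2": 0.0179, "D3": 0.0303, "D4": 0.3890}
afe_m0, afe_m1, afe_all = attributable_fraction_exposed(lam, n3=150, n4=200)
print(f"AFE(M=0) = {afe_m0:.1%}, AFE(M=1) = {afe_m1:.1%}, AFE = {afe_all:.1%}")
```

The half credit for Pathway IV is a convention of the method rather than an estimate, so a court adopting a different convention would only need to change the 0.5 weight.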

Example

We use Richiardi et al’s33 cohort data (m1=9900, m2=490, m3=100, m4=10, n1=4850, n2=800, n3=150, and n4=200, using the notations in Table 1) as an example. For this dataset, using Robins and Greenland’s5 and Pearl’s6 methods, we can decompose the total effect of the exposure on the disease (0.048) into direct effect (0.028) and indirect effect (0.020). Using VanderWeele’s7 method, we can further decompose the total effect into four components: controlled direct effect (0.02), reference interaction (0.008), mediated interaction (0.019), and pure indirect effect (0.001). However, we cannot accomplish attribution using these previous methods.

We use the present method to analyze the data and estimate the six causal-pie parameters, λM1, λM2, λD1, λD2, λD3, and λD4 (R code in Supplementary materials). Richiardi et al33 did not mention the duration of the follow-up in their paper; as such, we assume T=1. Supplementary materials show that assuming a different T changes the six estimates by a constant proportion, and thus the estimates of the attributable fractions remain the same. Figure 5 presents the path probabilities.

Figure 5 Disease attributions for the example cohort.

Abbreviations: D, disease; E, exposure; M, mediator.

The PAFs for the six pathways are as follows: PAFI=22.8%, PAFII=2.2%, PAFIII=27.8%, PAFIV=10.0%, PAFV=2.4%, and PAFVI=34.7%. The six PAFs sum to 100% (the rounded values shown here add up to 99.9% because of rounding).

The impact fraction for a complete removal of the exposure from the population is 62.8%, for a complete obstruction of the exposure effect on the mediator is 30.0%, and for a complete obstruction of the mediator effect on the disease is 42.4%.
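These impact fractions can be reproduced from the reported PAFs with a short sketch. A pathway counts as blocked as soon as any one of its paths is blocked. The dictionary and function names are hypothetical; the PAF percentages are the rounded values reported above.

```python
# Paths that make up each disease pathway (Figure 3).
PATHWAY_PATHS = {
    "I": {"D2"},
    "II": {"M2", "D3"},
    "III": {"M2", "D4"},
    "IV": {"M1", "D4"},
    "V": {"M1", "D3"},
    "VI": {"D1"},
}

# Rounded PAFs (%) reported for the example cohort.
PAF = {"I": 22.8, "II": 2.2, "III": 27.8, "IV": 10.0, "V": 2.4, "VI": 34.7}


def impact_fraction(blocked_paths):
    """Sum the PAFs of every pathway that uses at least one blocked path."""
    blocked = set(blocked_paths)
    return sum(PAF[name] for name, paths in PATHWAY_PATHS.items() if paths & blocked)


interventions = {
    "remove the exposure": {"M2", "D2", "D4"},
    "obstruct exposure effect on mediator": {"M2"},
    "obstruct mediator effect on disease": {"D3", "D4"},
}
for label, blocked in interventions.items():
    print(f"{label}: {impact_fraction(blocked):.1f}%")
```

Other interventions can be evaluated in the same way by listing the paths they would block.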

For an exposed subject who contracts the disease, if the subject does not acquire the mediator during the follow-up, AFE(M=0) = 64.7%; if the subject acquires the mediator during the follow-up, AFE(M=1) = 84.5%; and if the mediator status of the subject is unknown, AFE = 76.0%.

Discussion

In this paper, we invoke three assumptions for the causal-pie model. The first assumption is the monotonicity assumption.1,13,14,27,28 Without this assumption, the number of the causal-pie classes (12; 3 for the M-stage and 9 for the D-stage) will be larger than the degrees of freedom of the data (6), which makes the causal-pie parameters non-identifiable. Researchers who use the present method should have prior knowledge that the effects of the exposure on the mediator and on the disease and the effect of the mediator on the disease are “monotonic”. To be precise, neither the “no exposure” nor the “no mediator” can be a component of any causal pie. Second, we assume that the arrival rates of the U’s are constant in the follow-up period. When the follow-up time is not too long (for example, less than 5 years), the assumption is reasonable or approximately so. The third assumption is the no redundancy assumption.28,31,32 This is a Poisson-like assumption, which is weaker than the assumption of independent competing causes.13,14,34 Even though two causal-pie classes have overlapping components, the assumption still holds if the overlapping components are not the last one arriving. In addition, the assumption only specifies at most one arrival event of the U’s in an infinitesimally short time interval. Non-rarity of the mediator or the disease for the entire follow-up period by itself does not necessarily imply the violation of the no redundancy assumption.

Controlling for confounding is essential in observational studies. One can stratify the data by the confounders and compute the attributable fractions for each and every stratum. One then uses the count of the diseased subjects in each stratum as the weight to pool the results. This will yield “adjusted” attributable fractions. The present method can also be extended to accommodate other variable types or more general situations. If the exposure or the mediator is multilevel (a continuous variable can be categorized into a multilevel one for an approximation; but caution should be exercised as this may create bias)35 – for example, the exposure has a total of k1 levels and the mediator, a total of k2 levels – under the monotonicity assumption there will be a total of k1 × k2 causal-pie classes for the D-stage. Furthermore, if the disease has a total of k3 subtypes, each with a total of k1 × k2 causal-pie classes, then there will be a total of k1 × k2 × k3 causal-pie classes. In addition, if there are multiple exposures or multiple mediators (an exposure-induced mediator-disease confounder36–38 can be viewed as another mediator; Figure 6), the total number of causal-pie classes will be even larger. It seems rather complex. But if one can conduct a large-scale cohort study and use appropriate statistical models, such as a multistate model,39–42 the many causal-pie parameters (or the state transition rates, using the terminology of a multistate model) can be amenable to estimation. Then, one simply follows the present method for attribution.
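The stratify-then-pool adjustment described at the start of this paragraph can be sketched as a case-weighted average. The two strata and their fractions below are made-up numbers for illustration only, and the function name is hypothetical.

```python
def pool_attributable_fractions(strata):
    """Pool stratum-specific fractions, weighting each stratum by its diseased count.

    strata: list of (diseased_count, {pathway: attributable fraction}) pairs.
    """
    total_cases = sum(cases for cases, _ in strata)
    pooled = {}
    for cases, fractions in strata:
        for pathway, fraction in fractions.items():
            pooled[pathway] = pooled.get(pathway, 0.0) + cases * fraction / total_cases
    return pooled


# Hypothetical strata of a confounder (made-up numbers).
strata = [
    (300, {"I": 0.20, "II": 0.05, "III": 0.25, "IV": 0.10, "V": 0.05, "VI": 0.35}),
    (100, {"I": 0.40, "II": 0.05, "III": 0.15, "IV": 0.10, "V": 0.10, "VI": 0.20}),
]
adjusted = pool_attributable_fractions(strata)
for pathway, value in adjusted.items():
    print(f"adjusted PAF_{pathway} = {value:.1%}")
```

Because the weights sum to one and each stratum's fractions sum to one, the adjusted fractions also sum to one.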

Figure 6 An exposure-induced mediator-disease confounder as another mediator.

Abbreviations: C, confounder; D, disease; E, exposure; M, mediator.

Last but not least, the causal-pie model by itself deserves careful scrutiny. Like the DAG, a causal-pie model depicts an overly simplified biology. But unfortunately, a direct biological modeling of exposure-disease relations considering all physical or chemical reactions among exposures, their metabolites, or their reaction products within individuals is seldom feasible. Previously, Siemiatycki and Thomas43 and Thompson44 held a pessimistic view that there is a limit of biological inference from epidemiologic data, since a number of very dissimilar mechanisms or models for disease development can often fit the same data equally well. Recently, an emerging interdisciplinary science, molecular pathological epidemiology (MPE), has come into focus.45–47 MPE uses molecular pathology tools to dissect disease pathways and mechanisms at molecular, individual, and population levels. Casting the causal-pie model in the MPE framework is a promising future research direction.

Acknowledgments

This paper is partly supported by grants from the Ministry of Science and Technology, Taiwan (MOST 105-2314-B-002-049-MY3 and MOST 104-2314-B-002-118-MY3). No additional external funding was received for this study. The funder had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Disclosure

The authors report no conflicts of interest in this work.

References

1. Rothman KJ, Greenland S, Lash TL, editors. Modern Epidemiology. 3rd ed. Philadelphia, PA: Lippincott Williams & Wilkins; 2008.

2. Bollen KA. Structural Equations with Latent Variables. New York, NY: John Wiley & Sons; 1989.

3. Kaplan D. Structural Equation Modeling: Foundations and Extensions. 2nd ed. Thousand Oaks, CA: SAGE; 2009.

4. Kline RB. Principles and Practice of Structural Equation Modeling. 4th ed. New York, NY: Guilford; 2015.

5. Robins JM, Greenland S. Identifiability and exchangeability for direct and indirect effects. Epidemiology. 1992;3(2):143–155.

6. Pearl J. Direct and indirect effects. In: Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence; August 2–5, 2001; Seattle, Washington. San Francisco, CA: Morgan Kaufmann; 2001.

7. VanderWeele TJ. A unification of mediation and interaction: a 4-way decomposition. Epidemiology. 2014;25(5):749–761.

8. Cole P, MacMahon B. Attributable risk percent in case-control studies. Brit J Prev Soc Med. 1971;25(4):242–244.

9. Miettinen OS. Proportion of disease caused or prevented by a given exposure, trait or intervention. Am J Epidemiol. 1974;99(5):325–332.

10. Walter SD. The estimation and interpretation of attributable risk in health research. Biometrics. 1976;32(4):829–849.

11. Bruzzi P, Green SB, Byar DP, Brinton LA, Schairer C. Estimating the population attributable risk for multiple risk factors using case-control data. Am J Epidemiol. 1985;122(5):904–914.

12. Benichou J. A review of adjusted estimators of attributable risk. Stat Methods Med Res. 2001;10(3):195–216.

13. Liao SF, Lee WC. Weighing the causal pies in case-control studies. Ann Epidemiol. 2010;20(7):568–573.

14. Lee WC. Completion potentials of sufficient component causes. Epidemiology. 2012;23(3):446–453.

15. Kelley HH, Michela JL. Attribution theory and research. Ann Rev Psychol. 1980;31:457–501.

16. Wright RW. Causation in tort law. California Law Review. 1985;73:1735–1828.

17. Robinson GO. Probabilistic causation and compensation for tortious risk. J Legal Stud. 1985;14:779–798.

18. Eide GE, Gefeller O. Sequential and average attributable fractions as aids in the selection of preventive strategies. J Clin Epidemiol. 1995;48(5):645–655.

19. Land M, Gefeller O. A game-theoretic approach to partitioning attributable risks in epidemiology. Biom J. 1997;39:777–792.

20. Land M, Vogel C, Gefeller O. Partitioning methods for multifactorial risk attribution. Stat Methods Med Res. 2001;10(3):217–230.

21. McElduff P, Attia J, Ewald B, Cockburn J, Heller R. Estimating the contribution of individual risk factors to disease in a person with more than one risk factor. J Clin Epidemiol. 2002;55(6):588–592.

22. Llorca J, Delgado-Rodríguez M. A new way to estimate the contribution of a risk factor in populations avoided nonadditivity. J Clin Epidemiol. 2004;57(5):479–483.

23. Rabe C, Lehnert-Batar A, Gefeller O. Generalized approaches to partitioning the attributable risk of interacting risk factors can remedy existing pitfalls. J Clin Epidemiol. 2007;60(5):461–468.

24. Rothman KJ. Causes. Am J Epidemiol. 1976;104(6):587–592.

25. Greenland S, Pearl J, Robins JM. Causal diagrams for epidemiologic research. Epidemiology. 1999;10(1):37–48.

26. Pearl J. Causality: Models, Reasoning, and Inference. 2nd ed. New York, NY: Cambridge University Press; 2009.

27. Hafeman DM. A sufficient cause based approach to the assessment of mediation. Eur J Epidemiol. 2008;23(11):711–721.

28. Suzuki E, Yamamoto E, Tsuda T. On the relations between excess fraction, attributable fraction, and etiologic fraction. Am J Epidemiol. 2012;175(6):567–575.

29. VanderWeele TJ. Mediation and mechanism. Eur J Epidemiol. 2009;24(5):217–224.

30. Suzuki E, Yamamoto E, Tsuda T. Identification of operating mediation and mechanism in the sufficient-component cause framework. Eur J Epidemiol. 2011;26(5):347–357.

31. Gatto NM, Campbell UB. Redundant causation from a sufficient cause perspective. Epidemiol Perspect Innov. 2010;7:5.

32. Lee WC. Assessing causal mechanistic interactions: a peril ratio index of synergy based on multiplicativity. PLoS One. 2013;8(6):e67424.

33. Richiardi L, Bellocco R, Zugna D. Mediation analysis in epidemiology: methods, interpretation and bias. Int J Epidemiol. 2013;42(5):1511–1519.

34. Robins JM, Greenland S. Estimability and estimation of excess and etiologic fractions. Stat Med. 1989;8(7):845–859.

35. VanderWeele TJ, Chen Y, Ahsan H. Inference for causal interactions for continuous exposures under dichotomization. Biometrics. 2011;67(4):1414–1421.

36. Vansteelandt S, VanderWeele TJ. Natural direct and indirect effects on the exposed: effect decomposition under weaker assumptions. Biometrics. 2012;68(4):1019–1027.

37. Tchetgen Tchetgen EJ, VanderWeele TJ. Identification of natural direct effects when a confounder of the mediator is directly affected by exposure. Epidemiology. 2014;25(2):282–291.

38. VanderWeele TJ, Vansteelandt S, Robins JM. Effect decomposition in the presence of an exposure-induced mediator-outcome confounder. Epidemiology. 2014;25(2):300–306.

39. Kalbfleisch J, Lawless JF. The analysis of panel data under a Markov assumption. J Am Stat Assoc. 1985;80(392):863–871.

40. Kay R. A Markov model for analysing cancer markers and disease states in survival studies. Biometrics. 1986;42(4):855–865.

41. Jackson CH. Multi-state models for panel data: the msm package for R. J Stat Softw. 2011;38(8):1–28.

42. Welton NJ, Ades AE. Estimation of Markov chain transition probabilities and rates from fully and partially observed data: uncertainty propagation, evidence synthesis, and model calibration. Med Decis Making. 2005;25(6):633–645.

43. Siemiatycki J, Thomas DC. Biological models and statistical interactions: an example from multistage carcinogenesis. Int J Epidemiol. 1981;10(4):382–387.

44. Thompson WD. Effect modification and the limits of biological inference from epidemiologic data. J Clin Epidemiol. 1991;44(3):221–232.

45. Ogino S, Chan AT, Fuchs CS, Giovannucci E. Molecular pathological epidemiology of colorectal neoplasia: an emerging transdisciplinary and interdisciplinary field. Gut. 2011;60(3):397–411.

46. Ogino S, Nishihara R, VanderWeele TJ, et al. The role of molecular pathological epidemiology in the study of neoplastic and non-neoplastic diseases in the era of precision medicine. Epidemiology. 2016;27(4):602–611.

47. Richiardi L, Barone-Adesi F, Pearce N. Cancer subtypes in aetiological research. Eur J Epidemiol. 2017;32(5):353–361.

Creative Commons License © 2018 The Author(s). This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms.php and incorporate the Creative Commons Attribution - Non Commercial (unported, v3.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.