Using Routinely Collected Electronic Healthcare Record Data to Investigate Fibrotic Multimorbidity in England

Background Electronic healthcare records (EHRs) are used to document diagnoses, symptoms, tests, and prescriptions. Though not primarily collected for research purposes, owing to the size of the data as well as the depth of information collected, they have been used extensively to conduct epidemiological research. The Clinical Practice Research Datalink (CPRD) is an EHR database containing representative data of the UK population with regard to age, sex, race, and social deprivation measures. Fibrotic conditions are characterised by excessive scarring, contributing towards organ dysfunction and eventual organ failure. Fibrosis is associated with ageing as well as many other factors, it is hypothesised that fibrotic conditions are caused by the same underlying pathological mechanism. We calculated the prevalence of fibrotic conditions (as defined in a previous Delphi survey of clinicians) as well as the prevalence of fibrotic multimorbidity (the proportion of people with multiple fibrotic conditions). Methods We included a random sample of 993,370 UK adults, alive, and enrolled at a UK general practice, providing data to the CPRD Aurum database as of 1st of January 2015. Individuals had to be eligible for linkage to hospital episode statistics (HES) and ONS death registration. We calculated the point prevalence of fibrotic conditions and multi-morbid fibrosis on the 1st of January 2015. Using death records of those who died in 2015, we investigated the prevalence of fibrosis associated death. We explored the most commonly co-occurring fibrotic conditions and determined the settings in which diagnoses were commonly made (primary care, secondary care or after death). Results The point prevalence of any fibrotic condition was 21.46%. In total, 6.00% of people had fibrotic multimorbidity. Of the people who died in 2015, 34.82% had a recording of a fibrotic condition listed on their death certificate. Conclusion The key finding was that fibrotic multimorbidity affects approximately 1 in 16 people.


Introduction
Fibrosis can affect any organ, and fibrotic conditions are characterised by excessive, uncontrolled deposition of extracellular matrix in the affected site, which in turn alters the tissue extracellular environment, leading to organ failure. 1 There is growing evidence that high levels of inflammatory factors, such as IL14, are associated with an increased risk of idiopathic pulmonary fibrosis (IPF), with potential to target inflammatory factors to potentially prevent and treat fibrosis. 2,35][6] There is currently no cure for fibrotic diseases, and prognosis is poor and often comparable to end-stage cancer. 7It has been suggested that fibrotic conditions are associated with 45% of deaths in the industrialised world, therefore leading to significant healthcare burden; however, although this statistic is consistently quoted in fibrotic research, the source data for this estimate are unknown. 8As such, fibrotic diseases constitute a large research realm, with the possibility of shared mechanisms across multiple conditions.
Multimorbidity is defined as when a patient has two or more chronic health conditions. 9It is known that multimorbidity increases burden on healthcare systems as well as patients, therefore it is important to understand potential drivers of multimorbidity as well as phenotypes of patients most likely to be affected by multi-morbid conditions. 10We have adaptively termed "fibrotic multimorbidity" as the co-occurrence of 2 or more fibrotic conditions.
We previously conducted a Delphi study to gather clinical consensus on which diseases are fibrotic and identified 256 diseases which exhibit fibrotic manifestations. 11In this study, we defined and applied code lists for each of these diseases, to routinely collected electronic healthcare record data to determine the prevalence of fibrotic conditions as well as fibrotic multimorbidity.
The main objectives of this work were to calculate the prevalence of fibrotic conditions in the UK using the EHRs of a random sample of adults, and subsequently we looked to understand how many people in the sample suffered from multiple fibrotic conditions (fibrotic multimorbidity).
To assess the likelihood of co-occurring fibrotic multimorbidity, we focussed subsequent analysis on three single-organ fibrotic diseases of pulmonary fibrosis, liver cirrhosis, and urinary tract fibrosis as codes for these conditions would always indicate the presence of fibrosis as these conditions are characterised by fibrosis no matter the stage of progression.
We believe this to be the first piece of work investigating the co-occurrence of fibrotic conditions using large, detailed, routinely collected, representative healthcare data, highlighting the large prevalence of fibrotic conditions in the UK population.

Data Source
We used the Clinical Practice Research Datalink (CPRD) Aurum, a nationally representative database of anonymised primary care electronic healthcare records.CPRD Aurum contains the data of around 23% of the UK population and is representative of the UK population with regard to age, sex and deprivation. 12Clinical information is entered with SNOMED CT (systematised nomenclature of medicine-clinical terms) codes.Linked mortality data from the Office for National Statistics (ONS), and secondary care data from Hospital Episode Statistics (HES) were provided for this study by CPRD.

Study Population
Beginning with a random sample of 1 million UK adults, we derived a random sample of 993,370 people who were eligible to be included in the analysis (the data of 6630 people was removed as they did not contribute at least 1 year of data).The patients had to be enrolled at the general practice before the 1st of January 2014 and still enrolled at the GP on the 1st of January 2015, they also had to be eligible for linkage to HES and ONS.

Outcomes of Interest
Fibrotic conditions were previously defined in a Delphi study consisting of three survey rounds using clinical consensus as an indicator of whether a condition was fibrotic. 2The following organ/disease groups were used: pulmonary fibrosis, urinary tract fibrosis, liver cirrhosis, diabetes, blood vessel fibrosis, atherosclerosis, bile duct fibrosis, cardiomyopathy, integumentary fibrosis, intestinal/pancreatic fibrosis, lymphatic fibrosis, peritoneal fibrosis, reproductive fibrosis, skeletal fibrosis, systemic fibrosis, and fibrosis of the cardiac valves.SNOMED CT codes were used to identify these conditions in primary care data whilst ICD 10 codes were used to identify diagnoses from both HES and ONS data sources.These codelists are available on Github: https://github.com/NHLI-Respiratory-Epi/Fibrotic-multimorbidity

Statistical Methods
We initially described the baseline characteristics (age and sex), of each of the subpopulations diagnosed with fibrotic conditions and reported the point prevalence of these conditions as well as the point prevalence of multi-morbid fibrosis as of the 1st of January 2015.We also calculated the proportion of deaths in 2015 which had a record of a fibrotic condition within one of the positions of cause of death (ONS records can contain up to 15 codes which can be used to detail conditions which contributed towards cause of death).
Using the data of people who died in 2015, we investigated the location that diagnoses were made.We defined the location as either a primary care setting (if the earliest date was found in the CPRD observation file), a secondary care setting (if the earliest date was found in a HES admitted patient care file), or post-mortem (if the diagnosis was only found in one of the codes for any of the positions in the cause of death).The codes present in ONS death records are in the form of ICD 10 codes, the same coding system used by hospitals.Therefore, we applied the same codelists to the ONS records that we used in the HES records.
We then focused on three subpopulations: people with a diagnosis of pulmonary fibrosis, those with urinary tract fibrosis and those with liver cirrhosis.These conditions were selected as they were deemed to always be fibrotic and were therefore best poised to be included in analysis of fibrotic multimorbidity as codes for these conditions would always identify the presence of fibrosis.Confidence intervals of 99% were applied to support robust associations.Logistic regression was performed adjusting for age (quartiles) and sex to investigate the relationship between one of the three selected fibrotic conditions of interest (exposures) and the other 13 fibrotic conditions (outcomes).We explored the five most commonly occurring fibrotic conditions, both prior and post diagnosis of each of these conditions, and reported the median time between these.Finally, we calculated the odds of being diagnosed with each of the fibrotic conditions, given a diagnosis of one of pulmonary fibrosis, urinary tract fibrosis, or liver cirrhosis, respectively, for example, the odds of being diagnosed with reproductive fibrosis given a diagnosis of pulmonary fibrosis.
All analyses were performed using Stata version 17.If the prevalence of a fibrotic condition in the random sample of 1 million was less than 0.05%, it was removed from the analysis to preserve anonymity.

Results
Of the 993,370 people included in this study, the median age was 46 years and 50.26% of the population were female.The median follow-up time for all patients was 10 years (3963 days, IQR: 1875-8268).Cardiac cirrhosis, cardiac fibrosis, fibrosis of the spleen, fibrosis of the nervous system, appendix fibrosis, and oral fibrosis were not included in the analysis as the prevalence of these conditions was less than 0.05%.
The overall point prevalence of fibrotic diseases in the entire sample was 21.46% (99% CI: 21.35-21.56).As of the 1st of January 2015, the median age of people with prevalent fibrotic disease was 65 years (IQR: 52-77) and 53.32% of those with at least one fibrotic condition were female (Supplementary Table 1).In total, 15.46% of people had a record of one fibrotic disease whilst 6.00% of people had two or more fibrotic conditions in their records and therefore had fibrotic multimorbidity (Table 1).
Of those within the sample who died in 2015 (n = 21,857), 34.82% had a record of at least one fibrotic condition in at least one of the positions of cause of death, whilst 7.85% had two fibrotic conditions present on their death certificate.Of the people who died in 2015, 6.89% had a fibrotic condition recorded as their primary cause of death.Of 7611 people who had a fibrotic condition listed on their death record, 19.42% (n = 1478) did not have a record of a fibrotic condition in their GP or hospital records.
Most fibrotic conditions (72.83%) were diagnosed in either primary or secondary care settings (Figure 2 and Supplementary Table 3).However, of the people with a record of cardiomyopathy 80.50% of their diagnoses of cardiomyopathy were first reported on their death certificate, similarly of the people with pulmonary fibrosis, 82.06% of the diagnoses were first reported on death certification; of the 301 people who died in 2015 and were diagnosed with  lymphatic fibrosis, 79.40% of the diagnoses were first recorded on death certificates.Of the diabetes diagnoses, 89.62% were diagnosed in a primary care setting whilst 65.11% of skeletal fibrosis diagnoses were also made in primary care.Most diagnoses of atherosclerosis, biliary fibrosis, and fibrosis of the intestines/pancreas were made in secondary care (97.11%, 73.22% and 86.31%, respectively).There were significantly greater odds of being diagnosed with one of systemic fibrosis, lymphatic fibrosis and peritoneal fibrosis when comparing people with and without a diagnosis of pulmonary fibrosis (Unadjusted analyses: OR systemic: 9.66, 99% CI: 7.55-12.36,OR lymphatic: 8.07, 99% CI: 5.55-11.73,OR peritoneal: 3.68, 99% CI: 1.79-7.58),(Adjusted analyses: ORadj: 6.16, 99% CI: 4.94-7.69,ORadj: 3.78, 99% CI: 2.65-5.38 and ORadj: 3.28, 99% CI: 3.28, 99% CI: 1.98-5.46,respectively) (Figure 3).A significantly greater odds of being diagnosed with peritoneal fibrosis was observed when comparing people diagnosed with urinary tract fibrosis compared with those not diagnosed with urinary tract fibrosis (OR: 15.32, 99% CI: 12.32-19.06;ORadj: 12.04, 99% CI: 9.53-15.21).Significantly greater odds of being diagnosed with bile duct fibrosis were identified when comparing people diagnosed with liver cirrhosis with people who did not have a diagnosis of liver cirrhosis in their records (OR: 9.41, 99% CI: 8.31-10.66;ORadj: 7.95, 99% CI: 7.10-8.90).
A total of 1741 (28.74%) people diagnosed with pulmonary fibrosis and another fibrotic condition and were included in analysis of the 5 most commonly co-occurring fibrotic conditions, both prior and post diagnosis (Supplementary Figure 1).Post-diagnosis, the most common condition was fibrosis of the intestines/pancreas which 101 people were diagnosed with, the median time between these diagnoses was 2.1 years (IQR: 0.3-5.2).The most common diagnosis prior to pulmonary fibrosis was skeletal fibrosis (n = 397) the median time between these diagnoses was 7.5 years (IQR: 2.3-15.9).Using the earliest entry date of diagnoses of fibrotic conditions, the proportions of diagnoses being made in primary and secondary care settings as well as on death certificate was calculated.This analysis used the data of people who died in 2015.
A total of 9827 (29.99%) people had a record of urinary tract fibrosis and another fibrotic condition (Supplementary Figure 2).The most commonly occurring fibrotic condition after diagnosis of urinary tract fibrosis was intestinal/pancreatic fibrosis (n = 815) for which the median time between diagnoses was 1.5 years (IQR: 0.0-3.9).The most diagnosed fibrotic condition prior to urinary tract fibrosis was diabetes (n = 3641), the median time between diagnoses was 8.1 years (IQR: 3.3-14.4).
Of the 2604 (26.95%) people who had a diagnosis of liver cirrhosis and another fibrotic condition in their records, diabetes was the most diagnosed fibrotic condition both before and after diagnosis of liver cirrhosis (Supplementary Figure 3).In total 353 people were diagnosed with diabetes after being diagnosed with liver cirrhosis, the median time between these diagnoses was 0.6 years (IQR: 0.0-4.0).A total of 667 people were diagnosed with diabetes prior to being diagnosed with liver cirrhosis.The median time between these diagnoses was 4.4 years (IQR: -1.6-9.0).

Discussion
Using routinely collected electronic healthcare records, we have determined the prevalence of fibrotic conditions in a random sample of adults in England, as of 2015.Of the 993,370 people included in this analysis, 21.46% had at least one diagnosis of a fibrotic condition, whilst 6.00% of the cohort had two or more diagnoses of fibrotic conditions and therefore fibrotic multimorbidity.The most prevalent fibrotic condition was found to be diabetes.Investigation of the setting in which conditions were first diagnosed identified that diabetes diagnoses were commonly made in primary care as well as diagnoses of skeletal fibrosis, conversely diagnoses of atherosclerosis and intestinal/pancreatic fibrosis were made in secondary care.We found that of the people in the sample that died in 2015, 34.82% had a recording of a fibrotic condition listed on their death certificate, whilst 6.89% of deaths were due to a fibrotic condition (as recorded in the primary cause of death).The United States Government estimated that around 45% of deaths in the USA were 'attributed to fibrotic disorders'; 8 we have shown that of a random sample of deaths in England in 2015, 34.82% of the records of these people included a code for a fibrotic condition in at least one of the positions of cause of death.
We demonstrated that the odds of being diagnosed with an additional fibrotic condition were greater for people with a fibrotic condition compared with those without, using three example conditions (of pulmonary fibrosis, urinary tract fibrosis, or liver cirrhosis).
The prevalence of fibrotic conditions derived from this random sample of adults is similar to previously recorded prevalence values; for instance, we identified the prevalence of diabetes in our cohort as 9.19%, whilst Diabetes UK reported the prevalence in the UK as of 2015 being 9%. 13This demonstrates that the result of our analysis is likely to be representative of the prevalence of fibrotic conditions in England.However, in some instances, it is harder to directly compare the derived prevalence as the groups were mostly defined based on the affected organ; intestinal/pancreatic fibrosis was found to be prevalent in 5.64% of the sample and this broad definition including conditions such as Crohn's diseases, colitis, inflammatory bowel disease and pancreatitis.Crohn's and colitis are estimated to effect 0.81% of the population. 14The prevalence of irritable bowel disease in the UK has been previously found to be 1.42% in 2016 whilst the prevalence of pancreatitis (determined using UK Biobank) was found to be 0.163%. 15Integumentary fibrosis was found in our sample to be prevalent in 2.80%; previous research has found that Raynaud's phenomenon (included diseases in our definition of integumentary fibrosis) affects around 5% of the population. 16It has been previously found that lung damage and microvascular complications are associated with systemic sclerosis. 17One in ten women in the UK of reproductive age suffers from endometriosis, 18 and reproductive fibrosis which included endometriosis was found to be recorded in 1.90% of the random sample of UK adults.
Many of the conditions identified in the previous Delphi survey are not idiopathic and instead are a result of chronic organ inflammation. 9This adds complexity when trying to understand the co-occurrence of these conditions, as it may not be due to an underlying "fibrotic" mechanism but instead randomness, other ageing processes or other causes which are yet to be identified.As we used routinely collected electronic healthcare records, we were unable to study genetic markers or biomarkers.As we used electronic healthcare records, we conducted a Delphi survey of clinicians as these are the professionals who enter the data into the records.We have not studied this from a pathological angle, a condition that a clinician may deem to be fibrotic and could be contested by a pathologist.

Non-Clinical Implications
We have shown that 1 in 5 adults in the UK have fibrotic conditions, this is a significant statistic which highlights the burden of fibrotic conditions on both patients and healthcare systems.As a result, it is important that resources be sufficiently allocated for the treatment and management of these patients in both primary and secondary care services.Also, we found that 1 in 16 people had multiple fibrotic conditions.This indicates that there could be overlapping and potentially contradicting care plans in place and risking polypharmacy, and therefore, we encourage care providers to conduct multidisciplinary team meetings between specialities to provide optimal care.

Clinical Implications
We have shown that many of the fibrotic conditions were first reported on death certification rather than in primary or secondary care records, and it is highly likely that the conditions are not being diagnosed and are therefore untreated.As we have shown greater odds in being diagnosed with other fibrotic conditions in people with a diagnosis of either pulmonary fibrosis, urinary tract fibrosis, and liver cirrhosis, it is important to understand what factors increase the risk of developing multiple fibrotic conditions such as genetics or medications.As such, it is important to consider other fibrotic condition diagnoses and investigations in patients with one of these three conditions.

Strengths and Limitations
CPRD Aurum data is known to be representative of the UK population with respect to age, sex, ethnicity, and social deprivation.This database is derived from entries that GPs create, however free text is not available, we triangulated this data by linking the data with hospitalisations and death records (where possible) to gain the most detailed possible account of people's health.The prevalence estimates provided here were derived from a random sample of 993,370 adults and are therefore estimates of the prevalence of these diseases within the British population.As some of the conditions were first reported on the death certificate, we have likely underestimated the prevalence of conditions on the 1st of January 2015 as it is possible that these underlying undiagnosed conditions would have been found in these people at that point in time if investigations had occurred.We have used broad definitions of fibrotic disease, sub-grouped based on the affected organs, therefore these data cannot be interpreted for individual conditions, however, we have been able to descriptively analyse the data, which is available using broad overarching methodologies.The fibrotic conditions which were investigated were defined by clinicians, however some of these conditions may only be fibrotic in the later stages of the disease, however it was not possible to define the stages/severity of disease in this analysis.When applying logistic regression models, both age and sex were adjusted for, however we acknowledge that fibrotic conditions could have a multitude of comorbidities as well as other contributing factors including and not limited to smoking status and alcohol consumption which could confound the result.However, as we were analysis groups of fibrotic conditions, it was not possible to account for all possible confounders.

Conclusion
Using a large sample of UK adults, we have investigated the prevalence of fibrotic conditions in electronic healthcare records as well as the prevalence of fibrotic multimorbidity.We found that as of 2015, the prevalence of fibrotic conditions in a random sample of UK adults was 21.46%, and the prevalence of fibrotic multimorbidity was 6.00%.We have analysed information of those within the cohort who died in 2015, of which 34.82% had at least one fibrotic condition recorded on their death certification.Using three cohorts we showed that there is an increased chance of being diagnosed with a fibrotic condition if you already have a fibrotic condition.We believe this to be the first piece of work investigating the co-occurrence of fibrotic conditions using large, routinely collected, representative healthcare data, highlighting the large prevalence of fibrotic conditions in the UK population as well as the co-occurrence of fibrotic conditions.It is important that fibrotic conditions be diagnosed at the earliest opportunity, with diagnoses being made in life so that interventions such as treatment or palliation can be accessed at the earliest point.

Data Sharing Statement
Data may be obtained from a third party and are not publicly available.Linked pseudonymised mortality data from the Office for National Statistics (ONS) and secondary care data from Hospital Episode Statistics (HES) were provided for this study by CPRD for patients in England.Data are linked by NHS Digital, the statutory trusted third party for linking data, using identifiable data held only by NHS Digital.Select general practices consent to this process at a practice level, with individual patients having the right to opt-out.Use of HES and ONS data is Copyright © (2018), reused with the permission of The Health & Social Care Information Centre, all rights reserved.Data are available on request from the CPRD.Their provision requires the purchase of a license, and this license does not permit the authors to make them publicly available to all.This work used data from the version collected in May 2022 and has clearly specified the data selected in each Methods section.To allow identical data to be obtained by others, via the purchase of a license, the code lists will be provided upon request.Licenses are available from the CPRD (http://www.cprd.com):The Clinical Practice Research Datalink Group, The Medicines and Healthcare products Regulatory Agency, 10 South Colonnade, Canary Wharf, London E14 4PU.

Figure 1
Figure 1Barplot of prevalence of fibrotic conditions.The prevalence of fibrotic conditions derived from both CPRD and HES admitted patient care, displayed alongside a 99% confidence interval.

Figure 2
Figure 2Location of diagnoses.Using the earliest entry date of diagnoses of fibrotic conditions, the proportions of diagnoses being made in primary and secondary care settings as well as on death certificate was calculated.This analysis used the data of people who died in 2015.

Figure 3
Figure 3 Forest plot of odds ratio of being diagnosed with a fibrotic condition.Using logistic regression, the log-odds ratio of having a record of a fibrotic condition, given a record of: (A) pulmonary fibrosis, (B) urinary tract fibrosis, (C) liver cirrhosis.

Table 1
Number of Fibrotic ConditionsThe number of fibrotic conditions experienced by the people in the cohort, using CPRD and HES Admitted Patient Care data. Notes: