Under-recording of hospital bleeding events in UK primary care: a linked Clinical Practice Research Datalink and Hospital Episode Statistics study
Received 6 April 2018
Accepted for publication 7 June 2018
Published 4 September 2018 Volume 2018:10 Pages 1155–1168
Review by Single anonymous peer review
Editor who approved publication: Professor Henrik Sørensen
Laura McDonald,1,* Cormac J Sammon,2,* Mihail Samnaliev,2 Sreeram Ramagopalan1
1Centre for Observational Research and Data Sciences, Bristol-Myers Squibb, Uxbridge, UK; 2PHMR, Berkeley Works, London, UK
*These authors contributed equally to this work
Background: Primary care databases represent a rich source of data for health care research; however, the quality of recording of secondary care events in these databases is uncertain. This study sought to investigate the completeness of recording of hospital admissions for bleeds in primary care records and explore the impact of incomplete recording on estimates of bleeding risk associated with antithrombotic treatment.
Methods: The study population consisted of adults with non-valvular atrial fibrillation who had at least one bleed recorded in either the Clinical Practice Research Datalink (CPRD) or Hospital Episode Statistics (HES) while receiving prescriptions for an oral anticoagulant. The proportion of bleeds recorded in HES that had a corresponding bleed recorded in the subsequent 12 weeks in CPRD was calculated, and factors associated with having a corresponding record were identified. Cox proportional hazards analyses investigating the hazard of subsequent bleeding associated with antithrombotic treatment were carried out using linked CPRD-HES data and using CPRD only data, and the results were compared.
Results: Less than 20% of the 14,361 bleeds recorded in the HES data had a corresponding bleed coded in the CPRD in the subsequent 12 weeks. This proportion varied by bleed characteristics, calendar time, day of week of admission (weekday vs weekend) and oral anticoagulant treatment at the time of the bleed. The hazards of subsequent bleeding associated with vitamin K antagonists (VKAs) and antiplatelet agents (APAs) relative to no antithrombotic treatment were similar using the linked primary and secondary care dataset (VKA HRadj 1.06 CI95 0.96–1.16; APA HRadj 1.08 CI95 0.96–1.21) and the unlinked primary care data (VKA HRadj 1.12 CI95 1.01–1.24; APA HRadj 1.06 CI95 0.95–1.20).
Conclusion: Secondary care bleeding events are not completely recorded in primary care records and under-recording may be differential with respect to a variety of factors, including antithrombotic treatment. While the impact of under-recording on estimates of the comparative safety of antithrombotic drugs was limited, the extent of the under-recording suggests its potential impact should be considered, and ideally evaluated in future studies utilizing stand-alone primary care data.
Keywords: real-world data, data linkage, comparative effectiveness, secondary care, atrial fibrillation
Within the UK National Health Service (NHS), services which typically act as the first point of contact with the health care system are referred to as “primary care” and include general practitioners (GPs), dentists, pharmacists and optometrists. Within the NHS, the GP also plays the role of gatekeeper, managing referral to most non-emergency secondary (hospital and community) and tertiary (highly specialized) health care services. As a result, the majority of the UK population are registered with a GP and the GP record is the patient’s primary medical record.1 In line with this, guidelines indicate that the details of secondary care encounters should be routinely communicated to an individual’s GP practice in order to allow for these details to be recorded and ensure continuity of care.2 Databases containing data collected in UK primary care have therefore been widely used as a stand-alone resource for research into medical conditions and the drugs used to treat them.3
More recently, the linkage of English secondary and primary care datasets has facilitated the conduct of studies exploring the extent to which secondary care events are coded in primary care records. A number of these studies have found coding to be suboptimal, with 17% of cancers, 34% of GI bleeds, 21% of myocardial infarctions, 22% of poisoning events and 9% of fractures recorded in the linked dataset not appearing in the primary care record.4–7 These results suggest the use of primary care records as a stand-alone source for research into these conditions is unsuitable and may generate bias.
In order to explore the potential for UK primary care databases to generate real world evidence (RWE) on the safety and effectiveness of antithrombotic treatment, this study investigated the extent to which secondary care bleeds are coded in primary care records among a cohort of individuals with non-valvular atrial fibrillation (NVAF). The study also sought to understand the impact of incomplete recording on estimates of bleeding risk associated with antithrombotic treatment.
The study was carried out using a linked Clinical Practice Research Datalink (CPRD) – Hospital Episode Statistics (HES) dataset. This dataset combines anonymized medical-record data for patients registered with participating GPs in England (the CPRD dataset) with details of their admissions to NHS hospitals (the HES dataset). The linked dataset therefore includes longitudinal information on diagnoses, symptoms, laboratory tests and prescriptions issued by the GP in addition to information on referrals to specialists, hospital admission diagnoses, hospital procedures and deaths.8 Clinical events in the CPRD are recorded using the “Read code” clinical coding system. Hospital discharge diagnoses in HES are recorded using the International Classification of Diseases, 10th Revision (ICD-10) clinical coding system. More than 98% of the UK population are registered with a GP and individuals registered with a GP must opt out of data collection in order to be excluded from the CPRD dataset. Despite over-representing certain geographical areas of the UK, the CPRD has been found to be representative of the UK population with regard to sex, age and ethnicity.8 HES captures information on all NHS hospital admissions occurring in England and on admissions to independent sector providers if funded by the NHS (an estimated 98–99% of hospital activity).9
Recording of secondary care bleeds in primary care data
The study population consisted of all adults with a diagnosis of atrial fibrillation recorded in the CPRD or HES who had at least one clinically relevant bleed recorded in either data source between 1 January 2003 and 31 January 2016 while receiving prescriptions for oral anticoagulant (OAC) treatment. Individuals with codes indicating their atrial fibrillation was valvular were excluded because, despite sharing the same electrophysiological abnormality, valvular atrial fibrillation has a different etiology that warrants separate consideration of such individuals. Code lists defining atrial fibrillation, valvular conditions and clinically relevant bleeds are provided in the data supplement (Tables 1–6).
Table 1 ICD codes used to identify individuals with atrial fibrillation
Table 2 ICD codes used to identify and exclude individuals whose atrial fibrillation was valvular in nature
Table 3 Read codes used to identify individuals with atrial fibrillation
Abbreviations: ECG, electrocardiogram; NEC, not elsewhere classified; H/O, history of; NOS, not otherwise specified.
Within this population, all clinically relevant bleeding events recorded in the HES and the CPRD were identified using relevant diagnostic codes and classified according to the location in the body in which they occurred (Tables 5 and 6). We refer to “clinically relevant bleeds” to distinguish these from minor bleeds which are not clinically consequential; such bleeds are not captured by our data source. The proportion of bleeds recorded in HES that had a corresponding record in the CPRD in the subsequent 12 weeks was calculated, overall and stratified by bleed location.
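The 12-week matching step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' SAS implementation: the function names, the data layout and the toy records are hypothetical, the window is taken to include both the admission date and the date exactly 12 weeks later, and any CPRD bleed code is treated as a match regardless of bleed location.

```python
from datetime import date, timedelta

WINDOW = timedelta(weeks=12)  # follow-up window stated in the text


def has_corresponding_record(hes_date, cprd_dates, window=WINDOW):
    """Return True if any CPRD bleed date lies in [hes_date, hes_date + window].

    Inclusive boundaries are an assumption; the source does not specify them.
    """
    return any(hes_date <= d <= hes_date + window for d in cprd_dates)


def proportion_recorded(hes_bleeds, cprd_bleeds):
    """Share of HES bleeds with a corresponding CPRD bleed in the window.

    hes_bleeds:  list of (patient_id, admission_date) tuples
    cprd_bleeds: dict mapping patient_id -> list of CPRD bleed dates
    """
    if not hes_bleeds:
        raise ValueError("no HES bleeds supplied")
    matched = sum(
        has_corresponding_record(admitted, cprd_bleeds.get(pid, []))
        for pid, admitted in hes_bleeds
    )
    return matched / len(hes_bleeds)


# Toy records, invented purely for illustration (not study data)
hes = [
    ("p1", date(2010, 1, 4)),
    ("p2", date(2011, 6, 1)),
    ("p3", date(2012, 3, 10)),
]
cprd = {
    "p1": [date(2010, 1, 20)],  # 16 days after admission: counted
    "p2": [date(2011, 12, 1)],  # about 6 months after admission: not counted
}
share = proportion_recorded(hes, cprd)  # one of three toy bleeds is matched
```

Stratifying by bleed location would amount to calling `proportion_recorded` separately on the HES bleeds of each location.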
To identify factors associated with a HES bleed having a corresponding bleeding record coded in the CPRD in the subsequent 12 weeks, generalized estimating equations (GEE) binary regression analysis was performed. The GEE analysis used a binomial distribution, a logit-link and an exchangeable correlation structure to account for the inclusion of repeat bleeds per individual. Bleed characteristics considered in the analysis included OAC treatment at the time of the bleed, bleed type, calendar period and period of week of bleed occurrence (weekday vs weekend). A range of patient characteristics were also considered for inclusion in the model, including age, sex, deprivation (English Index of Multiple Deprivation),10 body mass index (BMI), stroke risk factors (history of stroke/TIA, systemic thromboembolism, congestive heart failure, vascular disease, hypertension, diabetes, CHA2DS2-VASc score), bleeding risk factors (bleeding history, liver disease, renal disease, modified HAS-BLED score) and concomitant medical treatment.
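For readers less familiar with GEE, the model described above can be written compactly as below. The notation is ours and is an assumption for illustration only; it does not appear in the original analysis plan.

```latex
\operatorname{logit}\,\Pr\left(Y_{ij} = 1 \mid \mathbf{x}_{ij}\right)
  = \beta_0 + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta},
\qquad
\operatorname{Corr}\left(Y_{ij}, Y_{ik}\right) = \alpha \quad (j \neq k)
```

Here $Y_{ij}$ indicates whether the $j$-th HES bleed of individual $i$ has a corresponding CPRD bleed in the subsequent 12 weeks, $\mathbf{x}_{ij}$ collects the bleed and patient characteristics listed above, and $\alpha$ is the single working correlation shared by all pairs of bleeds within the same individual (the exchangeable structure). Under this formulation, $\exp(\beta_k)$ is interpreted as the odds ratio for characteristic $k$.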
Impact of recording completeness on comparative safety of antithrombotic treatment
In order to further explore the impact that under-recording of HES bleeds in primary care data can have on comparative safety and effectiveness analyses, a comparative safety analysis was carried out using two different data sources: a linked CPRD-HES dataset (linked analysis) and a CPRD-only dataset (unlinked analysis). The analysis investigated the impact of using the different data sources on the relative hazard of subsequent bleeding across antithrombotic treatment strategies, within a population of individuals who had suffered a first bleed while using OACs.
For this analysis, the study population consisted of adults with a diagnosis of atrial fibrillation recorded in the CPRD or HES who had a clinically relevant bleed (index bleed) recorded in either data source between 1 January 2003 and 15 March 2012 which occurred while receiving prescriptions for an OAC. Patients were followed from index bleed until the earliest of either 15 March 2012, the date of leaving the database, or the date of death. Prescriptions for vitamin K antagonists (VKAs) or antiplatelet agents (APAs) issued following the first bleed were identified and used to stratify each individual’s follow-up time into one of three antithrombotic treatment groups: VKA treatment, APA treatment, no antithrombotic treatment. Gaps in treatment of up to 60 days between two prescriptions from the same treatment group were considered to constitute continuous treatment. Cox proportional hazards regression models were used to compare the hazard of subsequent bleeding events across treatment groups in each population, including treatment group as a time-varying covariate and controlling for the same patient and bleed characteristics outlined for the GEE analysis above. Hazard ratios are reported along with Wald 95% confidence intervals.
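The 60-day gap rule described above can be sketched as follows. This is a simplified illustration rather than the authors' implementation: it assumes gaps are measured between consecutive prescription issue dates within a treatment group, it ignores days of supply, and it does not resolve periods in which VKA and APA episodes overlap, none of which the text specifies. The function name and the toy prescriptions are hypothetical.

```python
from collections import defaultdict
from datetime import date

MAX_GAP_DAYS = 60  # permissible gap stated in the text


def build_episodes(prescriptions, max_gap_days=MAX_GAP_DAYS):
    """Collapse prescriptions into continuous treatment episodes.

    prescriptions: iterable of (issue_date, treatment_group) tuples
    Returns a list of (treatment_group, first_issue_date, last_issue_date),
    ordered by first issue date.
    """
    dates_by_group = defaultdict(list)
    for issued, group in prescriptions:
        dates_by_group[group].append(issued)

    episodes = []
    for group, dates in dates_by_group.items():
        dates.sort()
        start = end = dates[0]
        for issued in dates[1:]:
            if (issued - end).days <= max_gap_days:
                end = issued  # gap is short enough: extend the episode
            else:
                episodes.append((group, start, end))
                start = end = issued  # gap too long: open a new episode
        episodes.append((group, start, end))
    return sorted(episodes, key=lambda e: (e[1], e[0]))


# Toy prescriptions, invented purely for illustration (not study data)
rx = [
    (date(2010, 1, 1), "VKA"),
    (date(2010, 2, 15), "VKA"),  # 45 days after the previous VKA script
    (date(2010, 6, 1), "VKA"),   # 106 days after the previous VKA script
    (date(2010, 3, 1), "APA"),
]
episodes = build_episodes(rx)
```

Follow-up time falling outside every episode would correspond to the "no antithrombotic treatment" group, and the resulting intervals could then be supplied to a Cox model as a time-varying covariate.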
All analyses were carried out in SAS/STAT software (SAS Institute Inc., Cary, NC, USA).
A total of 14,361 bleeds recorded in HES were identified among patients with NVAF receiving OAC treatment between 2003 and 2016. The proportion of HES bleeds with a corresponding bleed recorded in the CPRD increased from 12.5% in the first week following the HES bleed to 19.6% after 12 weeks (Table 7). Similar results, stratified by the location of the bleed, are provided in Table 8. A greater proportion of respiratory, intraarticular and intracranial bleeds had a consistent bleed code recorded in the CPRD within 12 weeks (30.1%, 40.7% and 39.2%, respectively) compared to bleeds in other locations, including GI bleeds (13.5%) and intraspinal bleeds (11.6%).
Table 7 HES bleeds with a corresponding bleed recorded in the CPRD in the subsequent 12 weeks
Abbreviations: HES, hospital episode statistics; CPRD, Clinical Practice Research Datalink.
Patient characteristics in the linked and unlinked datasets are shown in Table 9. The results of the GEE regression model are provided in Table 10. Of the 14,361 bleeds recorded in HES, intracranial bleeds, bleeds resulting in weekend hospital admission, bleeds occurring longer ago, bleeds occurring during OAC treatment and bleeds occurring in individuals without a history of bleeding risk factors were more likely to have a corresponding bleed recorded in the CPRD in the 12 weeks after hospital admission.
After applying inclusion and exclusion criteria, 5,197 individuals were identified for inclusion in the Cox regression analyses using CPRD data only (Figure S1) and 7,063 individuals were identified for inclusion in the analysis using CPRD-HES linked data (Figure S2). On average, the population identified using linked CPRD-HES data was slightly older than the population identified using unlinked data only, and contained a greater proportion of females, individuals more recently diagnosed with NVAF, individuals with a history of stroke and bleeding risk factors and individuals with evidence of active cancer (Table 9). The index bleeds identified in the linked population occurred more recently and were more severe than those in the unlinked population, with a greater proportion of gastrointestinal and intracranial bleeds identified (Table 9).
Figure 1 shows the cumulative incidence of bleeding in the unlinked primary care data and the linked primary and secondary care dataset. Adjusting for statistically significant differences in the above characteristics across treatment groups within each population, we found that the hazards of subsequent bleeding associated with VKAs and APAs relative to no antithrombotic treatment were 12% and 6% higher, respectively, when using the unlinked primary care data (VKA HRadj 1.12 CI95 1.01–1.24; APA HRadj 1.06 CI95 0.95–1.20) and were 6% and 8% higher, respectively, when using the linked primary and secondary care dataset (VKA HRadj 1.06 CI95 0.96–1.16; APA HRadj 1.08 CI95 0.96–1.21).
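As a reading aid, the relationship between a hazard ratio and its Wald 95% confidence interval can be sketched as below. The standard error is not reported in the text; here it is back-calculated from the published interval for the unlinked VKA estimate, so it is an approximation based on rounded figures, and the function names are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_ci(log_hr, se, z=Z_95):
    """Wald interval for a hazard ratio: exp(log_hr -/+ z * se)."""
    return math.exp(log_hr - z * se), math.exp(log_hr + z * se)


def se_from_ci(lower, upper, z=Z_95):
    """Approximate standard error of log(HR) implied by a reported interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Reported unlinked-data VKA estimate: HRadj 1.12 (CI95 1.01-1.24)
se_vka = se_from_ci(1.01, 1.24)
ci_low, ci_high = wald_ci(math.log(1.12), se_vka)
```

Because the interval is symmetric on the log scale, rebuilding it around log(1.12) with the implied standard error recovers the published limits to two decimal places.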
Figure 1 Cumulative incidence of bleeding in the unlinked primary care data (A) and the linked primary and secondary care dataset (B).
This study found that the coding of hospital bleeds in the primary care record was incomplete, with less than 20% of individuals with an inpatient diagnosis for a bleed having a bleed coded in their primary care record in the subsequent 12 weeks. Moreover, differences with respect to key clinical and demographic characteristics were observed between patients identified from primary care vs linked data. While under-recording was found to be differential with regard to a number of factors, including OAC treatment, the incomplete recording of bleeds in primary care was not found to considerably bias estimates of the risk of bleeding associated with antithrombotic treatment.
The low proportion of secondary care bleeds having a corresponding bleed recorded in primary care indicates that as much as 80% of such bleeds could be excluded from a study which utilized primary care data only to identify bleeds. Using primary care data alone will therefore result in false-negative misclassification of exposure, outcome and/or covariate status. The impact of such misclassification is unpredictable and dependent on the study question. While our stratified and GEE analyses suggest that incompleteness varies by a range of factors including OAC treatment, calendar time and bleed location/type, our comparative safety analyses investigating the risk of subsequent bleeding associated with antithrombotic treatment illustrate that for certain study questions the impact on estimates of comparative safety or effectiveness may be small. Despite this, given the extent of under-recording and observed differences in patient characteristics, potential bias introduced through differential misclassification by these and other factors should be taken into consideration in interpreting the results of studies which have used primary care data only to identify bleeds11,12 and in the planning of future studies.
Of GP practices contributing to the CPRD, 57% are eligible for linkage with HES, and no individuals registered with Scottish, Welsh or Northern Irish practices are eligible.13 As a result, the use of a HES-linked CPRD dataset can have a considerable impact on the generalizability and sample size available for a given study. Given our observation that the impact of under-recording on relative measures of safety or effectiveness can be limited, the decision to use unlinked CPRD vs HES-linked CPRD data must be made on a study-specific basis, based on a comparison of the anticipated value that the HES data can add against the reduction in sample size and generalizability it entails. Based on the extent of under-recording of secondary care bleeding events in primary care data reported here, and the finding that the HR of subsequent bleeding for VKAs compared to no antithrombotic treatment was slightly higher when using unlinked CPRD data, we suggest that for studies in which bleeding is a key variable, HES-linked data be used, at a minimum to illustrate that findings in the HES-linked data are similar to those in the unlinked data.
Our finding that the odds of a HES bleed having a corresponding CPRD bleed have decreased over time (Table 10) is notable as it suggests that the quality of recording in primary care datasets has decreased over time. This is an interesting finding as it suggests recent efforts to improve and standardize the communication of discharge details between secondary and primary care (eDischarge summaries)2 have yet to make an impact. There is a possibility that the decrease in recording over time may represent a change in recording practices rather than a decrease in the quality of recording, as we used specific Read codes related to a bleed in the CPRD to assess consistency with HES data; however, there may have been other Read codes recorded that suggest a bleed occurred (eg, a code for a medical condition for which bleeding is a common symptom). A previous study investigating recording of upper gastrointestinal bleeds in the CPRD and HES included a range of “probable” and “possible” bleed Read codes and found supporting evidence for a much higher percentage of HES bleeds in the CPRD (66%).5 Further, in clinical practice, some Read codes may have “free text” information recorded against them confirming a bleed occurred. These “free text” data consist of unstandardized text which can be used to elaborate on the information contained in the Read code. Free text data are not currently made available for research purposes; however, they are available to individuals involved in the clinical care of patients. While the information contained in related Read codes and the free text may therefore confirm bleeds in some of the cases we have identified, given the magnitude of uncoded secondary care events it is likely that a clinically relevant proportion of individuals did not have their bleed recorded anywhere in their primary care record.
These findings are in line with those of a number of studies that have identified shortcomings in communication during transition of care between secondary and primary care and which have highlighted the safety issues that may result from them.14–21 From a research perspective, the unavailability of free text and non-specificity of the “possible” and “probable” codes included by Crooks et al5 mean that neither represents a feasible approach to identifying bleeding events in stand-alone primary care data, and the high proportion of unrecorded bleeds we report remains relevant.
The observation that the odds of a HES bleed having a corresponding CPRD bleed are higher for bleeds admitted at the weekend is of interest given the publicity surrounding so-called “weekend effects” in the UK, whereby individuals admitted to hospital at the weekend are more likely to have poor outcomes. It may be possible that admissions for bleeds at weekends are more likely to be recorded in the CPRD due to their association with poorer outcomes and therefore being more clinically relevant. Previous methodological work exploring the accuracy of HES data for exploring weekend effects has found that events recorded in HES data on weekdays are more likely to be prevalent events inappropriately recorded as incident events and that this may partly explain the better outcomes observed following these events.22 Our finding that HES bleeds admitted on weekdays are less likely to have a corresponding bleed record in the CPRD may therefore reflect the fact that a greater proportion of the weekday admissions are not being recorded by GPs as they are not truly incident bleeds.
Beyond the weekend effect, the potential for inaccurate recording of incident events in HES is an important consideration in interpreting our findings, as thus far we have considered HES to represent a “gold standard” for recording of secondary care events and any events not recorded in the CPRD to represent under-recording in primary care. Inaccuracy in HES coding has been reported previously for a number of event types; however, since the Payment by Results system was introduced in 2004 the average accuracy of coding has been reported to be 96.0% (interquartile range: 89.3–96.2%; P=0.020).23 Notably, this figure has been derived across a range of types of event and most of the studies contributing to this figure focused on the accuracy of ICD coding at the four digit ICD code level. This latter point is important as most of the bleeding ICD codes we have investigated would still have been captured as bleeds had they been miscoded at the four digit level but not at the three digit level. While some of the 80% of secondary care events not coded in the CPRD may therefore not have been true incident bleeds, we believe it is unlikely that these account for a substantial proportion. An additional limitation of our study is that it explores only the sensitivity of recording in primary care, but does not explore the specificity. In utilizing the CPRD to investigate bleeding events it is important that the potential for false positive classification of bleeds is given consideration.
A further limitation is that our descriptive analyses do not account for extended hospital stays and deaths. That is, 9% of individuals were not discharged from hospital within the 12 weeks following their index bleed. Such individuals may therefore have supporting evidence recorded later, upon discharge from hospital. Removing undischarged individuals from the denominator has a minimal impact on results, increasing the proportion with supporting evidence recorded to 21.5%. Among the 14,361 individuals with an index bleed, 16% died during the 12 week follow-up. While individuals who died during the 12 week follow-up do not have the same opportunity to have supporting evidence recorded, this is still notable from a methodological point of view as a study using primary care data may not capture bleeds presenting in secondary care and resulting in deaths within 12 weeks.
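The denominator adjustment described above can be checked with simple arithmetic. The sketch below assumes that none of the undischarged individuals had a corresponding CPRD record and that the 9% figure can be applied directly to the bleed-level proportion of 19.6%; neither assumption is stated in the text, and the function name is hypothetical.

```python
def proportion_after_exclusion(p_recorded, frac_excluded):
    """Proportion recorded once an excluded fraction leaves the denominator.

    Assumes nobody in the excluded fraction had a corresponding record,
    so the numerator is unchanged.
    """
    if not 0 <= frac_excluded < 1:
        raise ValueError("frac_excluded must be in [0, 1)")
    return p_recorded / (1 - frac_excluded)


# 19.6% recorded at 12 weeks; 9% not discharged within 12 weeks
adjusted = proportion_after_exclusion(0.196, 0.09)
```

With these rounded inputs the adjusted proportion is about 0.215, in line with the 21.5% reported above.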
Our results add to the evidence base suggesting secondary care events are not completely recorded in primary care records, and further that under-recording of bleeding events is differential with respect to a variety of factors, including treatment. While the impacts of under-recording on estimates of the comparative safety of antithrombotic drugs obtained from stand-alone primary care data were small, the extent of the under-recording suggests its potential impact should be considered, and ideally evaluated in future studies utilizing stand-alone primary care data.
SR and LM are full-time employees of Bristol-Myers Squibb, and SR is a shareholder of Bristol-Myers Squibb. CJS and MS are full-time employees of PHMR. PHMR received financial support for the work described in this manuscript from Bristol-Myers Squibb. The authors report no other conflicts of interest in this work.
Health and Social Care Information Centre. Attribution Data Set GP-Registered Populations Scaled to ONS Population Estimates – 2011; 2012. Available from: http://www.hscic.gov.uk/catalogue/PUB05054. Accessed January 25, 2018.
NHS England » Transfer of Care – eDischarge. Available from: https://www.england.nhs.uk/digitaltechnology/info-revolution/interoperability/transfer-of-care-edischarge/. Accessed November 19, 2017.
Vezyridis P, Timmons S. Evolution of primary care databases in UK: a scientometric analysis of research output. BMJ Open. 2016;6(10):e012785.
Williams R, Gallagher A, van Staa T, Hammad T, Leufkens B, de Vries F. Cancer recording in patients with type 2 diabetes in primary care and hospital admission data. Int J Popul Data Sci. 2017;1(1):314.
Crooks CJ, Card TR, West J. Defining upper gastrointestinal bleeding from linked primary and secondary care data and the effect on occurrence and 28 day mortality. BMC Health Serv Res. 2012;12(1):392.
Herrett E, Shah AD, Boggon R, et al. Completeness and diagnostic validity of recording acute myocardial infarction events in primary care, hospital care, disease registry, and national mortality records: cohort study. BMJ. 2013;346:f2350.
Baker R, Orton E, Tata LJ, Kendrick D. Measurement of the incidence of poisonings, fractures, and burns in children and young people with linked primary and secondary care data: a population-based cohort study. Lancet. 2014;384:S19.
Herrett E, Gallagher AM, Bhaskaran K, et al. Data Resource Profile: Clinical Practice Research Datalink (CPRD). Int J Epidemiol. 2015;44(3):827–836.
Herbert A, Wijlaars L, Zylbersztejn A, Cromwell D, Hardelid P. Data Resource Profile: Hospital Episode Statistics Admitted Patient Care (HES APC). Int J Epidemiol. 2017;46(4):1093–1093i.
The English Indices of Deprivation 2015 – Frequently Asked Questions (FAQs); 2016. Available from: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/579151/English_Indices_of_Deprivation_2015_-_Frequently_Asked_Questions_Dec_2016.pdf. Accessed May 18, 2018.
Hollowell J, Ruigómez A, Johansson S, Wallander MA, García-Rodríguez LA. The incidence of bleeding complications associated with warfarin treatment in general practice in the United Kingdom. Br J Gen Pract. 2003;53(489):312–314. Available from: http://pubmedcentralcanada.ca/pmcc/articles/PMC1314574/pdf/12879832.pdf. Accessed November 30, 2017.
Scowcroft AC, Lee S, Mant J. Thromboprophylaxis of elderly patients with AF in the UK: an analysis using the General Practice Research Database (GPRD) 2000-2009. Heart. 2013;99(2):127–132.
Data Access - CPRD Linked Data. Available from: https://www.cprd.com/dataAccess/linkeddata.asp. Accessed May 18, 2018.
van Walraven C, Taljaard M, Bell CM, et al. A prospective cohort study found that provider and information continuity was low after patient discharge from hospital. J Clin Epidemiol. 2010;63(9):1000–1010.
van Walraven C, Seth R, Austin PC, Laupacis A. Effect of discharge summary availability during post-discharge visits on hospital readmission. J Gen Intern Med. 2002;17(3):186–192.
Bench S, Cornish J, Xyrichis A. Intensive care discharge summaries for general practice staff: a focus group study. Br J Gen Pract. 2016;66(653):e904–e912.
Kripalani S, Lefevre F, Phillips CO, Williams MV, Basaviah P, Baker DW. Deficits in communication and information transfer between hospital-based and primary care physicians. JAMA. 2007;297(8):831.
Moore C, Mcginn T, Halm E. Tying up loose ends. Arch Intern Med. 2007;167(12):1305.
Cooper A, Edwards A, Williams H, et al. Sources of unsafe primary care for older adults: a mixed-methods analysis of patient safety incident reports. Age Ageing. 2017;46(5):833–839.
Bain A, Nettleship L, Kavanagh S, Babar ZU. Evaluating insulin information provided on discharge summaries in a secondary care hospital in the United Kingdom. J Pharm Policy Pract. 2017;10(1):25.
NHS England Patient Safety Domain. Review of National Reporting and Learning System (NRLS) Incident Data Relating to Discharge from Acute and Mental Health Trusts; 2014. Available from: https://www.england.nhs.uk/wp-content/uploads/2014/08/nrls-summary.pdf. Accessed November 21, 2017.
Li L, Rothwell PM; Oxford Vascular Study. Biases in detection of apparent “weekend effect” on outcome with administrative coding data: population based study of stroke. BMJ. 2016;353:i2648.
Burns EM, Rigby E, Mamidanna R, et al. Systematic review of discharge coding accuracy. J Public Health. 2012;34(1):138–148.
© 2018 The Author(s). This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms.php and incorporate the Creative Commons Attribution - Non Commercial (unported, v3.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.