Clinical Epidemiology, Volume 12

Validity of an Automated Algorithm to Identify Cirrhosis Using Electronic Health Records in Patients with Primary Biliary Cholangitis

Authors Lu M, Bowlus CL , Lindor K, Rodriguez-Watson CV , Romanelli RJ, Haller IV, Anderson H , VanWormer JJ, Boscarino JA , Schmidt MA, Daida YG , Sahota A, Vincent J, Li J, Trudeau S, Rupp LB, Gordon SC

Received 8 July 2020

Accepted for publication 23 September 2020

Published 10 November 2020 Volume 2020:12 Pages 1261–1267


Checked for plagiarism Yes

Review by Single anonymous peer review

Peer reviewer comments 2

Editor who approved publication: Dr Eyal Cohen

Mei Lu,1 Christopher L Bowlus,2 Keith Lindor,3 Carla V Rodriguez-Watson,4 Robert J Romanelli,5 Irina V Haller,6 Heather Anderson,7 Jeffrey J VanWormer,8 Joseph A Boscarino,9 Mark A Schmidt,10 Yihe G Daida,11 Amandeep Sahota,12 Jennifer Vincent,13 Jia Li,1 Sheri Trudeau,1 Loralee B Rupp,14 Stuart C Gordon15 On Behalf of the FOLD Investigators

1Department of Public Health Sciences, Henry Ford Health System, Detroit, MI, USA; 2University of California Davis School of Medicine, Sacramento, CA, USA; 3College of Health Solutions, Arizona State University, Phoenix, AZ, USA; 4Center for Health Research Kaiser Permanente Mid-Atlantic Research Institute, Rockville, MD; Reagan-Udall Foundation for the FDA, Washington, DC, USA; 5Palo Alto Medical Foundation Research Institute, Palo Alto, CA, USA; 6Essentia Institute of Rural Health, Essentia Health, Duluth, MN, USA; 7Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, CO, USA; 8Marshfield Clinic Research Foundation, Marshfield, WI, USA; 9Department of Population Health Sciences, Geisinger Clinic, Danville, PA, USA; 10Center for Health Research, Kaiser Permanente Northwest, Portland, OR, USA; 11Center for Integrated Health Care Research, Kaiser Permanente Hawai’i, Honolulu, HI, USA; 12Department of Research and Evaluation, Kaiser Permanente Southern California, Los Angeles, CA, USA; 13Baylor, Scott & White Research Institute, Temple, TX, USA; 14Center for Health Policy and Health Services Research, Henry Ford Health System, Detroit, MI, USA; 15Division of Gastroenterology and Hepatology, Henry Ford Health System; and Wayne State University School of Medicine, Detroit, MI, USA

Correspondence: Mei Lu
Department of Public Health Sciences, Henry Ford Health System, 3E One Ford Place, Detroit, MI 48202, USA
Tel +1 313 874 6413
Fax +1 313 874 6730
Email [email protected]

Background: Biopsy remains the gold standard for determining fibrosis stage in patients with primary biliary cholangitis (PBC), but it is unavailable for most patients. We used data from the 11 US health systems in the FibrOtic Liver Disease Consortium to explore a combination of biochemical markers and electronic health record (EHR)-based diagnosis/procedure codes (DPCs) to identify the presence of cirrhosis in PBC patients.
Methods: Histological fibrosis staging data were obtained from liver biopsies. Variables considered for the model included demographics (age, gender, race, ethnicity), total bilirubin, alkaline phosphatase, albumin, aspartate aminotransferase (AST) to platelet ratio index (APRI), Fibrosis 4 (FIB4) index, AST to alanine aminotransferase (ALT) ratio, and > 100 DPCs associated with cirrhosis/decompensated cirrhosis, categorized into ten clusters. Using least absolute shrinkage and selection operator regression (LASSO), we derived and validated cutoffs for identifying cirrhosis.
Results: Among 4328 PBC patients, 1350 (32%) had biopsy data; 121 (9%) were staged F4 (cirrhosis). DPC clusters (including codes related to cirrhosis and hepatocellular carcinoma diagnoses/procedures), Hispanic ethnicity, ALP, AST/ALT ratio, and total bilirubin were retained in the final model (AUROC=0.86 and 0.83 on learning and testing data, respectively); this model with two cutoffs divided patients into three categories (no cirrhosis, indeterminate, and cirrhosis) with specificities of 81.8% (for no cirrhosis) and 80.3% (for cirrhosis). A model excluding DPCs retained ALP, AST/ALT ratio, total bilirubin, Hispanic ethnicity, and gender (AUROC=0.81 and 0.78 on learning and testing data, respectively).
Conclusion: An algorithm using laboratory results and DPCs can categorize a majority of PBC patients as cirrhotic or noncirrhotic with high accuracy (with a small remaining group of patients’ cirrhosis status indeterminate). In the absence of biopsy data, this EHR-based model can be used to identify cirrhosis in cohorts of PBC patients for research and/or clinical follow-up.

Keywords: primary biliary cirrhosis, cholangitis, race/gender/ethnicity, gender, ethnicity, decompensated cirrhosis, ursodeoxycholic acid, UDCA


Although biopsy remains the gold standard for determining liver damage, fibrosis, and cirrhosis in patients with primary biliary cholangitis (PBC), it is invasive and performed on a relatively small subset of patients. Transient elastography has shown promise for use in PBC patients1,2 but has not been universally implemented in health care systems that are not supported by specialty gastroenterology and hepatology clinics. An efficient system to identify cirrhosis in PBC patients using data from electronic health records (EHR)—such as diagnosis and procedure codes (DPCs), and laboratory results—may inform epidemiologic research and clinical trials, as well as the identification of subgroups of PBC patients who could benefit from earlier intervention.

Biomarkers for liver fibrosis calculated from results of laboratory tests, such as the Aspartate Aminotransferase to Platelet Ratio Index (APRI) and Fibrosis-4 (FIB4), have been well described and validated among patients with viral hepatitis. However, the distinct etiology and natural history of PBC mean that the ability of these biomarkers to identify cirrhosis cannot be assumed, and there are currently no studies developing or validating PBC-specific cutoffs for cirrhosis. Likewise, elevated alkaline phosphatase (ALP), total bilirubin, and the ratio of aspartate aminotransferase to alanine aminotransferase (AST/ALT) are known to be important prognostic markers for PBC progression and response to treatment with ursodeoxycholic acid (UDCA).3,4 It is likely that the inclusion of these variables could increase the utility of any marker for cirrhosis among patients with PBC.

The FibrOtic Liver Disease Consortium (FOLD) comprises a cohort of more than 4000 PBC patients drawn from 11 US health systems. We applied machine learning techniques to develop and validate an automated algorithm combining EHR-based data—including DPCs and routine laboratory results—for the identification of cirrhosis among patients with PBC.


The FOLD Consortium has been previously described.3,5 Briefly, FOLD comprises 11 geographically diverse health systems, representing four US Census Bureau-defined regions of the US (Northeast, Midwest, West, and South). FOLD follows the guidelines of the US Department of Health and Human Services for the protection of human subjects. The study protocol was approved by the Institutional Review Board of each participating site (see Supplementary Table 1). All authors had access to the study results and reviewed and approved the final manuscript.

Cirrhosis Cohort Identification

FOLD PBC patient identification methods have been previously described.5 All cases were confirmed with chart abstraction performed by trained medical abstractors. We used EHR data to identify FOLD PBC patients who had undergone liver biopsy. Fibrosis staging from biopsy results was collected by abstraction from pathology reports, and mapped to an F0–F4 equivalency scale:6 F0, no fibrosis; F1, portal fibrosis without septa; F2, portal fibrosis with few septa; F3, numerous septa without cirrhosis; and F4, cirrhosis. FOLD hepatologists adjudicated indeterminate cases. If a patient had more than one biopsy, the biopsy with the highest fibrosis stage was used. The outcome of interest was F4 (cirrhosis) staging on biopsy.
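The outcome rule above (take the highest stage across a patient's biopsies, then flag F4) can be sketched in a few lines. This is an illustrative sketch only; the names `STAGE_RANK`, `highest_stage`, and `is_cirrhotic` are hypothetical and are not taken from the study's code.

```python
# Rank the F0-F4 equivalency scale so stages can be compared.
STAGE_RANK = {"F0": 0, "F1": 1, "F2": 2, "F3": 3, "F4": 4}

def highest_stage(biopsy_stages):
    """Return the highest fibrosis stage among a patient's biopsies."""
    return max(biopsy_stages, key=STAGE_RANK.__getitem__)

def is_cirrhotic(biopsy_stages):
    """Outcome of interest: the highest recorded stage is F4."""
    return highest_stage(biopsy_stages) == "F4"
```

For a patient with biopsies staged F2, F4, and F3, `is_cirrhotic` returns `True` because the highest stage is used.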

Possible Covariates/Classifiers

Table 1 details covariates considered for the model, including patient demographics (age, gender, race, Hispanic ethnicity); total bilirubin; ALP and albumin (classified in relation to “normal” as defined by the assay used at each site); APRI; FIB4 index; and AST/ALT ratio. We collected laboratory data and liver-related diagnosis and procedure codes (International Classification of Diseases, Ninth and Tenth editions [ICD9/10], and Current Procedural Terminology [CPT] codes), all recorded within six months before or after biopsy. In cases where more than one laboratory result was available, the one closest to the date of biopsy was used. ICD9/10 and CPT codes were grouped into ten clusters (C1 to C10, detailed in Table 2); these were used as dichotomized (presence/absence) variables in the classification analysis. An “unknown” category was used for all variables that had missing data.
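The covariate rules described above (a six-month window around biopsy, the closest laboratory result, dichotomized clusters, and an "unknown" category) might look like the following sketch. The 183-day approximation of six months and the function names are assumptions made for illustration, not details reported by the authors.

```python
from datetime import date

WINDOW_DAYS = 183  # assumption: "six months" approximated as 183 days

def closest_lab(biopsy_date, results):
    """results: iterable of (date, value) pairs.

    Return the value recorded nearest the biopsy date within the window,
    or "unknown" when no result falls inside the window.
    """
    in_window = [
        (abs((d - biopsy_date).days), value)
        for d, value in results
        if abs((d - biopsy_date).days) <= WINDOW_DAYS
    ]
    if not in_window:
        return "unknown"
    return min(in_window, key=lambda pair: pair[0])[1]

def cluster_present(biopsy_date, code_dates):
    """Dichotomized cluster variable: True if any code from the cluster
    was recorded within the window around the biopsy."""
    return any(abs((d - biopsy_date).days) <= WINDOW_DAYS for d in code_dates)
```

With a biopsy on 1 June 2012 and bilirubin results on 15 January 2012, 20 May 2012, and 1 March 2013, `closest_lab` picks the 20 May value and ignores the 2013 result, which lies outside the window.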

Table 1 Laboratory Data

Table 2 ICD-9/10 and CPT Codes Comprising the Ten Cluster Variables (C1–C10)

Statistical Analysis

Data were randomly divided into two sets at a 2:1 ratio; learning data (2/3) were used to build the classification model and testing data (1/3) were used for model validation. We built the model using several machine learning approaches, including Least Absolute Shrinkage and Selection Operator (LASSO) regression in SPM (Salford Predictive Modeler, version 8.0)7 and several machine-learning R packages,8 including Classification and Regression Tree (CART), K-Nearest-Neighbor (KNN), polynomial support vector machines (SVMs), neural networks, random forest models, and eXtreme Gradient Boosting trees (xgbTree).8–10
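A random 2:1 split of this kind can be sketched as follows. The study used SPM and R; this Python version, the fixed seed, and the name `split_two_to_one` are illustrative assumptions rather than the authors' procedure.

```python
import random

def split_two_to_one(patient_ids, seed=0):
    """Shuffle patient IDs and split them 2:1 into learning and testing sets.

    The seed is a hypothetical choice that makes the split reproducible.
    """
    shuffled = list(patient_ids)
    random.Random(seed).shuffle(shuffled)
    n_learn = round(len(shuffled) * 2 / 3)
    return shuffled[:n_learn], shuffled[n_learn:]
```

Applied to 1350 patient IDs, this yields 900 learning and 450 testing records with no overlap between the two sets.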

The model building process started with variable selection for the initial multivariable model. Highly correlated variables (eg, AST/ALT ratio, APRI, and FIB4) were fit into the model one at a time with other covariates to determine which would be selected. The same modeling approach was repeated using laboratory data without DPCs, given that FIB4 (a commonly used laboratory data-based biomarker) has been used to identify cirrhosis among patients with chronic hepatitis.6 The final modeling approach and multivariable model were selected for optimal classification accuracy, defined as the highest area under the receiver operating characteristic curve (AUROC) in the validation (testing) set. Models are considered to have “reasonable” accuracy when the AUROC is 0.70–0.80 and “good” accuracy when it is 0.80–0.90. We also identified optimal cut-off points to provide clinical utility in classifying patients as either cirrhotic or non-cirrhotic.
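For readers unfamiliar with the accuracy measures named above, the sketch below computes AUROC as the probability that a randomly chosen cirrhotic patient scores higher than a randomly chosen non-cirrhotic patient (ties count one half), plus sensitivity and specificity at a cutoff. It is a plain-Python illustration with hypothetical function names; treating a score strictly above the cutoff as a positive prediction is an assumption.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives
    (label 1) against negatives (label 0)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

def sensitivity_specificity(labels, scores, cutoff):
    """Predict cirrhosis when score > cutoff; return (sensitivity, specificity)."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s > cutoff)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s <= cutoff)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s <= cutoff)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s > cutoff)
    return tp / (tp + fn), tn / (tn + fp)
```

Raising the cutoff trades sensitivity for specificity, which is why a single cutoff and a pair of cutoffs can give different operating characteristics for the same model.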


Among 4328 confirmed PBC patients observed from 2006 to 2016, 1350 (32%) had biopsy data with F0–F4 staging; 121 (9%) were histologically staged F4 (cirrhosis). The median number of biopsies per patient was 1; 25th and 75th percentiles were 1 and 1 with a range of 1 to 7. Table 3 presents the two-group comparison for all covariates of interest.

Table 3 Two-Group Comparison for Covariates of Interest

The LASSO approach using three laboratory variables (ALP, total bilirubin, AST/ALT), gender, and ethnicity had “good” classification accuracy on learning data (AUROC=0.81) and “reasonable” accuracy on testing data (AUROC=0.78). The model equation is expressed as:

Lscore = −2.10444
− 0.0380115 [if ALP normal] − 0.10366 [if ALP 1 to <2×ULN] + 0.0703424 [if ALP 2 to <3×ULN] − 0.0859862 [if ALP ≥3×ULN]
− 0.0961179 [if male]
+ 0.0552183 [if non-Hispanic ethnicity] − 0.0557998 [if Hispanic ethnicity]
− 0.0553804 [if bilirubin ≤0.4] − 0.0701528 [if bilirubin >0.4 to 0.5] − 0.0239056 [if bilirubin >0.5 to 0.7] − 0.0881663 [if bilirubin >0.7 to 1.0] + 0.0658567 [if bilirubin >1.0 to 1.5] + 0.115739 [if bilirubin >1.5 to 2.0] + 0.136269 [if bilirubin >2.0]
− 0.0865529 [if AST/ALT <1.1] + 0.0898492 [if AST/ALT 1.1 to <2.2] + 0.144404 [if AST/ALT ≥2.2]

At the optimal cutoff of 0.1 in this model, sensitivity was 70% and specificity was 72% based on validation results.
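As a worked illustration of how the laboratory-only equation could be applied, the sketch below stores the published coefficients in a lookup table keyed by category. The category labels, the function name, and the assumption that the 0.1 cutoff is applied to the logistic probability 1/(1 + exp(−Lscore)) are ours; levels not listed (including "unknown") contribute zero.

```python
import math

INTERCEPT = -2.10444
# Coefficients copied from the laboratory-only equation; keys are hypothetical labels.
COEF = {
    "alp": {"normal": -0.0380115, "1-<2xULN": -0.10366,
            "2-<3xULN": 0.0703424, ">=3xULN": -0.0859862},
    "sex": {"male": -0.0961179},
    "ethnicity": {"non-Hispanic": 0.0552183, "Hispanic": -0.0557998},
    "bilirubin": {"<=0.4": -0.0553804, ">0.4-0.5": -0.0701528,
                  ">0.5-0.7": -0.0239056, ">0.7-1.0": -0.0881663,
                  ">1.0-1.5": 0.0658567, ">1.5-2.0": 0.115739,
                  ">2.0": 0.136269},
    "ast_alt": {"<1.1": -0.0865529, "1.1-<2.2": 0.0898492, ">=2.2": 0.144404},
}

def lab_only_probability(categories):
    """categories: dict mapping variable name to its category label."""
    score = INTERCEPT
    for variable, level in categories.items():
        score += COEF.get(variable, {}).get(level, 0.0)
    return 1.0 / (1.0 + math.exp(-score))
```

A female, non-Hispanic patient with ALP 2 to <3×ULN, bilirubin >2.0, and AST/ALT ≥2.2 has a probability of about 0.15, above the 0.1 cutoff; a male, Hispanic patient with ALP 1 to <2×ULN, bilirubin >0.7 to 1.0, and AST/ALT <1.1 has a probability of about 0.07, below it.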

A LASSO model with three laboratory variables (ALP, total bilirubin, AST/ALT), two DPC clusters for diagnosis of hepatocellular carcinoma and cirrhosis, and ethnicity (Hispanic yes/no) demonstrated the best performance; this model reached “good” classification accuracy, with AUROCs of 0.86 on learning data and 0.83 on testing data. This model combining laboratory and DPC data had significantly better predictive ability (AUROC) compared to the model using laboratory data without DPCs (p=0.001). The equation for this final LASSO model is expressed as:

Lscore = −2.80400
+ 0.303777 [if ALP 2 to <3×ULN]
− 1.11856 [if Hispanic ethnicity]
− 0.325175 [if bilirubin ≤0.4 mg/dL] + 0.28772 [if bilirubin >0.5 to 0.7 mg/dL] + 0.512881 [if bilirubin >1.0 to 1.5 mg/dL] + 0.9406 [if bilirubin >1.5 to 2.0 mg/dL] + 0.756801 [if bilirubin >2.0 mg/dL]
− 0.652222 [if AST/ALT <1.1] + 0.645455 [if AST/ALT 1.1 to <2.2] + 0.681193 [if AST/ALT ≥2.2]
+ 0.707349 [if diagnosis of hepatocellular carcinoma]
+ 1.3713 [if two diagnoses of cirrhosis]

A single cut-off of 0.08 (applied to the probability derived from the formula Prob = 1.0/(1.0 + exp(−Lscore))) yielded sensitivity of 76% and specificity of 75% based on validation results. Two optimal cut-offs (0.07 and 0.10) divided patients into three groups: non-cirrhotic (≤0.07), indeterminate (>0.07 to ≤0.10), and cirrhotic (>0.10). These yielded improved specificities of 81.8% for absence of cirrhosis and 80.3% for presence of cirrhosis.
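The final model and its two cutoffs can be sketched from raw values as follows. The coefficients and cutoffs are those reported above; the function names, argument names, and the treatment of missing ("unknown") values as contributing zero are illustrative assumptions, not the authors' implementation.

```python
import math

def lscore(alp_x_uln, hispanic, bilirubin, ast_alt, hcc_dx, two_cirrhosis_dx):
    """Final-model Lscore. alp_x_uln is ALP as a multiple of the upper limit
    of normal; bilirubin is in mg/dL; None means the value is unknown."""
    score = -2.80400
    if alp_x_uln is not None and 2 <= alp_x_uln < 3:
        score += 0.303777
    if hispanic:
        score -= 1.11856
    if bilirubin is not None:
        if bilirubin <= 0.4:
            score -= 0.325175
        elif 0.5 < bilirubin <= 0.7:
            score += 0.28772
        elif 1.0 < bilirubin <= 1.5:
            score += 0.512881
        elif 1.5 < bilirubin <= 2.0:
            score += 0.9406
        elif bilirubin > 2.0:
            score += 0.756801
    if ast_alt is not None:
        if ast_alt < 1.1:
            score -= 0.652222
        elif ast_alt < 2.2:
            score += 0.645455
        else:
            score += 0.681193
    if hcc_dx:
        score += 0.707349
    if two_cirrhosis_dx:
        score += 1.3713
    return score

def classify(score, low=0.07, high=0.10):
    """Convert Lscore to a probability and apply the two cutoffs."""
    prob = 1.0 / (1.0 + math.exp(-score))
    if prob <= low:
        return "non-cirrhotic"
    if prob > high:
        return "cirrhotic"
    return "indeterminate"
```

For example, a non-Hispanic patient with bilirubin 1.8 mg/dL, AST/ALT 2.5, and two cirrhosis diagnoses has Lscore ≈ 0.19 (probability ≈ 0.55) and is classified as cirrhotic, while a patient with bilirubin 0.3 mg/dL, AST/ALT 0.9, and no relevant codes has probability ≈ 0.02 and is classified as non-cirrhotic.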

Other modeling approaches using the same covariates reached similar or lower classification accuracy (Supplementary Table 2). Performance of the xgbTree model was similar to that of the LASSO model (AUROC=0.82 on testing data) but required ten variables (age, gender, ethnicity, albumin, ALP, AST/ALT ratio, total bilirubin, platelet count, and DPCs related to hepatocellular carcinoma and cirrhosis), making this model less practical in “real world” settings.


Using data drawn from the FOLD consortium, we applied machine learning methods to EHR-based laboratory results and DPCs to develop and validate a method for identifying cirrhosis among patients with PBC. Our previous work has shown that cirrhosis is an important prognostic marker for poor outcomes among patients with PBC.3,5 However, in these analyses, cirrhosis identification was based on a limited number of patients with biopsy data (32%). Transient elastography has gradually begun to replace biopsy, but has not yet been universally implemented, especially in health systems without specialty hepatology clinics; only 12% of patients in our real-world cohort had elastography data available. Our EHR-based model could help address that gap in the identification of PBC patients with cirrhosis. The classification accuracy of our model using both laboratory data and DPCs was “good” (AUROC=0.83 on testing data) and was significantly better than an alternate model using laboratory results without DPCs. The combined model with two optimal cutoffs (0.07 and 0.10) divided patients into three groups (cirrhotic, non-cirrhotic, and a small indeterminate group [<7%]); this model yielded 81.8% specificity for the absence of cirrhosis and 80.3% specificity for the presence of cirrhosis.

We believe this is the first validated model for use of EHR-based data for cirrhosis identification among PBC patients. Although previously developed markers for cirrhosis, such as APRI (which combines AST and platelet count) and FIB4 (which combines age, AST, ALT, and platelet count), have been validated in populations with viral hepatitis, it is not clear if they are optimized for use in patients with PBC. In a model replacing AST/ALT ratio with FIB4, classification accuracy was moderate (AUROC=0.75 on testing data). Our analysis found that a combination of total bilirubin, ALP, and AST/ALT ratio, rather than APRI, FIB4, or the individual components of those markers, provided better accuracy (AUROC=0.83 on testing data). In light of our recent study showing that total bilirubin, ALP, and AST/ALT ratio were independent risk factors for all-cause mortality in patients with PBC,4 our current findings suggest that these variables may be the most appropriate biomarkers for cirrhosis and poor outcomes.
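For reference, the two comparison markers can be computed as in the sketch below. The formulas are the commonly published definitions of APRI and FIB-4 rather than anything specified in this article, and the function and argument names are hypothetical; AST and ALT are in U/L and platelet count is in 10^9/L.

```python
import math

def apri(ast, ast_uln, platelets):
    """AST to Platelet Ratio Index: (AST / upper limit of normal AST)
    divided by platelet count (10^9/L), times 100."""
    return (ast / ast_uln) / platelets * 100

def fib4(age_years, ast, alt, platelets):
    """Fibrosis-4 index: (age x AST) / (platelet count x sqrt(ALT))."""
    return (age_years * ast) / (platelets * math.sqrt(alt))
```

Both indices depend on platelet count, and FIB-4 also depends on age, which is one way they differ from the ALP, bilirubin, and AST/ALT combination retained in the model.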

One limitation of our analysis is that, although the overall model classification accuracy (AUROC) reached 0.83, sensitivity and specificity remained only moderate (75–76%) with the use of a single cut-off (0.08). We addressed this issue with the use of two cutoffs (0.07 and 0.10), which improved specificity to >80% for determining the absence of cirrhosis and presence of cirrhosis, and left only 6.8% of patients classified as “indeterminate.” While the use of more extreme cutoffs (eg, 0.05 and 0.16) could yield specificity in the range of 85–88%, it would classify more patients (28%) as indeterminate. Limitations to this method can be further addressed by using a hierarchical approach for cirrhosis identification: 1) cirrhosis determination based on biopsy or transient elastography when available; 2) use of our model with two cutoffs. Analyses can categorize those patients who fall into the “indeterminate” group as “unknown.” We have successfully implemented a similar approach for cirrhosis identification in patients with viral hepatitis.11,12 An additional unavoidable limitation of classification models is that they are most accurate when applied to a sample with patient characteristics similar to those used to build the model. Accordingly, this model will need to be validated using external data from a similar patient population.
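The hierarchical approach described above could be expressed roughly as follows. This is a sketch under stated assumptions: the function and argument names are hypothetical, biopsy is checked before elastography (the text lists them together as one step), and the elastography result is passed in as an already-determined yes/no because no elastography threshold is given here.

```python
def cirrhosis_status(biopsy_stage=None, elastography_cirrhosis=None,
                     model_prob=None, low=0.07, high=0.10):
    """Return 'cirrhotic', 'non-cirrhotic', or 'unknown'."""
    # Step 1: direct determination from biopsy or transient elastography.
    if biopsy_stage is not None:
        return "cirrhotic" if biopsy_stage == "F4" else "non-cirrhotic"
    if elastography_cirrhosis is not None:
        return "cirrhotic" if elastography_cirrhosis else "non-cirrhotic"
    # Step 2: fall back to the model probability with two cutoffs.
    if model_prob is not None:
        if model_prob <= low:
            return "non-cirrhotic"
        if model_prob > high:
            return "cirrhotic"
    # Indeterminate model result, or no data at all.
    return "unknown"
```

A patient with no biopsy or elastography and a model probability of 0.08 falls between the two cutoffs and is therefore returned as "unknown".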

In conclusion, our study showed that a model using EHR-based data can be used to efficiently identify PBC patients with cirrhosis. Using a hierarchical approach that also takes into consideration cirrhosis determination via biopsy/transient elastography data, when such data are available, we expect that this model will be useful for research in patients with PBC, and could serve as a quality improvement tool to ensure the best available care for such patients. Our model may also be useful in the identification of risk factors for decompensation in large observational studies of patients with PBC. Interventions exist that reduce the risk of progression from compensated to decompensated cirrhosis and the poor outcomes that follow decompensation; this tool may help clinicians identify and monitor such patients.

Author Contributions

All authors made substantial contributions to conception and design, acquisition of data, or analysis and interpretation of data; took part in drafting the article or revising it critically for important intellectual content; agreed to submit to the current journal; gave final approval of the version to be published; and agree to be accountable for all aspects of the work.


The FOLD Consortium has previously received funding from Intercept Pharmaceuticals Inc.


Stuart C. Gordon receives grant/research support from AbbVie Pharmaceuticals, Conatus, CymaBay, Eiger Pharmaceuticals, Eli Lilly, Genfit, Gilead Sciences, GlaxoSmithKline, Intercept Pharmaceuticals, Merck, and Viking Therapeutics. Mei Lu, Joseph A. Boscarino, Mark A. Schmidt, Yihe G. Daida, Jia Li, Loralee B. Rupp, and Sheri Trudeau receive research grant support from Gilead Sciences and Intercept Pharmaceuticals. Carla V. Rodriguez-Watson owns stock in Gilead (<$5000). Heather Anderson receives grant/research support from Intercept Pharmaceuticals. Jeffrey J. VanWormer receives grant/research support from Retrophin. Christopher L. Bowlus receives grant/research support from AbbVie Pharmaceuticals, Bristol-Myers-Squibb, CymaBay, Gilead Biosciences, GlaxoSmithKline, Intercept Pharmaceuticals, Merck, Mirum, Shire Pharmaceuticals, Takeda Pharmaceuticals, TARGET Pharmasolutions, and has served as an advisor for Bristol-Myers-Squibb, Gilead Biosciences, Intercept Pharmaceuticals, and Takeda. Keith Lindor is a consultant/advisor for Biopharma and has served as an ad hoc advisor for HighTide, Takeda, Shire, and Intercept Pharmaceuticals. He sits on a Data Safety Monitoring Board for Takeda. Robert J. Romanelli has received grant/research support from Pfizer Inc. and Janssen Scientific Affairs. The authors report no other conflicts of interest in this work.


1. Wong VW-S, Chan HL-Y. Transient elastography. J Gastroenterol Hepatol. 2010;25(11):1726–1731. doi:10.1111/j.1440-1746.2010.06437.x

2. Joshita S, Yamashita Y, Sugiura A, et al. Clinical utility of FibroScan as a non-invasive diagnostic test for primary biliary cholangitis. J Gastroenterol Hepatol. 2019.

3. Lu M, Zhou Y, Haller IV, et al. Increasing prevalence of primary biliary cholangitis and reduced mortality with treatment. Clin Gastroenterol Hepatol. 2018;16(8):1342–1350. doi:10.1016/j.cgh.2017.12.033

4. Gordon SC, Wu KH, Lindor K, et al. Ursodeoxycholic acid treatment preferentially improves overall survival among African Americans with primary biliary cholangitis. Am J Gastroenterol. 2020;115(2):262–270. doi:10.14309/ajg.0000000000000512

5. Lu M, Li J, Haller IV, et al. Factors associated with prevalence and treatment of primary biliary cholangitis in United States health systems. Clin Gastroenterol Hepatol. 2018;16(8):1333–1341. doi:10.1016/j.cgh.2017.10.018

6. Li J, Gordon SC, Rupp LB, et al. The validity of serum markers for fibrosis staging in chronic hepatitis B and C. J Viral Hepat. 2014;21(12):930–937. doi:10.1111/jvh.12224

7. CART 6.0 User’s Guide Salford Systems [computer program]. 2010.

8. Hothorn T. CRAN task view: machine learning & statistical learning. 2020. Available from: Accessed October 1, 2020.

9. Friedman J, Hastie T, Tibshirani R. Additive logistic regression: a statistical view of boosting (With discussion and a rejoinder by the authors). Ann Statist. 2000;28(2):337–407. doi:10.1214/aos/1016218223

10. Friedman JH. Greedy function approximation: A gradient boosting machine. Ann Statist. 2001;29(5):1189–1232. doi:10.1214/aos/1013203451

11. Li J, Zhang T, Gordon SC. Does Hepatitis C eradication lead to improved glucose metabolism, renal and cardiovascular outcomes in diabetic patients? American Association for the Study of Liver Diseases (AASLD) 2017 Annual Meeting. 2017:ID: 981.

12. Lu M, Wu KH, Li J, et al. Adjuvant ribavirin and longer direct-acting antiviral treatment duration improve sustained virological response among hepatitis C patients at risk of treatment failure. J Viral Hepat. 2019;26(10):1210–1217. doi:10.1111/jvh.13162

Creative Commons License © 2020 The Author(s). This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at and incorporate the Creative Commons Attribution - Non Commercial (unported, v3.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.