Clinical Epidemiology, Volume 17

Refined Algorithm for Identifying Recurrence Among Patients with Non-Metastatic Colorectal Cancer Based on Danish National Health Data Registries

Authors Gögenur M, Bräuner KB, Löffler L, Olsen ASF, Gundestrup AK, Jakobsen PCH, Kleif J, Bertelsen CA, Gögenur I

Received 30 April 2025

Accepted for publication 27 November 2025

Published 11 December 2025 Volume 2025:17 Pages 1075–1086

DOI https://doi.org/10.2147/CLEP.S532957

Checked for plagiarism Yes

Review by Single anonymous peer review


Editor who approved publication: Dr Thomas Ahern



Mikail Gögenur,1 Karoline Bendix Bräuner,1 Lea Löffler,1 Anna Sofie Friis Olsen,2 Anders Kierkegaard Gundestrup,2 Peter Cornelius Helbo Jakobsen,2 Jakob Kleif,2,3 Claus Anders Bertelsen,2,3 Ismail Gögenur1,3

1Center for Surgical Science, Department of Surgery, Zealand University Hospital, Koge, Denmark; 2Department of Surgery, Copenhagen University Hospital, North Zealand, Denmark; 3Institute for Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark

Correspondence: Mikail Gögenur

Purpose: In the Danish and other national health registries, colorectal cancer (CRC) recurrence is not routinely registered. Algorithms to label patients with recurrence in Denmark exist but produce cohorts with a risk of selection bias due to either pre- or postoperative exclusion criteria. In this study, we aimed to refine and increase the generalizability of an existing registry-based algorithm.
Patients and Methods: Data from 5077 patients from an institutional database and a regional database, encompassing several departments of surgery in Denmark, were retrieved. Patients with non-metastatic CRC were included from 2008 to 2019. Electronic health journal-based recurrence registration was used as reference for the algorithm. Patients were linked with data from the Danish Colorectal Cancer Group database, the Danish National Health Registry, the Danish Cancer Registry, and the Danish Pathology Registry. The algorithm utilized metastasis, chemotherapy, pathology, and local recurrence codes. Refinement of the algorithm comprised adding targeted and radiation therapy codes, including patients who died within 180 days after surgery, revising the pathology codes, and removing all preoperative exclusion criteria. Performance metrics were evaluated in 10,000 bootstrapped runs, while all-stage and stage-specific cumulative incidence of recurrence and overall survival were estimated.
Results: The refined algorithm included more patients than the conventional algorithm (4388 vs 3684) and performed marginally better in terms of sensitivity (0.92 (95% CI 0.89–0.94) vs 0.90 (95% CI 0.87–0.92)) and specificity (0.97 (95% CI 0.97–0.98) vs 0.96 (95% CI 0.95–0.96)). A significant difference in cumulative incidence of recurrence for UICC stage I was detected between the conventional algorithm and the reference; this difference was not significant when using the refined algorithm.
Conclusion: The refined algorithm improves identification of CRC recurrence in national data, enabling broader inclusion and better representation of population subgroups.

Keywords: colorectal cancer, recurrence, registry-based algorithm

Introduction

Colorectal cancer (CRC) remains the second leading cause of cancer-related death worldwide, with up to a third of patients experiencing a recurrence of their disease after curative-intent treatment.1,2 Recurrences are the leading cause of cancer-specific death in CRC.

Despite the importance of recurrence for patients’ trajectories, recurrence is not registered systematically in Denmark or in other countries. Efforts have been made to develop and validate algorithms based on national health or claims data.3–5 In Denmark, Lash and coworkers developed the Lash algorithm in 2015; it uses Danish national health data registries to determine the incidence of recurrence after curative-intent surgery for non-metastatic CRC.6 A recent update and validation of the Lash algorithm showed good performance in a contemporary cohort of 522 patients with non-metastatic CRC, with a sensitivity of 94% and specificity of 99%.7 However, the Lash algorithm excludes patients with previous cancers and those who die within six months after surgery. These exclusions may have been chosen because the algorithm uses codes that are not specific for CRC, which could otherwise lead to misclassification of patients as having a recurrence. Excluding these patients nevertheless limits the generalizability of the algorithm, as these steps can lead to the exclusion of up to 25% of the general CRC patient population.2

This study aimed to investigate if fewer exclusion criteria and an update of the codes utilized by the Lash algorithm would result in similar performance metrics compared to the Lash algorithm while at the same time reducing the number of excluded patients.

Methods

Data Sources and Data Collection

A total of 713 patients were available from the database of the Department of Surgery, Zealand University Hospital, covering January 2015 to August 2019, and 4364 patients were available from the Copenhagen cOmplete Mesocolic Excision Study (COMES) database of the Department of Surgery, Copenhagen University Hospital–North Zealand, covering June 2008 to December 2017. The database from Zealand University Hospital included patients from the Departments of Surgery at Zealand University Hospital and Slagelse Hospital, which together provide CRC treatment for the 840,000 inhabitants of Region Zealand. As described previously, the COMES database is a regional database covering the four university clinics providing CRC treatment for the 1.9 million inhabitants of the Capital Region of Denmark.8,9 Using a unique patient identifier, patients from these databases were linked with data from the Danish Colorectal Cancer Group (DCCG) database,10 the Danish National Registry of Patients (DNPR),11 the Danish Cancer Registry (DCR),12 and the Danish National Pathology Registry (DPR).13 DCCG is a national quality assurance database for CRC treatment in Denmark, with information on patient demographics, cancer, treatment, and postoperative complications. The DNPR contains data on treatments and procedures from all Danish hospitals since 1977. The DCR contains data on the incidence of cancer since 1943. The DPR contains information on nationwide pathology specimens since 1997. Data from the DNPR, DCR, and DPR were used to determine the incidence of recurrence, while DCCG provided patient characteristics.

Cohort

Patients from both databases with representation in the DCCG database, who were above 18 years old, were included in the study cohort. The conventional approach to determine recurrences was termed the “Lash-rule algorithm”, and the updated version was termed the “Center for Surgical Science (CSS)-rule algorithm”.

For the Lash-rule algorithm, patients with any preoperative metastasis, previous CRC, or cancers other than CRC were excluded, as described elsewhere.7 Patients with deaths, metastasis codes, or new primary cancers other than CRC within 180 days after surgery were also excluded.

For the CSS-rule algorithm, no preoperative exclusion criteria were applied, and patients who died within six months after surgery were included. Patients with metastasis codes or a new primary cancer other than CRC within 180 days after surgery were excluded.

Recurrence Detection by Algorithm

The Lash-rule algorithm has been described in detail elsewhere.6,7 Briefly, it considers the following combinations of registry codes as indicating recurrence:

  1. Metastases codes (ICD-10, DC76-DC80) registered in either DNPR or DCR 180 or more days after surgery
  2. Chemotherapy codes (BWHA1-2, BOHJ17, or BOHJ19B1)14 registered in DNPR 180 or more days after surgery and 60 days after the last chemotherapy code, and registered by an oncological department
  3. Pathology: The following combinations of SNOMED codes registered in DPR:15
    1. Colon or rectum-related T-codes (T6491, T65900, T65901, T65925, T65926, T660, T67 or T68) in combination with the morphology codes M8 or M9 with three or more in the fifth position (ie M9XXX3)
    2. Any T-code with the morphology codes M8 or M9 with either 4, 6, or 7 at the fifth position
    3. Liver related T-code T56 in combination with M81403
  4. Local colon or rectum recurrence codes (DC189X or DC209X) registered in DNPR

For all the above steps, recurrence was only considered if the codes appeared before a new primary cancer other than CRC.

For the CSS-rule algorithm, the following combinations were considered indicating recurrence:

  1. Metastases codes (ICD-10, DC76-DC80) registered in either DNPR or DCR 180 or more days after surgery
  2. Palliation codes (ICD-10 DS515 or DS515S) registered in DNPR 180 or more days after surgery
  3. Chemotherapy, targeted treatment, and radiation codes (BWHA1-2, BOHJ17, BOHJ19B1, BOHJ19D1, BOHJ19J3, BWGC2, BWGC23, BWGC4 or BWGC4A) registered in DNPR 180 or more days after surgery, and 60 days after the last chemotherapy code, and registered by an oncological department. BOHJ19B1, BOHJ19D1, BOHJ19H2, BOHJ19J3, and BOHJ19C2 only needed to be registered 180 or more days after surgery to indicate recurrence
  4. Pathology: The following combinations of SNOMED codes registered in DPR, provided that none of the SNOMED codes P30755 (indicating re-analysis of previously obtained tissue), ÆF4998, or ÆF4999 (both indicating that the origin of the cancer is unknown) was registered on the same date:
    1. Colon or rectum-related T-codes (T6491, T65900, T65901, T65925, T65926, T67, or T68) in combination with the morphology codes M8 or M9 with four or more in the fifth position (ie M9XXX4)
    2. Any T-code with the morphology codes M8 or M9 with either 4, 6, or 7 at the fifth position
    3. Liver related T-code T56 in combination with M81403
    4. Recurrence or local recurrence-related SNOMED codes in DPR (ÆYYY07 or ÆYYY17)
  5. Local colon or rectum recurrence codes (DC189X or DC209X) registered in DNPR

For all the above steps, recurrence was only considered if the codes appeared before a new primary cancer other than CRC.

The CSS-rule algorithm recurrence codes thus differed from those of the Lash-rule algorithm in the following ways: palliation codes and additional targeted treatment and radiation codes were added; samples carrying the mentioned P and Æ SNOMED codes were disregarded; the morphology codes M8XXX3 and M9XXX3 in the first pathology combination (step 3.i of the Lash-rule algorithm) and the T660 codes were disregarded; and recurrence or local recurrence-related SNOMED codes were added (step 4.iv of the CSS-rule algorithm).
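The pathology and metastasis rules listed above can be sketched in code. The following is a minimal illustration only, not the authors' implementation: the function names, the prefix matching of T-codes and ICD-10 codes, and the record layout are assumptions made for this example. Setting `min_digit=3` reproduces only the Lash-rule morphology threshold, not its full code list.

```python
from datetime import date

# Colon/rectum topography codes from the CSS-rule list (prefix matching is an assumption).
COLORECTAL_T_PREFIXES = ("T6491", "T65900", "T65901", "T65925", "T65926", "T67", "T68")
# SNOMED codes that void a pathology record when registered on the same date.
VOIDING_CODES = {"P30755", "ÆF4998", "ÆF4999"}
# ICD-10 metastasis range DC76-DC80 (prefix matching is an assumption).
METASTASIS_PREFIXES = ("DC76", "DC77", "DC78", "DC79", "DC80")


def fifth_position_digit(m_code):
    """Return the digit in the fifth position after 'M' for M8/M9 codes, else None."""
    if len(m_code) == 6 and m_code[0] == "M" and m_code[1] in "89" and m_code[1:].isdigit():
        return int(m_code[5])
    return None


def pathology_hit(t_code, m_code, same_day_codes=frozenset(), min_digit=4):
    """Hypothetical check of the three T-code/M-code pathology combinations."""
    if VOIDING_CODES & set(same_day_codes):
        return False  # re-analysis or unknown-origin code on the same date
    digit = fifth_position_digit(m_code)
    if digit is None:
        return False
    if t_code.startswith(COLORECTAL_T_PREFIXES) and digit >= min_digit:
        return True  # colorectal T-code with a qualifying morphology code
    if digit in (4, 6, 7):
        return True  # any T-code with 4, 6, or 7 in the fifth position
    return t_code.startswith("T56") and m_code == "M81403"  # liver T-code with M81403


def late_metastasis_hit(icd_code, surgery_date, code_date, min_days=180):
    """True if a metastasis code is registered min_days or more after surgery."""
    return icd_code.startswith(METASTASIS_PREFIXES) and (code_date - surgery_date).days >= min_days


print(pathology_hit("T67000", "M81403"))               # colorectal T-code, digit 3
print(pathology_hit("T67000", "M81403", min_digit=3))  # same record, Lash-rule threshold
print(late_metastasis_hit("DC787", date(2015, 1, 1), date(2015, 6, 30)))
```

In this sketch, a colorectal sample with morphology code M81403 counts as a recurrence signal only under the lower threshold, which is the kind of record the refinement was meant to stop flagging.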

Electronic Health Journal-Based Recurrence Detection

A recurrence was recorded if the electronic health record noted either local recurrence or metastases to distant organs, including peritoneal carcinomatosis. The date for recurrence was based on the recurrence diagnosis noted in either a multidisciplinary team meeting, radiological assessment, pathological assessment, or patient record. Patients were followed up for recurrence until May 2020. Recurrences registered in electronic health journals were used as reference, and recurrence registration was done separately by surgeons linked to the different institutions.

Statistical Analysis

Patients detected by the Lash-rule algorithm were included in the Lash-rule cohort, while patients detected by the CSS-rule algorithm were included in the CSS-rule cohort. The Lash-rule and CSS-rule cohorts were compared in terms of the continuous variable age (presented as median and range) and the categorical variables (presented as absolute number and percentage) gender, cancer type, UICC stage, ASA, WHO performance score, Charlson comorbidity index (CCI),16 surgical and medical complications, death, recurrence reference, and recurrence determined via algorithm. Variables besides recurrence information were derived from DCCG.

The Lash-rule and CSS-rule algorithms were evaluated with kappa statistics, true negatives, false negatives, true positives, false positives, sensitivity, specificity, accuracy, positive predictive value, and negative predictive value, with recurrence from the electronic health journal as reference. To investigate the robustness of the algorithms, these performance metrics were calculated from contingency tables from 10,000 bootstrapped runs, sampled with replacement and with a size similar to the study cohort. Sensitivity, specificity, and accuracy were presented visually in box plots with mean and interquartile range (IQR), while the remaining metrics were presented with mean and 95% CI in tabular format. Performance metrics were similarly evaluated in two separate sets of 10,000 bootstrapped runs. In the first, the study cohort was split, stratified for recurrence, to investigate whether performance would differ across splits. In the second, the data were split according to database (CSS, ie the Zealand University Hospital database, or COMES) to investigate whether performance differed across databases.
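As an illustration of the bootstrap evaluation described above, the sketch below resamples paired reference and algorithm labels with replacement and summarizes sensitivity or specificity across runs. The analysis in this study was performed in R; the function names here are hypothetical, and the percentile interval is an assumption, since the text does not state how the 95% CI was derived.

```python
import random


def confusion_counts(reference, predicted):
    """Count true positives, false positives, false negatives, and true negatives."""
    tp = sum(1 for r, p in zip(reference, predicted) if r and p)
    fp = sum(1 for r, p in zip(reference, predicted) if not r and p)
    fn = sum(1 for r, p in zip(reference, predicted) if r and not p)
    tn = sum(1 for r, p in zip(reference, predicted) if not r and not p)
    return tp, fp, fn, tn


def sensitivity_specificity(reference, predicted):
    """Return (sensitivity, specificity); NaN when a class is absent."""
    tp, fp, fn, tn = confusion_counts(reference, predicted)
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec


def bootstrap_metric(reference, predicted, metric_index, runs=10_000, seed=0):
    """Mean and percentile 95% interval of one metric (0 = sensitivity, 1 = specificity)."""
    rng = random.Random(seed)
    n = len(reference)
    values = []
    for _ in range(runs):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement, same size
        value = sensitivity_specificity(
            [reference[i] for i in idx], [predicted[i] for i in idx]
        )[metric_index]
        if value == value:  # skip NaN from resamples lacking the relevant class
            values.append(value)
    values.sort()
    mean = sum(values) / len(values)
    lower = values[int(0.025 * (len(values) - 1))]
    upper = values[int(0.975 * (len(values) - 1))]
    return mean, lower, upper


# Toy labels for illustration only; these are not study data.
reference = [True] * 10 + [False] * 40
predicted = [True] * 9 + [False] * 1 + [True] * 2 + [False] * 38
print(bootstrap_metric(reference, predicted, metric_index=0, runs=500))
```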

Categorical values were compared using a chi-squared test. The five-year cumulative incidence of recurrence (both reference and algorithm-derived) for the study cohort and stratified for UICC stage, with death as a competing event, was reported with 95% CIs. Five-year overall survival was analyzed with a log-rank test and presented as a Kaplan-Meier curve, with patients being censored if they were alive at end of follow-up. Time to recurrence (TTR) was analyzed in patients identified with recurrence by both the Lash-rule algorithm and the reference in the Lash-rule cohort, and by both the CSS-rule algorithm and the reference in the CSS-rule cohort. TTR for the algorithms was estimated by subtracting the surgery date from the recurrence date as estimated by either algorithm, and similarly for the reference. Correlation was estimated by Spearman’s correlation, while a linear model was used to estimate R2 and its 95% CI. Only patients with at least 1.5 years of follow-up were included in the cumulative incidence of recurrence, overall survival, and TTR analyses.
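To make the competing-risk estimate concrete, the sketch below computes a nonparametric cumulative incidence of recurrence with death as a competing event, using the Aalen-Johansen form. It is an illustration under stated assumptions rather than the R code used in this study: the event coding (0 = censored, 1 = recurrence, 2 = death without recurrence) and the function name are hypothetical, and confidence intervals are omitted.

```python
def cumulative_incidence(times, events, horizon):
    """Cumulative incidence of event 1 by `horizon`, treating event 2 as competing.

    events: 0 = censored, 1 = recurrence, 2 = death without recurrence (assumed coding).
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    event_free = 1.0  # probability of being free of both events just before time t
    incidence = 0.0
    i = 0
    while i < len(data) and data[i][0] <= horizon:
        t = data[i][0]
        recurrences = deaths = censored = 0
        while i < len(data) and data[i][0] == t:  # gather everything observed at time t
            event = data[i][1]
            recurrences += event == 1
            deaths += event == 2
            censored += event == 0
            i += 1
        incidence += event_free * recurrences / at_risk
        event_free *= 1 - (recurrences + deaths) / at_risk
        at_risk -= recurrences + deaths + censored
    return incidence


# Toy follow-up times in years; these are not study data.
times = [0.5, 1.2, 2.0, 3.1, 4.0, 5.0, 5.0, 5.0]
events = [2, 1, 0, 1, 2, 0, 0, 0]
print(round(cumulative_incidence(times, events, horizon=5.0), 3))
```

Treating deaths as competing events rather than censoring them keeps the estimate from overstating recurrence risk in cohorts with substantial early mortality, which matters when patients who die within 180 days are retained.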

Statistical analysis was performed using R (version 4.2.0) (R Foundation for Statistical Computing, Vienna, Austria). A p-value below 0.05 was considered statistically significant. The study adhered to the STROBE guidelines.17

Ethics

The Department of Surgery, Zealand University Hospital database was approved with nr. REG-018-2022 and R-22010564, while the COMES database was approved with nr. HIH-2013-031. Under Danish law, informed consent is not required when working with registry data.

Results

Lash-Rule and CSS-Rule Cohorts

A total of 5077 patients were available from the two databases. Applying the revised exclusion criteria for the CSS-rule algorithm left 4388 patients (CSS-rule cohort), while applying the exclusion criteria for the Lash-rule algorithm left 3684 patients (Lash-rule cohort). A flow chart of the two algorithms is presented in Figure 1. As indicated in Figure 1, the exclusion criteria for the Lash-rule algorithm excluded 25.2% of patients from the cohort (3684 of 4928 retained), while 11.0% were excluded using the CSS-rule algorithm (4388 of 4928 retained). The median follow-up was 4.7 years (IQR 3.2–7.2).
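The exclusion percentages follow directly from the reported counts. The short check below reproduces them; the variable and function names are chosen for this example only.

```python
ELIGIBLE = 4928  # denominator reported for both algorithms
RETAINED = {"Lash-rule": 3684, "CSS-rule": 4388}


def excluded_percentage(eligible, retained):
    """Percentage of eligible patients removed by the exclusion criteria."""
    return 100 * (eligible - retained) / eligible


for name, retained in RETAINED.items():
    excluded = ELIGIBLE - retained
    print(f"{name}: {excluded} excluded ({excluded_percentage(ELIGIBLE, retained):.1f}%)")
```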

Figure 1 Flow chart of Lash-rule and CSS-rule algorithm. Yellow boxes indicate exclusion criteria. Red boxes indicate analyses performed.

Baseline Characteristics

Age and gender were comparable across the two cohorts, along with cancer type and UICC stage. The CSS-rule cohort had a higher proportion of patients with ASA 3+ (16.9% vs 14.7%) and CCI 3+ (9.6% vs 6.5%) than the Lash-rule cohort. Mortality was also higher in the CSS-rule cohort than in the Lash-rule cohort (28.4% vs 24.4%). The percentage of recurrences was comparable (11.9% vs 12.7%). Baseline characteristics are presented in Table 1.

Table 1 Baseline Characteristics

Performance Metrics

To compare the sensitivity, specificity, and accuracy of the CSS-rule and Lash-rule algorithms, we performed a bootstrapped analysis of the CSS-rule algorithm in the CSS-rule cohort, the CSS-rule algorithm in the Lash-rule cohort, and the Lash-rule algorithm in the Lash-rule cohort, with 10,000 iterations of each. In Figure 2, the accuracy, sensitivity, and specificity of the analysis are depicted. The CSS-rule algorithm performed comparably in the CSS-rule cohort and the Lash-rule cohort, with an accuracy of 0.96 (95% CI 0.96–0.97) vs 0.96 (95% CI 0.96–0.97), sensitivity of 0.92 (95% CI 0.89–0.94) vs 0.92 (95% CI 0.90–0.95), and specificity of 0.97 (95% CI 0.97–0.98) vs 0.97 (95% CI 0.96–0.98). The Lash-rule algorithm had, in comparison, a lower accuracy (0.95 (95% CI 0.94–0.96)), sensitivity (0.90 (95% CI 0.87–0.92)), and specificity (0.96 (95% CI 0.95–0.96)). Other performance metrics are shown in Table 2.

Table 2 Overview of the Performance Metrics and Cohen’s Kappa. Data are Based on a Bootstrapped Model, Where Most Runs Included Approximately the Same Number of Patients as the Original Cohort

Figure 2 Cumulative incidence of recurrence curves. (A) Cumulative incidence of recurrence in the Lash-rule cohort of the reference, CSS-rule algorithm (CSSrule_Lash) and Lash-rule algorithm (Lashrule). (B) Cumulative incidence of recurrence in the CSS-rule cohort of the reference and CSS-rule algorithm (CSSrule). (C) Cumulative incidence of recurrence stratified for UICC in the Lash-rule cohort of the reference (ref) and Lash-rule algorithm (Lashrule). (D) Cumulative incidence of recurrence stratified for UICC in the CSS-rule cohort of the reference (ref) and CSS-rule algorithm (CSSrule).

To explore potential biases in developing the CSS-rule algorithm, we split the CSS-rule cohort into five splits, stratified for recurrence events, and performed a bootstrap analysis over 10,000 iterations. Supplementary Figure 1 shows minimal differences across splits. Similarly, we investigated if there were any differences in performance metrics across the two institutions’ cohorts separately in a boot-strapped model with 10,000 iterations, finding no substantial variations (Supplementary Figure 2).

Cumulative Incidence of Recurrence

We investigated cumulative recurrence incidence in the Lash-rule cohort, applying the Lash-rule and CSS-rule algorithms alongside the reference, as shown in Figure 2A. At one year, the cumulative incidence of recurrence was 3.7%, with the CSS-rule algorithm estimating 2.8% and the Lash-rule algorithm estimating 3.2%, with all having overlapping 95% CIs. At three years, the cumulative incidence of recurrence was 10.9%, while the CSS-rule estimated 10.8%, and the Lash-rule estimated 11.2%, with all having overlapping 95% CIs. At the five-year mark, the cumulative incidence of recurrence was 12.8%, while the CSS-rule estimated 13.8% and the Lash-rule estimated 14.4%, with all having overlapping 95% CIs.

In the CSS-rule cohort, the cumulative incidence of recurrence was 3.5%, 10.2%, and 12.1% after one, three, and five years, respectively (Figure 2B). The CSS-rule algorithm estimated cumulative incidence to be 2.7%, 10.3%, and 13.0% after one, three, and five years, respectively, with all 95% CIs overlapping.

Stage-Specific Cumulative Incidence of Recurrence

To investigate the two algorithms’ stage-specific performance, we performed a UICC-stratified cumulative incidence of recurrence analysis in the Lash-rule cohort (Figure 2C). For UICC stage I, the five-year cumulative incidence of recurrence differed significantly between the reference, 3.5% (95% CI, 2.3–4.9), and the Lash-rule algorithm, 6.9% (95% CI, 5.2–8.9). The 95% CIs overlapped between the reference and the Lash-rule algorithm for the remaining stages.

For the CSS-rule algorithm in the CSS-rule cohort, all 95% CIs overlapped between the reference and the CSS-rule algorithm (Figure 2D).

Overall Survival

An essential difference between the CSS-rule and Lash-rule algorithms is the exclusion of patients who die within 180 days after surgery, which is part of the Lash-rule algorithm but not of the CSS-rule algorithm. In the Kaplan-Meier survival curves depicted in Supplementary Figure 3, the curves differ in the first six months and then run parallel for the remaining period, as expected.

Time to Recurrence

To investigate the difference in TTR between each algorithm and the reference, the correlation between the TTR of the Lash-rule algorithm and the reference in the Lash-rule cohort is depicted in Figure 3A, and the correlation between the TTR of the CSS-rule algorithm and the reference in the CSS-rule cohort is depicted in Figure 3B. The median difference in TTR was 23 days (IQR 7–51) for the Lash-rule algorithm, with a Spearman’s ρ of 0.88 and R2 of 0.76 (95% CI, 0.71–0.80). The median difference in TTR was 25 days (IQR 9–51) for the CSS-rule algorithm, with a Spearman’s ρ of 0.85 and R2 of 0.73 (95% CI, 0.69–0.77).

Figure 3 Time to recurrence. (A) Comparison of time to recurrence in Lashrule cohort. (B) Comparison of time to recurrence in CSSrule cohort.
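The time-to-recurrence comparison can be illustrated with a small sketch that computes TTR in days, the median absolute difference between algorithm and reference, Spearman's rank correlation, and R2 from a simple linear fit (which equals the squared Pearson correlation). The helper names and toy dates are assumptions for this example, and the use of the absolute difference is also an assumption, since the text does not say whether signed or absolute differences were summarized.

```python
from datetime import date
from statistics import median


def average_ranks(values):
    """Rank values from 1, assigning tied values the average of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def ttr_days(surgery_date, recurrence_date):
    """Time to recurrence: recurrence date minus surgery date, in days."""
    return (recurrence_date - surgery_date).days


# Toy data for illustration only; these are not study data.
surgery = date(2015, 1, 1)
reference_ttr = [ttr_days(surgery, d) for d in (date(2015, 9, 1), date(2016, 3, 1), date(2017, 1, 15))]
algorithm_ttr = [ttr_days(surgery, d) for d in (date(2015, 9, 20), date(2016, 4, 2), date(2017, 1, 30))]

differences = [abs(a - r) for a, r in zip(algorithm_ttr, reference_ttr)]
print(median(differences), spearman(algorithm_ttr, reference_ttr), pearson(algorithm_ttr, reference_ttr) ** 2)
```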

Refining the Algorithm

It was evident that the Lash-rule algorithm significantly overestimated the cumulative incidence of recurrence in all patients and in those with UICC I cancers. This prompted an investigation into which codes contributed the most to the false positive rate. Before changes in the utilized codes, the number of false negatives was 48 and the number of false positives was 177. The morphology codes M8XXX3 and M9XXX3 were shown to contribute the most to the false positive rate. Removal of these morphology codes increased the number of false negatives to 59 and decreased the false positives to 97. Disregarding the SNOMED codes P30755, ÆF4998, and ÆF4999, and adding local recurrence-associated SNOMED codes, did not change the number of false negatives but reduced the false positives to 96. Adding the targeted treatment and radiation codes reduced the number of false negatives to 54 and increased the false positives to 100. The addition of palliation codes reduced the number of false negatives to 44 and increased the false positives to 112. Thus, the changes to the algorithm yielded four fewer false negatives and 65 fewer false positives.
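The stepwise effect of each code change can be tallied from the counts reported in this section. The sketch below only restates those counts and computes the change at each step and overall; the step labels are paraphrased and the variable names are chosen for this example.

```python
# (step description, false negatives, false positives) as reported in the text
STEPS = [
    ("Lash-rule codes (before changes)", 48, 177),
    ("Remove M8XXX3 and M9XXX3", 59, 97),
    ("Disregard P30755/ÆF4998/ÆF4999, add recurrence SNOMED codes", 59, 96),
    ("Add targeted treatment and radiation codes", 54, 100),
    ("Add palliation codes", 44, 112),
]


def stepwise_changes(steps):
    """Return (description, change in FN, change in FP) relative to the previous step."""
    changes = []
    for (_, prev_fn, prev_fp), (label, fn, fp) in zip(steps, steps[1:]):
        changes.append((label, fn - prev_fn, fp - prev_fp))
    return changes


for label, d_fn, d_fp in stepwise_changes(STEPS):
    print(f"{label}: FN {d_fn:+d}, FP {d_fp:+d}")

net_fn = STEPS[-1][1] - STEPS[0][1]
net_fp = STEPS[-1][2] - STEPS[0][2]
print(f"Net change: FN {net_fn:+d}, FP {net_fp:+d}")
```

The tally makes the trade-off visible: removing the morphology codes accounts for most of the drop in false positives, while the treatment and palliation codes recover false negatives at the cost of some false positives.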

Discussion

In this study, we updated and refined an algorithm to detect recurrences following curative-intended surgery for non-metastatic CRC in national health registries. The updated algorithm applied no preoperative exclusion criteria and retained an additional 14% of the eligible patients (89.0% vs 74.8%), while maintaining excellent performance metrics and remaining robust in bootstrapped scenarios.

The refinement of the recurrence detection algorithm addresses key limitations of previous approaches by increasing the patient cohort size while maintaining high specificity and sensitivity. The reduction of false positives, particularly in UICC stage I disease, ensures that recurrence risk estimates align more accurately with reference data. This has direct clinical implications, as accurate recurrence classification is essential for tailoring surveillance strategies and evaluating the role of novel adjuvant therapies.18

Although the exclusion of preoperative cancers can result in excellent performance metrics in algorithms to detect recurrence in a cancer population, the true background population is often not accounted for, which may result in health-related inequality due to lack of generalizability.19 By removing preoperative exclusion criteria, we therefore allow for a more comprehensive analysis of treatment outcomes, particularly in patients who may experience early postoperative mortality, a group often underrepresented in outcome studies.20 These refinements support future efforts in AI-driven oncological risk stratification and underscore the need for ongoing validation in broader patient cohorts.21,22

The enhanced algorithm provides a robust foundation to enrich existing registry data sources, so that AI-driven models can be developed aimed at predicting recurrence risk and informing clinical decision-making. By accurately identifying patients at varying risk levels, these models can guide the allocation of healthcare resources and the design of personalized treatment plans. This advancement holds promise for improving patient outcomes through more targeted and effective interventions.

Accurate recurrence detection is crucial for optimizing postoperative surveillance and tailoring adjuvant therapies. The refined algorithm’s improved specificity, particularly in UICC stage I patients, minimizes overestimation of recurrence risk, thereby preventing unnecessary interventions and focusing resources on high-risk individuals. This precision facilitates personalized treatment plans, enhancing the efficacy of surveillance protocols and therapeutic approaches.

There is an increasing focus on utilizing the period before surgery to treat patients with different neoadjuvant regimens to decrease tumor burden and reduce the risk of recurrence. These treatments may influence surgical outcomes and lead to deaths in the postoperative period. Including these patients therefore provides valuable information for safety assessments in phase four studies and for evaluating trends over time. Similarly, there is an increasing focus on health economic assessment of both established and new treatments within the oncological field. Recently, it has been shown that although new treatments within specific medical fields have contributed to an increase in quality-adjusted life years (QALYs), an even larger QALY gain was estimated if these funds had been directed to established treatments for the general population.23 Such analyses are critical and valuable in the current era of medicine, where budgets may not allow all new and expensive treatments to be available for patients. Since mortality is a critical part of the QALY estimate, including the patients who die within 180 days after surgery facilitates a generalizable cohort for such analyses. Comparisons with other countries also necessitate an algorithm that includes these patients.

In recent validations of the Lash-rule algorithm, the sensitivity was 88–94%, and the specificity was 96–99%.7,24 However, they have been made on relatively small cohorts. Our cohort includes two separate databases from different institutions, with almost a ten-fold increase in cohort size compared with previous validations. This may explain the difference in performance metrics. Still, as showcased in the boot-strapped run of the CSS-rule algorithm in the databases individually, similar performance metrics were obtained, underlining the algorithm’s robustness. Compared with a recent application of the Lash-rule algorithm on the whole of DCCG, comparable five-year cumulative incidence of recurrence rates for all patients and stage-specific were evident.2 The algorithms were similarly comparable in terms of TTR compared with the reference in either cohort.

In another study, the authors developed an algorithm based upon claims data from a single hospital.4 Here, in 147 patients, they utilized diagnosis codes indicating secondary malignant neoplasms and chemotherapy codes with a sensitivity of 81% and specificity of 99%. In the Lash-rule and CSS-rule algorithm, additional pathology codes are utilized, which may explain the difference in sensitivity, alongside the difference in number of patients assessed.

For both algorithms, early recurrences within 1.5 years after surgery were not detected at the same rate as recurrences at other time points. Although the codes encompass the relevant diagnosis, treatment, pathology, and palliation codes, there is still a discrepancy compared to the reference. Some of this may be due to death close to the diagnosis of recurrence, and some may be due to incorrect coding. A manual review of registry codes for the false negatives showed no single code that could improve performance. A machine learning approach that combines codes and includes their frequency and temporality may be able to identify these patients, as this remains an essential shortcoming of both algorithms.

This study was limited by the low number of patients with rectal cancer. The COMES database included only colon cancers. The latest entry for patients was May 2020, and because immune checkpoint inhibitor treatment for metastatic mismatch repair-deficient tumors was first introduced as a standard treatment after this date, it was not possible to test whether the addition of this code aided in labeling patients as having a recurrence. Moreover, the refined algorithm was not externally validated in a separate cohort.

With increasing treatment opportunities in the neoadjuvant setting, prediction of recurrence before surgery may stratify patients for additional treatment before surgery, to increase the possibility of a recurrence-free postoperative trajectory. This requires larger cohorts that resemble the general population. By including patients with previous cancers and patients who die within six months after surgery, the updated algorithm contributes to this goal. Many of the available prediction models for recurrence do not apply a non-death within 180 days criterion, which is why the inclusion of these patients is vital for adequate performance of these models and subsequent model development in a Danish setting.25,26

Conclusion

We present an updated algorithm for identifying recurrence in patients with non-metastatic colorectal cancer using national health registries. By expanding patient inclusion and revising recurrence criteria, the refined approach enables the development of more representative cohorts.

Data Sharing Statement

Data from the Zealand University Hospital and COMES cohort are available upon reasonable request from the authors. Data from the national health registries cannot be shared according to Danish legislation.

Acknowledgments

The following colleagues have contributed to the COMES database, thus indirectly to this study: Jens Erik Jansen, M.D.; Lars Vedel Jepsen, M.D.; and Leif Ahrenst Rasmussen, M.D., Department of Surgery, Copenhagen University Hospital–North Zealand, Hillerød, Denmark; Anders Kirkegaard-Klitbo, M.D., and Jutaka Reilin Tenma, M.D., Department of Surgery, Copenhagen University Hospital–Bispebjerg, Copenhagen, Denmark; Pernille Wolder Born, M.D.; and Michael Wilhelmsen, M.D., Gastro Unit, Surgical Division, Copenhagen University Hospital–Hvidovre, Hvidovre, Denmark; Else Refsgaard Iversen, M.D., Department of Surgery, Birgitte Bols, M.D., and Peter Ingeholm, M.D., Department of Pathology, and Bent Kristensen, M.D., Department of Clinical Physiology, Copenhagen University Hospital–Herlev, Herlev, Denmark. The authors thank Maliha Mashkoor for initial development of the algorithm. CB and IG are shared last authors.

Author Contributions

All authors made a significant contribution to the work reported, whether that is in the conception, study design, execution, acquisition of data, analysis and interpretation, or in all these areas; took part in drafting, revising or critically reviewing the article; gave final approval of the version to be published; have agreed on the journal to which the article has been submitted; and agree to be accountable for all aspects of the work.

Disclosure

Dr Mikail Gögenur reports grants from Novo Nordisk Foundation, grants from Innovative Medicines Initiative 2 Joint Undertaking, grants from Danish Ministry of Higher Education and Science, during the conduct of the study. The authors declare no other competing interests in this work.

References

1. Bray F, Laversanne M, Sung H, et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2024;74:229–263. doi:10.3322/caac.21834

2. Nors J, Iversen LH, Erichsen R, Gotschalck KA, Andersen CL. Incidence of recurrence and time to recurrence in stage I to III colorectal cancer: a nationwide Danish cohort study. JAMA Oncol. 2024;10:54–62. doi:10.1001/jamaoncol.2023.5098

3. Mariotto AB, Thompson TD, Johnson C, Wu X-C, Pollack LA. Breast and colorectal cancer recurrence-free survival estimates in the US: modeling versus active data collection. Cancer Epidemiol. 2023;85:102370. doi:10.1016/j.canep.2023.102370

4. Deshpande AD, Schootman M, Mayer A. Development of a claims-based algorithm to identify colorectal cancer recurrence. Ann Epidemiol. 2015;25:297–300. doi:10.1016/j.annepidem.2015.01.005

5. Hassett MJ, Uno H, Cronin AM, et al. Detecting lung and colorectal cancer recurrence using structured clinical/administrative data to enable outcomes research and population health management. Med Care. 2017;55(12):e88–e98. doi:10.1097/MLR.0000000000000404

6. Lash TL, Riis AH, Ostenfeld EB, et al. A validated algorithm to ascertain colorectal cancer recurrence using registry resources in Denmark. Int J Cancer. 2015;136:2210–2215. doi:10.1002/ijc.29267

7. Nors J, Mattesen TB, Cronin-Fenton D, et al. Identifying recurrences among non-metastatic colorectal cancer patients using national health data registries: validation and optimization of a registry-based algorithm in a modern Danish cohort. Clin Epidemiol. 2023;15:241–250. doi:10.2147/CLEP.S396140

8. Bertelsen CA, Neuenschwander AU, Jansen JE, et al. Disease-free survival after complete mesocolic excision compared with conventional colon cancer surgery: a retrospective, population-based study. Lancet Oncol. 2015;16(2):161–168. doi:10.1016/S1470-2045(14)71168-4

9. Bertelsen CA, Neuenschwander AU, Kleif J; COMES Study Group. Risk of local recurrence after complete mesocolic excision for right-sided colon cancer: post-hoc sensitivity analysis of a population-based study. Dis Colon Rectum. 2022;65:1103–1111. doi:10.1097/DCR.0000000000002174

10. Klein MF, Gögenur I, Ingeholm P, et al. Validation of the Danish Colorectal Cancer Group (DCCG.dk) database – on behalf of the Danish Colorectal Cancer Group. Colorectal Dis. 2020;22(12):2057–2067. doi:10.1111/codi.15352

11. Schmidt M, Schmidt SAJ, Sandegaard JL, et al. The Danish National Patient Registry: a review of content, data quality, and research potential. Clin Epidemiol. 2015;7:449–490. doi:10.2147/CLEP.S91125

12. Gjerstorff ML. The Danish Cancer Registry. Scand J Public Health. 2011;39:42–45. doi:10.1177/1403494810393562

13. Erichsen R, Lash TL, Hamilton-Dutoit SJ, et al. Existing data sources for clinical epidemiology: the Danish National Pathology Registry and Data Bank. Clin Epidemiol. 2010;2:51–56. doi:10.2147/CLEP.S9908

14. SKS-browseren. Available from: https://www.medinfo.dk/sks/brows.php. Accessed December 4, 2025.

15. Patobank. En landsdækkende databank fra pato-anatomiske undersøgelser. Available from: https://www.patobank.dk/. Accessed December 4, 2025.

16. Charlson ME, Pompei P, Ales KL, MacKenzie CR. A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. J Chronic Dis. 1987;40:373–383. doi:10.1016/0021-9681(87)90171-8

17. von Elm E, Altman DG, Egger M, et al. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. Lancet. 2007;370(9596):1453–1457. doi:10.1016/S0140-6736(07)61602-X

18. Lambert P, Pitz M, Singh H, Decker K. Evaluation of algorithms using administrative health and structured electronic medical record data to determine breast and colorectal cancer recurrence in a Canadian province: using algorithms to determine breast and colorectal cancer recurrence. BMC Cancer. 2021;21:763. doi:10.1186/s12885-021-08526-9

19. Rasmussen LA, Jensen H, Virgilsen LF, Hölmich LR, Vedsted P. A validated register-based algorithm to identify patients diagnosed with recurrence of malignant melanoma in Denmark. Clin Epidemiol. 2021;13:207–214. doi:10.2147/CLEP.S295844

20. Dorcaratto D, Mazzinari G, Fernandez M, et al. Impact of postoperative complications on survival and recurrence after resection of colorectal liver metastases: systematic review and meta-analysis. Ann Surg. 2019;270(6):1018–1027. doi:10.1097/SLA.0000000000003254

21. Sirinukunwattana K, Domingo E, Richman SD, et al. Image-based consensus molecular subtype (imCMS) classification of colorectal cancer using deep learning. Gut. 2021;70(3):544–554. doi:10.1136/gutjnl-2019-319866

22. Henriksen TV, Tarazona N, Frydendahl A, et al. Circulating tumor DNA in stage III colorectal cancer, beyond minimal residual disease detection, toward assessment of adjuvant therapy efficacy and clinical behavior of recurrences. Clin Cancer Res. 2022;28(3):507–517. doi:10.1158/1078-0432.CCR-21-2404

23. Naci H, Murphy P, Woods B, et al. Population-health impact of new drugs recommended by the National Institute for Health and Care Excellence in England during 2000–20: a retrospective analysis. Lancet. 2025;405:50–60. doi:10.1016/S0140-6736(24)02352-3

24. Colov EP, Fransgaard T, Klein M, Gögenur I. Validation of a register-based algorithm for recurrence in rectal cancer. Dan Med J. 2018;65:A5507.

25. Weiser MR, Hsu M, Bauer PS, et al. Clinical calculator based on molecular and clinicopathologic characteristics predicts recurrence following resection of stage I-III colon cancer. J Clin Oncol. 2021;39(8):911–919. doi:10.1200/JCO.20.02553

26. Osterman E, Ekström J, Sjöblom T, et al. Accurate population-based model for individual prediction of colon cancer recurrence. Acta Oncol. 2021;60(10):1241–1249. doi:10.1080/0284186X.2021.1953138

Creative Commons License © 2025 The Author(s). This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms and incorporate the Creative Commons Attribution - Non Commercial (unported, 4.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.