Performance Trends and Grading Variability in Clinical Education: A Record-Based Cross-Sectional Study of Medical Graduate Students at Mutah University, Jordan

Khitam Al-Refu; Israa Al-Rawashdeh; Nedal Alnawaiseh; Fadi Sawaqed; Yousef Al-Saraireh

doi:10.2147/AMEP.S553555

Advances in Medical Education and Practice, Volume 16

Original Research


Authors Al-Refu K, Al-Rawashdeh I , Alnawaiseh N , Sawaqed F , Al-Saraireh Y

Received 4 August 2025

Accepted for publication 29 November 2025

Published 24 December 2025 Volume 2025:16 Pages 2409–2417

DOI https://doi.org/10.2147/AMEP.S553555

Checked for plagiarism Yes

Review by Single anonymous peer review

Peer reviewer comments 2

Editor who approved publication: Dr Md Anwarul Azim Majumder


Khitam Al-Refu,¹ Israa Al-Rawashdeh,² Nedal Alnawaiseh,² Fadi Sawaqed,³ Yousef Al-Saraireh⁴

¹Department of Internal Medicine, Faculty of Medicine, Mutah University, Karak, Jordan; ²Department of Public Health and Community Medicine, Faculty of Medicine, Mutah University, Karak, Jordan; ³Department of Special Surgery, Faculty of Medicine, Mutah University, Karak, Jordan; ⁴Department of Pharmacology, Faculty of Medicine, Mutah University, Karak, Jordan

Correspondence: Israa Al-Rawashdeh, Faculty of Medicine, Mutah University, Karak, Jordan, Tel +962-7-9502-4441

Objective: To examine five-year trends in academic performance across four final-year clinical subjects, to evaluate the relationship between pre-admission scores and university academic outcomes, and to assess the variability in grading practices across four clinical departments.
Methods: A record-based cross-sectional study was conducted among 1584 medical graduate students (cohorts 2020–2024) at Mutah University, Jordan. Data included General Secondary School Examination (Tawjihi) scores, grades in four core clinical subjects (Internal Medicine, Surgery, Pediatrics, and Obstetrics-Gynecology), and cumulative Grade Point Average (GPA). Analysis used descriptive statistics, Pearson correlation coefficients, analysis of variance, and regression models.
Results: Tawjihi scores showed a weak but statistically significant correlation with GPA (r = 0.257; p < 0.001). Subject grades correlated significantly with GPA (Pediatrics r = 0.810, Obstetrics-Gynecology r = 0.786, Internal Medicine r = 0.779, Surgery r = 0.628; all p < 0.001). Year-to-year differences were significant across all subjects (F = 6.84–42.44, all p < 0.001). Regression analysis revealed that the practical component (β = 0.602; p < 0.001) slightly outweighed written exams (β = 0.547; p < 0.001) in predicting final course grades.
Conclusion: Grading variability across departments highlights the importance of standardization in clinical evaluation. These findings emphasize the need for unified assessment frameworks to enhance equity and reliability in medical education.

Keywords: medical education, clinical academic performance, medical schools, grading variability, Jordan

Introduction

Medical education in Jordan generally follows a systematic progression that begins with three pre-clinical years focused on foundational sciences, followed by three years of hospital-based clinical training focused on hands-on practice and patient care. The sixth and final year is particularly noteworthy because it represents a crucial phase of both academic and professional transition. During this year, students complete clinical rotations across four core subjects: Internal Medicine, Surgery, Pediatrics, and Obstetrics-Gynecology. This phase is not only the last step before graduation but also contributes 25% of the student’s cumulative Grade Point Average (GPA) at graduation, in accordance with the official regulations for Awarding the Bachelor’s Degree in Doctor of Medicine (MD) at Mutah University.¹ Assessments are competency-based and are scored numerically out of 100. They comprise a final written exam (40%), which assesses the student’s theoretical knowledge and clinical reasoning; Objective Structured Clinical Examinations (OSCEs, 40%); and evaluations of clinical performance throughout hospital rotations (20%), which cover factors such as professionalism, attendance, and the punctual completion of assigned duties.

Internationally, clinical assessment systems show considerable variability and inconsistency in grading processes across disciplines and institutions. In the United States, for example, failure rates remain exceptionally low, reported at under 1%; however, the proportion of students receiving top grades varies dramatically, from 2% to over 90%, which indicates inconsistent evaluation standards and raises possible concerns about grade inflation or subjectivity.^2,3 Furthermore, additional research suggests that the order and timing of clinical rotations may greatly influence students’ performance scores, particularly in Surgery and Obstetrics-Gynecology, which adds to the complexity of interpreting final-year grades.^4–6 Beyond subject-specific variability in academic grading, concerns have also been raised regarding the reliability of clinical performance evaluation.⁷ Discrepancies in evaluator expectations and the absence of standardized evaluator training can both contribute to inconsistent assessments.⁸ As a result, there is a significant gap in understanding how these factors interact within different educational settings.

Limited empirical research has explored trends in assessment practices, and the factors related to them, among medical students within the Jordanian context. However, some studies from similar contexts, such as the Gulf region, have explored the predictive validity of secondary school grades and admission test scores.^5,9,10 To the best of our knowledge, no similar study has yet investigated subject-specific trends or grading variability among medical graduate students in Jordan. The scarcity of such research limits the ability of local institutions to evaluate and standardize clinical education outcomes effectively.

Grading variabilities may extend beyond academic implications, as they might influence students’ career progression such as residency placement opportunities, perception of fairness or overall academic equity. In addition, variations in assessment practices across institutions may also affect the comparability of graduates’ performance nationally. Medical schools in Jordan operate under university-specific assessment systems with limited standardization or unified evaluation frameworks.

Recent literature has highlighted the need for fairness, transparency, and calibration among assessors to ensure equity and reliability in clinical evaluation processes across medical programs. Studies have found longitudinal variation in the correlations between the knowledge, skills, and professionalism components of assessment¹¹ and significant variability in clerkship grade distributions across institutions.¹² Concerns have also been reported that imprecise clinical assessments lead to inaccurate grades, which in turn impair residency program directors’ ability to classify and differentiate student applicants.¹³

The current five-year study (2020–2024) aims to address this knowledge gap by evaluating academic performance among medical graduate students at the Faculty of Medicine at Mutah University. It investigates the relationship between clinical subject grades and cumulative GPA, assesses grading variability across departments, and explores the structure and consistency of clinical assessments to inform future curricular and evaluation reforms.

Ultimately, the objective is to generate evidence-based insights that can inform enhancements in curriculum design, clinical evaluation methodologies, and overarching academic policies. Although this study was conducted at the Faculty of Medicine at Mutah University, which was established in 2001 to expand access to medical education in southern Jordan, its findings have broader relevance. Jordan has six medical schools in six governmental universities, and the results of this research could inform national discussions on assessment practices, promote equity, and enhance the quality of clinical education across all institutions. The research aims to help establish a more equitable, consistent, and competency-driven framework for medical education that aligns with national and international standards for quality assurance and accreditation.

Objectives

Based on the rationale outlined above, the specific objectives are to:

Examine performance trends and grading variability across four core clinical subjects: Internal Medicine, Surgery, Pediatrics, and Obstetrics-Gynecology.
Explore the correlation between National Secondary School Examination (Tawjihi) scores and academic performance represented in cumulative GPA.
Assess the correlation between sixth-year clinical subjects’ grades and cumulative GPA.
Investigate interdepartmental variabilities in grading practices and performance evaluation.
Analyze the relationship between subject assessment components (written exam and practical evaluation) and their predictive value for final subject grades.

Methods

Study Design and Setting

This research used a record-based, cross-sectional design covering five cohorts of medical graduate students. The study was conducted at the Faculty of Medicine, Mutah University, a governmental medical institution in southern Jordan. The dataset included academic records for the cohorts of medical students who graduated between 2020 and 2024. This approach allowed an evaluation of clinical academic performance and of the variables influencing students’ performance during this crucial phase of their medical training.

Study Population

The study included the records of 1584 medical graduate students who had completed their clinical training and obtained their degrees between 2020 and 2024. Records were excluded if they were incomplete, if the student had missed one or more clinical rotations, or if scores were missing for core assessment components (OSCE, written exam, or clinical evaluation). These exclusions were intended to ensure that the data were accurate and relevant to the study’s objectives.

Data Sources, Collection and Variables

Data were obtained from the official academic records archived by the Unit of Administration and Registration at Mutah University. The dataset included the following variables for each student:

Final grades (out of 100) in the four core sixth-year clinical subjects: Internal Medicine, Surgery, Pediatrics, and Obstetrics-Gynecology.
Final cumulative GPA at graduation (on a 100-point scale).
General Secondary School Examination (Tawjihi) scores, which serve as the national entrance qualification for university-level medical education in Jordan (on a 100-point scale), or the officially recognized equivalent for non-Jordanian certificates.
Grades for the subcomponents of each of the four core clinical subjects: a combined practical score (60%) and a final written exam score (40%). The combined practical score comprised the OSCE (40%) and the clinical performance evaluation (20%). The OSCE assessed clinical reasoning, procedural skills, and communication abilities under standardized settings, whereas the clinical performance evaluation reflected ongoing performance during rotations, including professionalism, attendance, and task completion. Item-level data for each score could not be retrieved from the records, so scores were aggregated for analysis; however, this unified grading scheme was applied consistently across all departments and cohorts, which ensured comparability of the practical assessment structure.
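The weighting scheme described above can be illustrated with a short sketch. It assumes that each component is recorded as a percentage (0 to 100) and then weighted; the source does not say whether components are stored as percentages or as raw points, and the function names and the example student are hypothetical.

```python
# Component weights as stated in the text: written 40%, OSCE 40%, clinical 20%.
WEIGHTS = {"written": 0.40, "osce": 0.40, "clinical": 0.20}


def final_grade(written: float, osce: float, clinical: float) -> float:
    """Return the weighted subject grade out of 100.

    Assumption: each argument is a percentage (0-100) for that component.
    """
    scores = {"written": written, "osce": osce, "clinical": clinical}
    for name, value in scores.items():
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be between 0 and 100, got {value}")
    return sum(WEIGHTS[name] * value for name, value in scores.items())


def practical_points(osce: float, clinical: float) -> float:
    """Return the combined practical contribution (at most 60 of 100 points)."""
    return WEIGHTS["osce"] * osce + WEIGHTS["clinical"] * clinical


# Illustrative (made-up) student: 70% written, 80% OSCE, 90% clinical evaluation.
grade = final_grade(written=70, osce=80, clinical=90)
print(f"final grade: {grade:.1f}")
print(f"practical contribution: {practical_points(80, 90):.1f}")
```

Under these assumptions the practical contribution is capped at 60 points and the written exam at 40, which matches the 60/40 split reported for every department.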

Data Management and Statistical Analysis

Data were extracted from Microsoft Excel files, then cleaned and analyzed using SPSS version 23. Descriptive statistics (means, standard deviations, and minimum and maximum values) were used to summarize students’ scores and to assess the degree of variation in performance levels. Pearson correlation coefficients were used to examine the relationships between Tawjihi scores, individual subject grades, and cumulative GPA. One-way ANOVA was employed to explore whether subject grades differed significantly across the five graduating cohorts. When ANOVA results were significant, post hoc analysis (Least Significant Difference, LSD, test) was applied to identify specific year-to-year differences in clinical academic performance. A p-value of <0.05 was considered statistically significant.
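For readers who wish to reproduce the correlation step outside SPSS, the Pearson coefficient can be computed directly from its definition. The following is a minimal sketch in plain Python, not the authors' procedure; the function name and the sample scores are illustrative and are not taken from the study data.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Made-up scores for five students (illustration only).
tawjihi = [95.1, 96.4, 97.0, 98.2, 99.0]
gpa = [70.5, 76.0, 72.3, 74.8, 79.1]
print(f"r = {pearson_r(tawjihi, gpa):.3f}")
```

The significance test for r requires the t distribution and is left to a statistics package.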

Ethical Considerations

The research was approved by the Research and Ethics Committee of the Faculty of Medicine at Mutah University (Approval No. 480/25). The committee waived the requirement for informed consent because the study involved retrospective analysis of anonymized academic records, with no direct contact with students and no use of identifiable data. All student data were managed with the utmost confidentiality and were used exclusively for research purposes. The analysis did not incorporate any identifiable personal information. This study was conducted in accordance with the ethical principles of the Declaration of Helsinki.

Results

Descriptive Results

The records of 1584 medical graduate students who graduated between 2020 and 2024 were included in the statistical analysis. Table 1 describes the distribution of cumulative GPA (out of 100) across graduating cohorts, including the mean, the standard deviation, and the observed minimum and maximum scores. Figure 1 illustrates the trends in mean GPA across the five academic years (2020–2024), showing a generally stable pattern with slight fluctuations between cohorts. The overall cumulative GPA ranged between 62.0 and 92.3, with a mean of 74.3 ± 5.65. In 2020, the mean GPA was 72.75 ± 5.03, while the 2021 cohort showed a mean of 73.06 ± 5.25. In 2022, students achieved the highest average GPA of 75.44 ± 5.68. The 2023 cohort followed with a slightly lower mean of 74.97 ± 4.96, and the 2024 cohort achieved an average of 74.18 ± 6.41. These findings suggest modest variability across cohorts, with relatively stable academic performance over time.

Table 1 Cumulative GPA Summary by Cohort (2020–2024). (N=1584)

Figure 1 Trends in mean GPA across academic years (2020–2024).

Comparative Performance Across Clinical Subjects

Grades in the four core clinical subjects were examined. As shown in Table 2, Internal Medicine achieved the highest mean score (74.48 ± 7.52), followed closely by Surgery (74.38 ± 7.54) and then Pediatrics (71.43 ± 8.78). Obstetrics-Gynecology showed the lowest mean score (68.11 ± 7.86) across the five cohorts.

Table 2 Descriptive Statistics for Core Clinical Subject Grades Across Academic Years (2020–2024). (N = 1584)

Correlation Analysis Between Tawjihi Scores, Clinical Subject Grades and Cumulative GPA

The Pearson correlation matrix in Table 3 shows that Tawjihi scores and cumulative GPA have a weak but statistically significant positive correlation (r = 0.257, p < 0.001), indicating that pre-university academic performance contributes modestly to graduation grades in medical school. In contrast, strong correlations were found between individual clinical subject grades and cumulative GPA. The highest correlation was observed with Pediatrics (r = 0.810), followed by Obstetrics-Gynecology (r = 0.786), Internal Medicine (r = 0.779), and Surgery (r = 0.628); all associations were statistically significant (p < 0.001).
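One way to read these coefficients is through the coefficient of determination, r², which gives the proportion of variance in GPA shared with each indicator under a simple linear relationship. The sketch below applies that standard identity to the coefficients reported above; the r² values are derived here for illustration and are not reported in the source.

```python
# Correlations with cumulative GPA, as reported in the text (Table 3).
reported_r = {
    "Tawjihi": 0.257,
    "Pediatrics": 0.810,
    "Obstetrics-Gynecology": 0.786,
    "Internal Medicine": 0.779,
    "Surgery": 0.628,
}


def shared_variance(r: float) -> float:
    """Coefficient of determination (r squared) for a simple correlation."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    return r * r


for indicator, r in reported_r.items():
    print(f"{indicator}: r = {r:.3f}, r^2 = {shared_variance(r):.3f}")
```

On this reading, Tawjihi scores share roughly 6.6% of the variance in cumulative GPA, whereas Pediatrics grades share roughly 66%. Because final-year subject grades are themselves part of the GPA, some of the shared variance for the clinical subjects is built in by construction.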

Table 3 Pearson Correlation Between Academic Indicators and Cumulative GPA (N = 1584)

Interdepartmental Variabilities in Grading Practices and Performance Evaluation

One-way ANOVA tests were conducted for each clinical subject across the five academic years (2020–2024) to determine whether grading practices and performance evaluation differed across cohorts in the four core clinical subjects (Obstetrics-Gynecology, Pediatrics, Surgery, and Internal Medicine). Statistically significant differences were observed in all four subjects across the five years, as shown in Table 4. A post hoc Least Significant Difference (LSD) test was used to identify specific year-to-year differences. It showed that Obstetrics-Gynecology grades fluctuated significantly across all cohorts (2020–2024), whereas Pediatrics grades differed notably between the 2020 and 2021 cohorts. In Surgery, the 2023 grades were significantly different from those of the remaining cohorts. Internal Medicine grades in 2024 also showed a significant shift compared with earlier cohorts.
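The F statistic behind a one-way ANOVA is the ratio of between-cohort to within-cohort mean squares. The following plain-Python sketch shows that calculation on made-up grades; it is not the SPSS procedure used in the study, the function name is hypothetical, and it stops short of the p-value and the LSD comparisons, which need the F and t distributions (available, for example, through `scipy.stats.f_oneway`).

```python
def one_way_anova_f(groups: list[list[float]]) -> tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n_total <= k:
        raise ValueError("need at least two non-empty groups and n > k")

    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    if ss_within == 0:
        raise ValueError("F is undefined when there is no within-group variance")

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up subject grades for three cohorts (illustration only).
cohorts = [[68.0, 72.0, 70.0], [74.0, 71.0, 77.0], [66.0, 69.0, 65.0]]
f_stat, df_b, df_w = one_way_anova_f(cohorts)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F indicates that cohort means differ by more than the spread of grades within cohorts would explain.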

Table 4 Year-to-Year Variation in Clinical Subject Grades (ANOVA Results). (N = 1584)

Relationship Between Clinical Performance Scores and Subject Grades

To better understand grade composition, the internal components of clinical subject assessment were analyzed using correlational and regression approaches. Each subject grade comprised a Final Written Exam (40%) and a Practical Evaluation (60%), the latter subdivided into Clinical Performance and OSCE components.

As shown in Table 5, the analysis revealed a significant, strong correlation between each component and the final subject grade (r = 0.857 for the written exam, r = 0.883 for the practical evaluation; p < 0.001), with the stronger correlation observed for the practical evaluation. For individual subjects, correlation coefficients ranged from r = 0.866 to 0.932, all statistically significant (p < 0.001). In Pediatrics, the correlation between OSCE/clinical performance scores and total grades was highest (r = 0.920), followed by Obstetrics-Gynecology (r = 0.916), Internal Medicine (r = 0.900), and Surgery (r = 0.866). This suggests that the clinical performance and OSCE scores strongly influence individual total course grades, particularly in Pediatrics and Obstetrics-Gynecology. Multiple linear regression analysis yielded standardized beta coefficients of β = 0.602 for the practical evaluation and β = 0.547 for the written exam. This suggests that practical assessments, which also carry the larger formal weight (60% versus 40%), may have a slightly greater influence on the distribution of final grades among students.
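For a regression with two predictors, the standardized coefficients can be written in terms of three correlations: each predictor's correlation with the outcome and the correlation between the two predictors. The sketch below implements that textbook identity. The correlation between the written and practical components is not reported in the source, so the value 0.5 used in the example is purely a placeholder assumption, and the output should not be read as a reconstruction of the authors' model.

```python
def standardized_betas(r_y1: float, r_y2: float, r_12: float) -> tuple[float, float]:
    """Standardized coefficients for a two-predictor linear regression.

    r_y1, r_y2: correlations of predictors 1 and 2 with the outcome.
    r_12: correlation between the two predictors (must not be +/-1).
    """
    if not -1.0 < r_12 < 1.0:
        raise ValueError("predictors must not be perfectly collinear")
    denom = 1.0 - r_12 ** 2
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    return beta_1, beta_2


# Reported correlations with the final grade (Table 5 in the text).
r_written = 0.857
r_practical = 0.883
# ASSUMPTION: placeholder only; the written-practical correlation is not reported.
r_between_components = 0.5

beta_written, beta_practical = standardized_betas(
    r_written, r_practical, r_between_components
)
print(f"beta (written):   {beta_written:.3f}")
print(f"beta (practical): {beta_practical:.3f}")
```

The identity makes one point explicit: when the two components are correlated with each other, each standardized beta is smaller than the corresponding simple correlation, which is the pattern seen in the reported values.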

Table 5 Pearson Correlation Analysis of Four Core Clinical Subject Grades with Final Course Grades (N = 1584)

Discussion

This study found notable trends and variability in the assessment of academic performance in four core clinical subjects. Across the five cohorts studied, Internal Medicine and Surgery consistently achieved higher average scores with relatively narrower standard deviations, suggesting greater consistency in curriculum structure, assessment practices, and clinical training environments. In contrast, Obstetrics-Gynecology and Pediatrics exhibited broader variability in students’ scores in terms of both range and standard deviation. This may reflect discrepancies in evaluation standards.

These findings align with concerns in the medical education literature about variability in workplace-based assessments. For example, notable assessor-related variability has been found when comparing students’ and supervisors’ scores on the mini clinical evaluation exercise (mini-CEX) in undergraduate medical clerkships, which challenges the reliability of such tools for summative evaluation.¹⁴ Similarly, inter-rater variability is reported to exceed student-related differences in clinical performance, particularly in subjects such as Obstetrics and Pediatrics.^15–17 Moreover, the literature highlights how inflated or compressed grade distributions in certain clinical subjects can influence the fairness of student comparisons and of subsequent decisions based on academic evaluation.^2,18

The present study found a statistically significant but modest positive correlation between General Secondary School Examination (Tawjihi) scores and cumulative GPA at graduation (r = 0.257, p < 0.001). This indicates that although higher Tawjihi scores may reflect a baseline level of academic competency, their ability to predict students’ cumulative, multi-year achievement in medical school is only partial. These findings are consistent with previous studies from nearby countries that examined the predictive power of high school averages for performance in the clinical or applied phases of medical education. Two studies from Saudi Arabia and one from Bahrain reported that pre-admission variables, including secondary school grades, had limited predictive power for progress test performance in the clinical years and for graduation GPA.^19–21 These findings could be explained by the increase in complexity when students move from the theoretical knowledge emphasized in secondary school to the applied competencies required in clinical training. In addition, success in medical school is reported to be influenced by multiple factors beyond prior academic performance. Among these factors, the literature reports that characteristics related to professional character, such as motivation, resilience, communication skills, and adaptability in clinical education environments, shape the performance of medical students and residents.²²

The current study revealed statistically significant positive correlations between individual clinical subject grades and cumulative GPA, with correlation coefficients ranging from r = 0.628 in Surgery to r = 0.810 in Pediatrics. The strength of these correlations is partly explained by the fact that final-year subject grades collectively account for 25% of the GPA calculation under Mutah University’s credit system. Thus, clinical performance in these subjects carries substantial weight in shaping overall academic standing at graduation. Similarly strong alignment between performance in key clinical subjects and both applied clinical competencies and subsequent academic paths has been reported elsewhere. For instance, studies found that clinical grades in the final year of medical school strongly predicted both final academic ranking and performance on licensing examinations.^23,24 Moreover, a systematic review that included literature published between 1997 and 2015 concluded that final-year OSCE and written exam performance influence long-term academic distinctions and postgraduate placement opportunities.²⁵

In addition, the strength of these associations may also reflect the cumulative nature of medical education, where knowledge and skills developed over earlier years culminate in final-year assessments that are integrative in nature. Moreover, the assessment modalities, which combine OSCEs, clinical evaluations, and written exams, likely offer a more holistic view of student competencies compared to isolated theoretical assessments.

The ANOVA and post hoc findings in this study revealed clear interdepartmental differences in both assessment practices and score variability. Although all departments follow a unified grading structure comprising 40% for the written exam and 60% for the practical evaluation, score dispersion and performance deviations varied significantly between subjects and between cohorts. Obstetrics-Gynecology had the lowest average and a relatively broad standard deviation, indicating a variable performance distribution. Similarly, Pediatrics showed a broad distribution of grades across cohorts, suggesting challenges in ensuring uniform evaluation practices or comparable learning experiences. These challenges are documented in other studies from the UK and Canada, where similar concerns regarding subjectivity and departmental assessment disparities were raised.^26,27 These disparities may be partly explained by departmental culture, supervision practices and models, the availability of opportunities for clinical exposure, and the training of assessors by the faculty.

This study has limitations. First, the cross-sectional analysis limited the ability to track individual student progress over time. Second, the analysis of variation between OSCE and clinical performance assessments may have been constrained by the unavailability of subdivided clinical evaluation data, which led to reliance on aggregated clinical evaluation grades. Finally, the single-institution setting may limit generalizability.

Conclusion

The current study, which used a record-based cross-sectional design, provides context-specific evidence on performance trends and grading variability in clinical education at Mutah University in Jordan over five academic years. The results revealed strong correlations between clinical subject grades and cumulative GPA. They also revealed patterns of interdepartmental variability that suggest potential discrepancies in evaluation and a lack of standardization and unified supervision models. This variability, if left unaddressed, may affect students’ final academic standings, residency placements, and future opportunities.

Moreover, the current study found that the predictive strength of pre-admission scores such as Tawjihi is modest. While such scores remain useful for initial selection into medical school, there is a growing need for more comprehensive and multidimensional admission frameworks that better reflect students’ clinical and professional potential within the Jordanian context.

The regression analysis showed that both written and practical components significantly predicted subject grades, with slightly greater influence from practical assessments. While this may reflect the standard composition of the total score, it highlights the proportional academic weight each component carries and supports the internal validity of the scoring structure. These findings reinforce the importance of maintaining clear assessment rubrics and consistency across departments to ensure fairness and transparency in grade calculation.

At a broader level, the results underscore the importance of initiating policies that promote unified evaluation systems, standardized assessor training and equitable grading practices on a national level across Jordan. These insights may also inform future reforms in admission frameworks and clinical curriculum design to align local standards with international best practice.

Recommendations

Further research is needed to explore additional predictors of clinical success beyond secondary school scores. Future studies may use longitudinal designs, incorporate comprehensive assessment data, and include multiple institutions to enhance the generalizability and depth of findings. The present findings underscore the need for ongoing national dialogue on assessment standardization and fairness across Jordanian medical schools, contributing to broader educational quality reforms.

Acknowledgment

The authors would like to express their sincere appreciation to the Administration and Registration Unit at Mutah University for their official cooperation and assistance in granting access to academic records essential for this study. The authors also acknowledge the valuable contributions of Eng. Amal Al-Majali in organizing, managing, and preparing the data for analysis.

Disclosure

The authors declare no conflicts of interest in this work.

References

1. Mutah University. Instructions for awarding the bachelor’s degree in doctor of medicine (Instruction No. 379) [Arabic]. Al-Karak: Mutah University; 2006.

2. Alexander EK, Osman NY, Walling JL, Mitchell VG. Variation and imprecision of clerkship grading in US medical schools. Acad Med. 2012;87(8):1070–1076. doi:10.1097/ACM.0b013e31825d0a2a

3. Poe W. Predicting clinical performance in medical school: the contribution of academic and non-academic characteristics [master’s thesis]. 2012.

4. Hampton HL, Collins BJ, Perry KG, Meydrech EF, Wiser WL, Morrison JC. Order of rotation in third-year clerkships: influence on academic performance. J Reprod Med. 1996;41(5):337–340.

5. Al Alwan I. Association between scores in high school, aptitude and achievement exams and early performance in health science college. Saudi J Kidney Dis Transpl. 2009;20(3):448–453.

6. Vaughan M, Johnson KA, Bergin CR. Impact of clerkship length and sequence on NBME subject exam performance. Med Sci Educ. 2025;35(1):1313–1322. doi:10.1007/s40670-025-02305-y

7. Khan K, Ramachandran S. Conceptual framework for performance assessment: competency, competence and performance in the context of assessments in healthcare—deciphering the terminology. Med Teach. 2012;34(11):920–928. doi:10.3109/0142159X.2012.722707

8. Hasnain M, Connell KJ, Downing SM, Olthoff A, Yudkowsky R. Toward meaningful evaluation of clinical competence: the role of direct observation in clerkship ratings. Acad Med. 2004;79(Suppl):S21–S24. doi:10.1097/00001888-200410001-00007

9. Almarzouki HS, Al Qurashi M, Al Mansour M, et al. Correlation of undergraduate GPA with grades at Saudi licensing exams: can grades predict future academic performance? Health Prof Educ. 2025;11(2):74–82.

10. Hendi A, Mahfouz MS, Alqassim AY, et al. Admission grades as predictors of medical students’ academic performance: a cross-sectional study from Saudi Arabia. Eur J Investig Health Psychol Educ. 2022;12(7):1572–1580. doi:10.3390/ejihpe12110110

11. Matos Sousa R, Collares CF, Pereira VH. Longitudinal variation of correlations between different components of assessment within a medical school. BMC Med Educ. 2024;24(1):241. doi:10.1186/s12909-024-05822-3

12. Hoy JF, Shuman SL, Smith SR, Kogan M, Simcock XC. Analysis of variability and trends in medical school clerkship grades. Surg Open Sci. 2024;19(2):80–86. doi:10.1016/j.sopen.2024.03.010

13. Sarkar A, Heidelbaugh JJ, Hallbauer G, Appelbaum NP. Imprecise clinical assessments and inaccurate grades: family medicine clerkship director perspectives. Fam Med. 2024;56(6):471–475. doi:10.22454/FamMed.2024.819598

14. Berendonk C, Rogausch A, Gemperli A, Himmel W. Variability and dimensionality of students’ and supervisors’ mini-CEX scores in undergraduate medical clerkships: a multilevel factor analysis. BMC Med Educ. 2018;18(1):100. doi:10.1186/s12909-018-1207-1

15. Chen KT, Baecher-Lind L, Morosky CM, et al. Current practices and perspectives on clerkship grading in obstetrics and gynecology. Am J Obstet Gynecol. 2024;230(1):97.e1–97.e6. doi:10.1016/j.ajog.2023.09.020

16. Shui ML, Armstrong W, Altendahl M, et al. An analysis of obstetrics and gynecology medical student performance evaluation clerkship narratives: insights from the PRIME+ framework. J Grad Med Educ. 2025;17(3):189–195. doi:10.4300/JGME-D-24-00660.1

17. Hernandez C, Daroowalla F, LaRochelle J, et al. Determining grades in the internal medicine clerkship: results of a national survey of clerkship directors. Acad Med. 2020;95(7):1175–1183.

18. Gough HG, Hall WB. The prediction of academic and clinical performance in medical school. Res High Educ. 1975;3(4):301–314. doi:10.1007/BF00991247

19. Dabaliz AA, Kaadan S, Dabbagh MM, et al. Predictive validity of pre-admission assessments on medical student performance. Int J Med Educ. 2017;8:408–413. doi:10.5116/ijme.5a10.04e1

20. Alrukban MO, Munshi FM, Abdulghani HM, Al Hoqail I. The ability of the pre-admission criteria to predict performance in a Saudi medical school. J Family Community Med. 2010;17(2):67–71.

21. Almarabheh A, Ismaeel A, Atwa H, Jaradat A, Jaradat A. Predictive validity of admission criteria in predicting academic performance of medical students: a retrospective cohort study. Front Med. 2022;9:842359. doi:10.3389/fmed.2022.971926

22. Kenny NP, Mann KV, MacLeod H. Role modeling in physicians’ professional formation: reconsidering an essential but untapped educational strategy. Acad Med. 2003;78(12):1203–1210. doi:10.1097/00001888-200312000-00002

23. Gauer JL, Jackson JB. The association between United States medical licensing examination scores and clinical performance in medical students. Adv Med Educ Pract. 2019;10:209–216. doi:10.2147/AMEP.S192011

24. Aharonian K, Sanders M, Schlesinger T, Winter V, Simanton E. Predictive validity of preclerkship performance metrics on USMLE Step 2 CK outcomes in the Step 1 pass/fail era. Adv Med Educ Pract. 2025;16(2):323–330. doi:10.2147/AMEP.S505612

25. Patterson F, Knight A, Dowell J, Nicholson S, Cousans F, Cleland J. How effective are selection methods in medical education? A systematic review. Med Educ. 2016;50(1):36–60. doi:10.1111/medu.12817

26. Govaerts M, Van der Vleuten C, Schuwirth L, Muijtjens A. Broadening perspectives on clinical performance assessment: rethinking the nature of in-training assessment. Adv Health Sci Educ Theory Pract. 2007;12(2):239–260. doi:10.1007/s10459-006-9043-1

27. Gingerich A, Regehr G, Eva KW. Rater-based assessments as social judgments: rethinking the etiology of rater errors. Acad Med. 2011;86(Suppl):S1–S7. doi:10.1097/ACM.0b013e31822a6cf8

Creative Commons License © 2025 The Author(s). This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms and incorporate the Creative Commons Attribution - Non Commercial (unported, 4.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.
