What question did this study set out to answer?

This study aims to validate and compare the performance of four prediction scores for mortality in hospitalized patients with COVID-19 pneumonia.

May 16, 2026 · Open Access

Comparison of the performance of four clinical prediction rules for mortality in patients with COVID-19

Key Points

Retrospective cohort study conducted on 1,074 hospitalized COVID-19 pneumonia patients with complete data; secondary analysis of prior data (March-December 2020).
Performance of four mortality prediction scores (ISARIC-4C, CALL, SEIMC, q-CSI) assessed using sensitivity, specificity, and AUROC.
Calibration analysis performed to confirm the reliability of scores despite noted lack of fit in Hosmer-Lemeshow test.
q-CSI score demonstrated the best discrimination for mortality with AUROC of 0.85 (95% CI: 0.83–0.87), outperforming ISARIC-4C (AUROC 0.82), SEIMC (0.78), and CALL (0.69) with p < 0.0001.
Direct comparison indicated that q-CSI performed significantly better than ISARIC-4C (p = 0.0016).
Calibration metrics confirmed models had good scaling and centering despite some lack of fit indicated by HL test.
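
The discrimination metrics named above (sensitivity, specificity, AUROC) can be illustrated with a minimal sketch. It is not the study's code: the function names, the toy scores, and the cutoff of 6 are hypothetical, and AUROC is computed with the standard pairwise (Mann-Whitney) formulation rather than whatever software the authors used.

```python
def auroc(scores, died):
    """AUROC as the probability that a randomly chosen non-survivor has a
    higher score than a randomly chosen survivor (ties count as 0.5)."""
    pos = [s for s, d in zip(scores, died) if d == 1]  # non-survivors
    neg = [s for s, d in zip(scores, died) if d == 0]  # survivors
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(scores, died, cutoff):
    """Classify score >= cutoff as 'predicted death' and compare with outcome."""
    tp = sum(1 for s, d in zip(scores, died) if d == 1 and s >= cutoff)
    fn = sum(1 for s, d in zip(scores, died) if d == 1 and s < cutoff)
    tn = sum(1 for s, d in zip(scores, died) if d == 0 and s < cutoff)
    fp = sum(1 for s, d in zip(scores, died) if d == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: eight patients, risk score and in-hospital death (1/0).
toy_scores = [2, 5, 7, 9, 3, 8, 1, 6]
toy_died = [0, 1, 1, 1, 0, 0, 0, 1]

toy_auc = auroc(toy_scores, toy_died)
toy_sens, toy_spec = sensitivity_specificity(toy_scores, toy_died, cutoff=6)
print(toy_auc, toy_sens, toy_spec)
```

On this toy data, 13 of the 16 non-survivor/survivor pairs are ranked correctly, which gives an AUROC of 0.8125; the values reported in the study come from its own cohort, not from this example.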

Abstract

Background: Clinical prediction rules integrate clinical and laboratory variables to estimate outcomes, facilitating decision-making and optimizing resources, especially in high-demand settings. We aimed to validate and compare the performance of four mortality prediction scores (ISARIC-4C, CALL, SEIMC, q-CSI) in a Peruvian cohort of unvaccinated hospitalized COVID-19 pneumonia patients during the initial pandemic wave.

Methods: We performed a retrospective cohort study based on a secondary analysis of data from a previous study (March-December 2020). To ensure a robust and standardized head-to-head comparison, we used a complete-case analysis (n = 1,074). Selection bias was rigorously assessed by comparing the analytic sample with excluded patients. Each score's performance was evaluated using sensitivity, specificity, predictive values, likelihood ratios, area under the receiver operating characteristic curve (AUROC), and robust calibration metrics, including the calibration intercept (α) and the calibration slope (β). The ISARIC-4C score was used as an international reference standard for benchmarking.

Results: Among 3,074 hospitalized patients, 1,074 had complete data for all four scores; no clinically significant differences were found between this group and excluded participants, indicating a representative sample. The cohort was mainly male (67.9%) with a median age of 59 years. The q-CSI score showed the best discrimination (AUROC 0.85, 95% CI: 0.83–0.87), significantly better than ISARIC-4C (0.82, 95% CI: 0.80–0.85), SEIMC (0.78, 95% CI: 0.75–0.81), and CALL (0.69, 95% CI: 0.66–0.72) (p 0.05), supporting their clinical reliability.

Conclusions: In this Peruvian cohort, the q-CSI score exhibited the best predictive performance and highest feasibility for in-hospital mortality among patients with COVID-19 pneumonia. While the HL test indicated a lack of fit, the analysis of the calibration α and β confirmed that the models are globally well-calibrated, supporting their utility for risk stratification. However, local adjustment is still necessary prior to clinical use in our setting. These findings provide a valuable baseline for resource optimization in resource-limited settings during pandemic waves.
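
The abstract reports a calibration intercept (α) and slope (β) but does not say how they were computed. A common approach, assumed here, is to regress the observed outcome on the logit of the predicted risk: β is the coefficient from a logistic model, and α is the intercept of a model whose slope is fixed at 1. Ideal calibration corresponds to α = 0 and β = 1. The sketch below uses a plain Newton-Raphson fit; the function names and data are hypothetical.

```python
import math


def logit(p):
    return math.log(p / (1 - p))


def expit(z):
    return 1 / (1 + math.exp(-z))


def calibration_slope(pred, y, iters=50):
    """Fit y ~ a + b * logit(pred) by Newton-Raphson and return (a, b).
    b is the calibration slope (beta)."""
    x = [logit(p) for p in pred]
    a, b = 0.0, 0.0
    for _ in range(iters):
        mu = [expit(a + b * xi) for xi in x]
        w = [m * (1 - m) for m in mu]
        # Gradient of the log-likelihood.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for yi, mi, xi in zip(y, mu, x))
        # Entries of the 2x2 information matrix.
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < 1e-12:
            break
    return a, b


def calibration_intercept(pred, y, iters=50):
    """Fit y ~ a + 1 * logit(pred) (slope fixed at 1) and return a (alpha)."""
    x = [logit(p) for p in pred]
    a = 0.0
    for _ in range(iters):
        mu = [expit(a + xi) for xi in x]
        step = sum(yi - mi for yi, mi in zip(y, mu)) / sum(m * (1 - m) for m in mu)
        a += step
        if abs(step) < 1e-12:
            break
    return a


# Hypothetical, perfectly calibrated data: in each group of 10 patients the
# observed death rate equals the predicted risk (0.2, 0.5, 0.8).
pred = [0.2] * 10 + [0.5] * 10 + [0.8] * 10
died = [1] * 2 + [0] * 8 + [1] * 5 + [0] * 5 + [1] * 8 + [0] * 2

alpha = calibration_intercept(pred, died)
_, beta = calibration_slope(pred, died)
print(alpha, beta)  # alpha close to 0 and beta close to 1 for this toy data
```

A slope below 1 would suggest predictions that are too extreme, and a nonzero intercept would suggest systematic over- or underestimation of risk. Both can look acceptable while a Hosmer-Lemeshow test still rejects fit in a large sample, which is consistent with the pattern the abstract describes.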

Cite This Study

Azañero-Haro et al. studied this question.

synapsesocial.com/papers/6a080a71a487c87a6a40c6dd https://doi.org/10.1371/journal.pone.0348683
