To the Editor,

We read with interest the randomized trial by Zhang et al1 entitled “Effect of remimazolam on postoperative delirium in elderly patients undergoing major abdominal surgery: a randomized controlled trial.” The use of blinded assessors with twice-daily validated assessments and the reporting of both intention-to-treat (ITT) and per-protocol (PP) analyses are notable strengths. Postoperative delirium (POD) incidence did not differ significantly between groups (PP: 17.1% vs 19.7%; odds ratio [OR] 1.189, 95% confidence interval [CI] 0.688–2.057), and intraoperative hypotension appeared less frequent with remimazolam. We seek clarification on several points to aid interpretation, particularly given the variability of POD incidence across cohorts (10% reported in elective major abdominal surgery)2.

First, although the trial was designed as a non-inferiority study (margin 5%, one-sided α = 0.025), the results are largely presented in terms of “no significant difference” based on P-values. For non-inferiority trials, interpretation hinges on the prespecified estimand and its CI relative to the margin. We suggest explicitly reporting the between-group risk difference with its one-sided 97.5% confidence bound (equivalently, the upper limit of the two-sided 95% CI) in both the ITT and PP populations, and stating clearly whether that upper bound lies within the 5% margin. A simple figure showing the point estimate and confidence interval against the margin would further aid readers.

Second, in the ITT population, baseline imbalances were reported for education level and diabetes, both plausibly related to cognitive reserve and POD risk. Although subsequent analyses identified education, American Society of Anesthesiologists (ASA) grade, age, and Mini-Mental State Examination (MMSE) score as prognostic factors, it would be helpful to assess whether these imbalances materially affect the treatment estimate.
We therefore suggest a covariate-adjusted sensitivity analysis (e.g., regression including education, diabetes, age, ASA grade, and MMSE), with the adjusted effect reported alongside the primary analysis, particularly given an overall POD incidence of around 20%. This request is supported by cohort evidence linking education and perioperative vulnerability to POD risk2–4 (college education: adjusted OR 0.35, 95% CI 0.13–0.91; ASA ≥3: OR 2.0, 95% CI 1.0–3.9).

Third, the prespecified subgroup finding in patients aged ≥75 years is notable, with a significant interaction term (P for interaction <0.001) suggesting a higher POD risk with remimazolam. To appreciate the clinical relevance of this signal, readers would benefit from seeing the absolute POD rates and the corresponding absolute risk difference within this subgroup. Because subgroup analyses are inherently sensitive to event counts and multiple testing, cautious interpretation is warranted. Age thresholds show strong associations with POD risk in prospective cohorts; for example, age ≥75 years independently increased POD risk (adjusted risk ratio [RR] 2.54, 95% CI 1.11–5.80) in a noncardiac surgery cohort5. Clarifying the interaction in this high-risk group is crucial for safe clinical application.

Fourth, anesthetic depth was guided by bispectral index (BIS) targets (40–60). The manuscript appropriately acknowledges the uncertainty regarding BIS interpretation under remimazolam. If BIS values do not correspond equivalently to hypnotic depth across different agents, maintaining “similar BIS” may not ensure comparable cortical suppression. This has important implications for both delirium risk assessment and the interpretation of hemodynamic outcomes. Data on total hypnotic drug doses or alternative depth-of-anesthesia metrics, if available, could provide valuable context.
Because a given BIS value may reflect a different hypnotic state under remimazolam than under propofol, the two groups may have differed in actual anesthetic depth, which could confound the comparison of delirium risk.

Finally, the handling of missing data through last observation carried forward (LOCF) and hot-deck imputation in the ITT analyses, along with the exclusion of a notable proportion of patients (22/370) from the PP analyses because of additional medications, raises questions about the robustness of the findings. It would be valuable to discuss how sensitive the primary conclusion is to these assumptions, and how the excluded co-interventions relate to real-world practice. Additionally, while POD assessment through postoperative day 5 captures most early cases, later-onset delirium associated with complications, ICU transfer, or sleep disruption might be missed, particularly with earlier discharge. This is a recognized limitation of many POD studies. Moreover, hemodynamic “variability” (not merely the presence of hypotension) has been linked to POD in cohort evidence; for instance, mean arterial pressure (MAP) variability was associated with POD severity (MAP variance OR 1.038 per 10 units, 95% CI 1.013–1.065) in a prospective cohort6.

This trial offers valuable prospective evidence on the safety profile of remimazolam in elderly surgical populations. Addressing the points above would clarify the interpretation of the current data and better guide the design of future trials and the optimization of anesthetic strategies for this novel agent in vulnerable geriatric patients.

Ethical approval
Not applicable.

Consent
Not applicable.