What question did this study set out to answer?

This study aims to compare developmental scores between the Bayley-4 and Bayley-III assessments in very preterm children.

May 8, 2026

The Bayley-4 Versus the Bayley-III in Very Preterm Children at 24 Months’ Corrected Age

Key Points

This study aims to compare developmental scores between the Bayley-4 and Bayley-III assessments in very preterm children.
Data from the SuPreme Study, a prospective observational cohort of 1601 very preterm infants, were analyzed.
Assessment took place at 24 months’ corrected age using either the Bayley-III or the Bayley-4.
A subgroup of 1034 participants provided Bayley scores for comparison.
Bayley-4 scores were found to be significantly lower than Bayley-III scores with mean differences: cognitive −10.3 (95% CI [−11.9 to −8.8]; P < .001), language −5.9 (95% CI [−7.6 to −4.1]; P < .001), motor −3.7 (95% CI [−5.2 to −2.2]; P < .001).
Cognitive scores more than 2 standard deviations below the mean occurred in 12.8% of children in the Bayley-4 group, compared with 4% in the Bayley-III group.
The adjusted relative risk ratio for identifying moderate/severe cognitive delay with the Bayley-4, compared with the Bayley-III, was 4.0.
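
The arithmetic behind the last two points can be sketched as follows. This is an illustration only, not the study's analysis: it assumes the conventional scaling of Bayley composite scores (mean 100, standard deviation 15), which is not stated in the text above, and the function names are hypothetical. The crude ratio it computes from the two reported percentages is unadjusted, so it is not expected to match the adjusted relative risk ratio of 4.0, which comes from a model whose covariates are not described here.

```python
# Illustrative arithmetic only; not the study's statistical model.
# Assumption (not stated in the source): composite scores are normed
# to a mean of 100 with a standard deviation of 15.
NORM_MEAN = 100
NORM_SD = 15


def cutoff_score(sd_below: float) -> float:
    """Score lying `sd_below` standard deviations under the normative mean."""
    return NORM_MEAN - sd_below * NORM_SD


def crude_risk_ratio(p_group: float, p_reference: float) -> float:
    """Unadjusted ratio of two proportions (group vs reference)."""
    return p_group / p_reference


# Percentages reported in the key points above.
p_bayley4 = 0.128    # cognitive score more than 2 SD below the mean, Bayley-4 group
p_bayley3 = 0.04     # same criterion, Bayley-III group

threshold = cutoff_score(2)
crude_rr = crude_risk_ratio(p_bayley4, p_bayley3)

print(f"2 SD cutoff under the assumed norms: {threshold}")
print(f"Crude (unadjusted) risk ratio: {crude_rr:.1f}")
```

Under these assumptions the cutoff is a score of 70, and the crude ratio is about 3.2, somewhat below the adjusted figure reported by the study.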

Abstract

OBJECTIVES The Bayley is widely used for developmental assessment of young children. Reports suggest that the Bayley-III overestimates scores and underidentifies developmental delay. Hypothesis: Using preterm cohort data, Bayley-4 scores will be lower than Bayley-III scores, with higher rates of developmental delay.
METHODS Data are from the SuPreme Study, a prospective observational cohort of 1601 very preterm infants born 2013–2020. Follow-up included assessment at 24 months’ corrected age using the Bayley edition in routine use at the time, Bayley-III or Bayley-4 (Bayley-4A&NZ).
RESULTS A subgroup of 1034 participants provided Bayley scores for comparison. Bayley-4A&NZ scores were lower than Bayley-III scores, with mean differences: cognitive −10.3 (95% CI −11.9 to −8.8; P < .001), language −5.9 (95% CI −7.6 to −4.1; P < .001), motor −3.7 (95% CI −5.2 to −2.2; P < .001). Cognitive scores more than 2 standard deviations below the mean occurred in 12.8% of the Bayley-4A&NZ group vs 4% of the Bayley-III group.
CONCLUSIONS Bayley-4A&NZ standard scores are lower than Bayley-III scores. Adjusted mean differences between scores are large. The findings apply across a broad range of developmental functioning. The clinical corollary is that a higher proportion of children are identified with developmental delay using the Bayley-4, particularly moderate/severe cognitive delay (adjusted relative risk ratio 4.0). More children will meet access criteria for early support services. A further implication is that research and audit data collected with different editions cannot be directly compared. Studies of other Edition-4 versions are warranted.
