This exploratory study examines the perception and production of the aspiration contrast in the Mandarin voiceless retroflex affricates zh [tʂ] and ch [tʂʰ] by ten adult Spanish speakers (three Peruvian, seven Chilean) at Nanjing University. Participants completed a perception identification task and a production reading task using the same set of 128 syllables. Voice Onset Time (VOT) measurements from the production task were converted to binary classifications for cross-modality comparison. Perception accuracy was moderately high (zh [tʂ]: 84.43%; ch [tʂʰ]: 82.39%), whilst production accuracy, measured as the share of tokens falling within native VOT ranges, was substantially lower (zh [tʂ]: 32.61%; ch [tʂʰ]: 19.15%). Participants maintained the aspiration contrast (zh [tʂ] = 58 ms, ch [tʂʰ] = 125 ms) but consistently underproduced VOT compared to native speakers (zh [tʂ] = 67 ms, ch [tʂʰ] = 164 ms). Perception patterns align with Category Goodness (CG) assimilation within PAM-L2: both Mandarin sounds map to Spanish [tʃ] but with different goodness-of-fit, enabling moderate discrimination. Production follows SLM-r predictions, with learners developing a Composite L1–L2 Category that maintains the aspiration contrast but fails to establish new phonetic categories. The small sample size (n = 10) precluded robust statistical testing of individual differences. The perception–production asymmetry supports independent modality development in L2 phonetic acquisition.
Roque et al. studied this question.
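The conversion of VOT measurements into binary classifications can be illustrated with a minimal sketch. It assumes that a token counts as accurate when its VOT falls inside a native range for the target affricate. The abstract reports only native mean values (67 ms and 164 ms), so the range bounds below are hypothetical placeholders, and the names `NATIVE_RANGE_MS`, `in_native_range`, and `production_accuracy` are illustrative, not the authors' procedure.

```python
# Hypothetical native VOT ranges in milliseconds (placeholders only;
# the abstract gives native means, not range bounds).
NATIVE_RANGE_MS = {
    "zh": (50.0, 90.0),    # unaspirated retroflex affricate
    "ch": (130.0, 200.0),  # aspirated retroflex affricate
}


def in_native_range(target, vot_ms, ranges=NATIVE_RANGE_MS):
    """Return True if a token's VOT lies within the assumed native range."""
    low, high = ranges[target]
    return low <= vot_ms <= high


def production_accuracy(tokens, ranges=NATIVE_RANGE_MS):
    """Compute the percentage of in-range tokens per target.

    tokens: iterable of (target, vot_ms) pairs, e.g. ("zh", 58.0).
    """
    hits = {}
    totals = {}
    for target, vot_ms in tokens:
        totals[target] = totals.get(target, 0) + 1
        if in_native_range(target, vot_ms, ranges):
            hits[target] = hits.get(target, 0) + 1
    return {t: 100.0 * hits.get(t, 0) / totals[t] for t in totals}


# Made-up example tokens, not data from the study.
example_tokens = [("zh", 58.0), ("zh", 40.0), ("ch", 125.0), ("ch", 150.0)]
accuracy = production_accuracy(example_tokens)
```

Scoring production as a yes/no outcome per token puts it on the same scale as the correct/incorrect responses of the identification task, which is what allows the two accuracy percentages to be compared directly.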