Abstract
Efforts to expedite the accurate identification of environmental plastics pollution focus strongly on machine learning (ML) techniques. We have published two Raman spectroscopy datasets to support the development of next-generation ML methods. The Raman spectra for plastics identification (RaSPI) dataset presents 402 high-quality Raman spectra with <1 cm⁻¹ resolution between 100 and 4000 cm⁻¹. RaSPI spans 14 plastic types and includes variability in (unknown) additives. The Raman maps for plastics identification (RaMPI) dataset contains 34 two-dimensional spectroscopic maps comprising 33,119 spectra. RaMPI spectra offer <1 cm⁻¹ resolution across the fingerprint region, with significant variability in signal-to-noise ratios that is useful for methodology testing and validation. Both datasets contain data from pristine samples and from environmental pollution. Every spectrum in both datasets has been manually assigned to one of 14 plastic classifications. The consistency and quality of these datasets make them high-value resources for researchers working on diverse topics, including those training ML models for microplastics research, developing spectroscopic processing algorithms, or seeking datasets against which to test their methodologies.
Hogan et al. (Mon,) studied this question.