The present study investigated the extent to which human perception of surface materials and their physical properties depends on low-level and high-level image statistics. Using 200 images of real-world surfaces from 10 material categories, along with their corresponding Portilla-Simoncelli synthesized images, which preserve low-level statistics, and style-synthesized images, which preserve feature statistics in a convolutional neural network (CNN), we examined whether material categorization performance, ratings of surface properties, and material decoding accuracy from visual evoked potentials (VEPs) differed between the synthesized and original images. We found that the data were remarkably similar between the original and style-synthesized images, compared with previous results for natural objects and scenes. Neural style information exhibited error patterns similar to those in behavioral categorization performance and showed a high correlation with VEPs at specific latencies. The results support the idea that material perception in the real world is largely determined by statistical features.
Oshima et al. studied this question.