Soil texture is one of the key physical properties of soil. Traditional laboratory methods determine soil texture with high accuracy, but they are time-consuming and costly. To acquire soil texture information rapidly and accurately, this study proposes RVFM, a hybrid deep learning model for soil texture detection from microscopic images. The model combines a CNN branch that extracts multi-dimensional texture features with a Transformer branch that captures global positional information; the two branches are fused by a cross-attention module. This architecture captures microscopic distribution characteristics in order to estimate soil composition proportions. Experimental results show coefficients of determination (R²) of 0.971, 0.954, and 0.931 for sand, silt, and clay, respectively, with corresponding root mean square errors (RMSE) of 3.789, 2.842, and 2.780. These results outperform those of other classical network models, and the model also fits better in generalisation tests, which indicates practical value.
Pan et al. (Thu,) studied this question.
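
The two metrics reported in the abstract, R² and RMSE, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names and the sample values below are hypothetical, and the sketch assumes that each soil fraction (sand, silt, clay) is scored separately as a percentage, which the abstract does not state explicitly.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    n = len(y_true)
    squared_errors = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared_errors / n)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical sand percentages for four samples (illustrative only,
# not data from the study).
sand_true = [62.0, 35.0, 48.0, 20.0]
sand_pred = [60.0, 37.0, 47.0, 22.0]

print(f"RMSE: {rmse(sand_true, sand_pred):.3f}")
print(f"R2:   {r2_score(sand_true, sand_pred):.3f}")
```

An R² close to 1 means the predictions explain most of the variance in the measured values, while RMSE expresses the typical prediction error in the same unit as the target. The two are reported together because a high R² alone does not reveal the absolute size of the errors.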