Background/Objective: Forensic document examination (FDE) traditionally relies on subjective expert opinion. This preliminary study was designed to develop and validate a hybrid deep learning model (ResNet-50 + bidirectional long short-term memory [BiLSTM]) for quantitative handwriting and signature feature analysis, and to compare its performance, under standardized experimental conditions, with that of three certified forensic document examiners.

Methods: Handwriting and signature samples were collected from 225 individuals in a standardized setting. Fifteen quantitative handwriting features were extracted. The dataset was split into training (70%, n = 158) and testing (30%, n = 67) subsets using stratified random sampling. Ground truth for the analytic categories was defined by majority consensus among the three examiners, with adjudicated review for disagreements. The model used a hybrid architecture combining a ResNet-50 backbone and a bidirectional LSTM encoder.

Results: The model achieved 93.4% accuracy, an F1-score of 0.926, and an AUC-ROC of 0.968 on the held-out test set. Under our task-specific experimental conditions, the model performed better than examiners on slant analysis (96.8% vs. 93.2%, p = 0.002), pressure profiling (94.1% vs. 91.7%, p = 0.019), and age estimation (87.4% vs. 82.1%, p = 0.011); examiners performed better on forgery detection (95.8% vs. 91.2%, p = 0.008) and signature verification (96.1% vs. 92.3%, p < 0.012). Mean processing time was reduced by 99.6% (0.8 s vs. 197 s per case).

Conclusions: Within the limits of this preliminary single-centre study, the system showed performance comparable to certified examiners on several quantitative tasks and complementary strengths overall, supporting its feasibility as an adjunctive tool in a hybrid human–AI workflow. Broader, multi-centre validation and explainability work are required before any forensic deployment can be considered.
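The data-handling steps named in the Methods (majority-consensus labelling, a stratified 70/30 split) and the reported time reduction can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the class labels, and the per-class rounding rule are all hypothetical, and the abstract does not state which variable the split was stratified on. Because the sketch rounds within each class, its subset sizes need not match the reported 158/67.

```python
import random
from collections import Counter, defaultdict


def majority_label(votes):
    """Return the label backed by a strict majority of examiners.

    Returns None when no label has a strict majority; in the study such
    cases went to adjudicated review (not modelled here).
    """
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None


def stratified_split(labels, train_pct=70, seed=0):
    """Split sample indices into train/test lists, class by class.

    `train_pct` is an integer percentage; integer arithmetic avoids
    floating-point rounding surprises. The rounding rule is an assumption.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)

    train, test = [], []
    for lab in sorted(by_class):
        idxs = by_class[lab]
        rng.shuffle(idxs)
        cut = (len(idxs) * train_pct + 50) // 100  # round half up
        train.extend(idxs[:cut])
        test.extend(idxs[cut:])
    return train, test


def relative_reduction(baseline, new):
    """Fractional reduction of `new` relative to `baseline`."""
    return (baseline - new) / baseline


# Hypothetical labels for 225 writers in three equally sized classes.
labels = ["a"] * 75 + ["b"] * 75 + ["c"] * 75
train_idx, test_idx = stratified_split(labels)

# Processing times from the abstract: 197 s (examiners) vs. 0.8 s (model).
print(f"time reduction: {relative_reduction(197, 0.8):.1%}")
```

With the figures from the abstract, `relative_reduction(197, 0.8)` evaluates to roughly 0.996, consistent with the reported 99.6%.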
Can et al. (Wed,) studied this question.