Digital twins in dairy systems require reliable behavioural inputs. We develop a video-based framework that detects and tracks individual cows and classifies seven behaviours under commercial barn conditions. From 4964 annotated clips, expanded to 9600 through targeted augmentation, we couple YOLOv11 detection with ByteTrack for identity persistence and evaluate SlowFast versus TimeSformer for behaviour recognition. TimeSformer achieved 85.0% overall accuracy (macro-F1 0.84) and real-time throughput of 22.6 fps on NVIDIA L4 hardware. Attention visualizations concentrated on anatomically relevant regions (head and muzzle for feeding and drinking; torso and limbs for postures), supporting biological interpretability. Structured outputs (cow ID, start–end times, durations, and confidence) enable downstream use in nutritional modelling and integration with 3D digital-twin visualization environments, establishing a robust behavioural perception and state-estimation component within a dairy digital-twin architecture. The pipeline delivers continuous, per-animal activity streams suitable for individualized nutrition, predictive health, and automated management, providing a practical foundation for scalable dairy digital twins.
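The structured outputs described above (cow ID, start and end times, durations, and confidence) can be illustrated with a minimal sketch that merges consecutive per-clip predictions for the same tracked cow into behaviour bouts. This is not the authors' implementation: the `Bout` record, the `merge_into_bouts` function, the `max_gap_s` tolerance, the input tuple layout, and the use of mean confidence are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

# Hypothetical per-clip prediction: (cow_id, start_s, end_s, behaviour, confidence)
ClipPrediction = Tuple[int, float, float, str, float]


@dataclass
class Bout:
    """One continuous episode of a single behaviour for one tracked cow."""
    cow_id: int
    behaviour: str
    start_s: float
    end_s: float
    confidence: float  # running mean of the merged clips' confidences

    @property
    def duration_s(self) -> float:
        return self.end_s - self.start_s


def merge_into_bouts(preds: Iterable[ClipPrediction],
                     max_gap_s: float = 0.0) -> List[Bout]:
    """Merge adjacent clips with the same cow and behaviour into bouts.

    max_gap_s is an assumed tolerance for small gaps between clips.
    """
    bouts: List[Bout] = []
    counts: List[int] = []  # number of clips merged into each bout
    # Sort by cow, then by start time, so adjacent clips are neighbours.
    for cow_id, start_s, end_s, behaviour, conf in sorted(
            preds, key=lambda p: (p[0], p[1])):
        last = bouts[-1] if bouts else None
        if (last is not None
                and last.cow_id == cow_id
                and last.behaviour == behaviour
                and start_s - last.end_s <= max_gap_s):
            # Extend the open bout and update its mean confidence.
            n = counts[-1]
            last.confidence = (last.confidence * n + conf) / (n + 1)
            last.end_s = max(last.end_s, end_s)
            counts[-1] = n + 1
        else:
            bouts.append(Bout(cow_id, behaviour, start_s, end_s, conf))
            counts.append(1)
    return bouts


# Illustrative input only; these values are not data from the study.
example = [
    (1, 0.0, 2.0, "feeding", 0.9),
    (1, 2.0, 4.0, "feeding", 0.7),
    (1, 4.0, 6.0, "drinking", 0.8),
    (2, 0.0, 2.0, "drinking", 0.6),
]
bouts = merge_into_bouts(example)
for b in bouts:
    print(b.cow_id, b.behaviour, b.start_s, b.end_s, b.duration_s)
```

Records in this shape could feed a downstream nutritional model or a 3D visualization layer directly, since each row carries the identity, timing, and confidence fields the abstract lists.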
Rao et al. studied this question.
www.synapsesocial.com/papers/69c7724e8bbfbc51511e2a67 — DOI: https://doi.org/10.1038/s44433-026-00004-x
Shreya Rao
Eduardo Rosa Medina Garcia
Suresh Neethirajan
Dalhousie University