Enabling nearshore cross-modal video object detector to learn more accurate spatial and temporal information