Abstract
Drones are a valuable tool for surveying birds. However, surveys are hampered by the cost of manually detecting birds in the resulting images. Researchers are using computer vision to automate this process, but efforts to date generally target a narrow context, such as a single habitat, and do not identify key attributes such as species. To address this, we collected a diverse dataset of drone‐based bird images from existing studies and our own fieldwork. We labelled the birds in these images, detailing their location, species, posture (resting, flying, or other), age (chick, juvenile, or adult), and sex (male, female, or monomorphic). To demonstrate the usefulness of this dataset, we trained a computer vision model for bird detection and identification, compared its performance with manual methods, and identified the main predictors of performance. Thirty‐three researchers contributed 23 865 images, captured using 21 different cameras across 11 countries and all 7 continents. We labelled 4824 of these images, containing 49 990 birds from 101 species. Our model processed images 85 times faster than manual processing and achieved a mean average precision (mAP) of 0.91 ± 0.25 for detection and 0.65 ± 0.33 for classification of species, age, and sex. Performance was predicted by the similarity between test and training images (Estimate = 1.3248, P = 0.00021), the number of similar classes (Estimate = −0.0742, P = 0.0033), the number of training instances (Estimate = 0.0034, P = 0.1019), and the number of pixels on the bird (Estimate = 0.0002, P = 0.0462). Our drone‐based bird dataset is the most accurately labelled and the most biologically, environmentally, and digitally diverse to date, laying the foundation for future research. We provide both the dataset and the trained model open access, and we urge researchers to continue working together to assemble datasets that cover broad contexts and are labelled with key conservation metrics.
Wilson et al. studied this question.