Danish Fungi 2020 (DF20) is a fine-grained dataset and benchmark. The dataset, constructed from observations submitted to the Danish Fungal Atlas, is unique in its taxonomy-accurate class labels, small number of errors, highly unbalanced long-tailed class distribution, rich observation metadata, and well-defined class hierarchy. DF20 has zero overlap with ImageNet, allowing unbiased comparison of models fine-tuned from publicly available ImageNet checkpoints.
The dataset has 1,604 different classes, with 248,466 training images and 27,608 test images.
The dataset with 409,679 images belonging to 772 snake species from 188 countries and all continents (386,006 images with labels targeted for development and 23,673 images without labels for testing). In addition, we provide a simple train/val (90% / 10%) split to validate preliminary results while ensuring the same species distributions. Furthermore, we prepared a compact subset (70,208 images) for fast prototyping. The test set data consists of 23,673 images submitted to the iNaturalist platform within the "first four months of 2021.
All data were gathered from online biodiversity platforms (i.e., iNaturalist, HerpMapper) and further extended by data scraped from Flickr. The provided dataset has a heavy long-tailed class distribution, where the most frequent species (Thamnophis sirtalis) is represented by 22,163 images and the least frequent by just 10 (Achalinus formosanus).