The University of Melbourne
label-field-dataset.zip (785.62 MB)

Field bounding boxes for institutional labels on herbarium specimen sheets

This dataset contains bounding boxes for the text fields on institutional labels on herbarium specimen sheets. The annotations are in YOLO format and use the following twelve classes, listed by class index:

0. genus
1. species
2. year
3. month
4. day
5. family
6. collector
7. authority
8. locality
9. geolocation
10. collector_number
11. infrasp_taxon
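
As a rough illustration of how these class indices might be used, the sketch below maps an index to its name and converts one annotation line to pixel coordinates. It assumes the common YOLO text layout (class index, then box centre x, centre y, width and height, each normalised to the range 0 to 1); the function name and the image size in the usage note are hypothetical and are not part of the dataset.

```python
# Class names in index order, copied from the list above.
CLASS_NAMES = [
    "genus", "species", "year", "month", "day", "family",
    "collector", "authority", "locality", "geolocation",
    "collector_number", "infrasp_taxon",
]


def parse_yolo_line(line, image_width, image_height):
    """Convert one annotation line to (class_name, left, top, right, bottom) in pixels.

    Assumption: the line follows the common YOLO layout of a class index
    followed by normalised centre x, centre y, width and height.
    """
    fields = line.split()
    class_index = int(fields[0])
    x_center, y_center, width, height = (float(value) for value in fields[1:5])

    # Convert the normalised centre/size box to pixel corner coordinates.
    left = (x_center - width / 2) * image_width
    top = (y_center - height / 2) * image_height
    right = (x_center + width / 2) * image_width
    bottom = (y_center + height / 2) * image_height
    return CLASS_NAMES[class_index], left, top, right, bottom
```

For example, `parse_yolo_line("0 0.5 0.5 0.5 0.5", 200, 100)` would describe a genus box centred on a hypothetical 200 by 100 pixel image.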


These classes were annotated on 3,642 images of institutional labels from 10 herbaria. Of these, 2,603 images are from the University of Melbourne Herbarium (MELU) and the remaining 1,039 are from the nine herbaria represented in the benchmark dataset described by Dillen et al. (doi:10.3897/BDJ.7.e31817). The images are organised into subdirectories named with the code of the respective herbarium. The institution corresponding to each code is:

MELU: The University of Melbourne
BR: Meise Botanic Garden
K: Royal Botanic Gardens, Kew
BM: Natural History Museum, London
B: Botanic Garden and Botanical Museum, Berlin
E: Royal Botanic Garden Edinburgh
P: National Museum of Natural History, Paris
TU: Natural History Museum, University of Tartu
L: Naturalis Biodiversity Center
H: Finnish Museum of Natural History LUOMUS, University of Helsinki
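
The code table above could be carried into a script as a simple lookup. The sketch below is only an illustration: it assumes that the herbarium code appears as one directory component of an image path, and the helper name and example path are hypothetical.

```python
from pathlib import PurePosixPath

# Herbarium codes and institutions, copied from the list above.
HERBARIA = {
    "MELU": "The University of Melbourne",
    "BR": "Meise Botanic Garden",
    "K": "Royal Botanic Gardens, Kew",
    "BM": "Natural History Museum, London",
    "B": "Botanic Garden and Botanical Museum, Berlin",
    "E": "Royal Botanic Garden Edinburgh",
    "P": "National Museum of Natural History, Paris",
    "TU": "Natural History Museum, University of Tartu",
    "L": "Naturalis Biodiversity Center",
    "H": "Finnish Museum of Natural History LUOMUS, University of Helsinki",
}


def herbarium_code(image_path):
    """Return the first path component that is a known herbarium code, else None.

    Assumption: paths use forward slashes and the code is a whole
    directory name, for example "images/MELU/example.jpg".
    """
    for part in PurePosixPath(image_path).parts:
        if part in HERBARIA:
            return part
    return None
```

With this lookup, `HERBARIA[herbarium_code("images/MELU/example.jpg")]` would give the institution name for a hypothetical MELU image path.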


The images were split into 2,887 training images and 755 validation images, which are listed in train.txt and valid.txt respectively.
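
A quick sanity check of the split might look like the sketch below. It assumes that train.txt and valid.txt sit in the top-level directory of the extracted archive and hold one image path per line; neither detail is stated in this description, and the function names are hypothetical.

```python
from pathlib import Path

# Image counts stated in the description; together they cover all 3,642 images.
EXPECTED_COUNTS = {"train.txt": 2887, "valid.txt": 755}


def read_split(list_file):
    """Return the non-empty lines of a split file.

    Assumption: the file holds one image path per line.
    """
    text = Path(list_file).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def check_splits(dataset_dir):
    """Compare the number of entries in each split file with the stated counts.

    Returns a dict mapping file name to (found, expected, matches).
    """
    report = {}
    for name, expected in EXPECTED_COUNTS.items():
        found = len(read_split(Path(dataset_dir) / name))
        report[name] = (found, expected, found == expected)
    return report
```

Calling `check_splits` on the directory where the archive was extracted would report whether each list has the stated number of entries.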

For more information, see https://github.com/rbturnbull/hespi
