So we’ve had some great success detecting the insects in our raw images.
Now we have thousands and thousands of photos of individual insects, and the ability to create thousands upon thousands more! So the next step is classifying all these critters!
We have a basic game plan. We can use something like BioCLIP and train it on the Panamanian insects we are seeing.
The problem is that before we can train the insect classifier, we need to create a LOT of ground-truth data. We have hired taxonomists who are willing to do the long, arduous work of looking through all those images and labeling them as taxonomically accurately as they can.
So for each image of each insect, we need to be able to rapidly give it a label, and I’ve been looking for a nice, user-friendly way to produce labels that BioCLIP can use. I’ve seen many different labeling applications out there, but nothing that lets you label hierarchically. And this is the key. We don’t care as much about ID-ing specific species, but we need to label things within a taxonomic hierarchy as precisely as possible (which is how BioCLIP works too). And we need to be able to do this in a quick, user-friendly way. The tool should come with all the possible taxonomic labels from something like Tree-of-life already loaded up.
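To make "label hierarchically" a bit more concrete, here is a minimal sketch of what a single label could look like as data. Everything in it is my own assumption for illustration: the `RANKS` list, the `TaxonLabel` class, and its method names are hypothetical and do not come from BioCLIP or any existing labeling tool. The idea is just that a label is a path down the tree that stops wherever the annotator runs out of confidence.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Ranks from coarsest to finest. This particular list is an assumption.
RANKS = ("kingdom", "phylum", "class", "order", "family", "genus", "species")


@dataclass(frozen=True)
class TaxonLabel:
    """A partial path down the taxonomy, one name per rank, coarsest first."""

    names: Tuple[str, ...] = ()

    def __post_init__(self) -> None:
        if len(self.names) > len(RANKS):
            raise ValueError("more names than ranks")

    @property
    def deepest_rank(self) -> Optional[str]:
        """The finest rank that has been filled in, or None if unlabeled."""
        return RANKS[len(self.names) - 1] if self.names else None

    def as_dict(self) -> Dict[str, str]:
        """Map each filled-in rank to its name."""
        return dict(zip(RANKS, self.names))


# A beetle that could only be identified down to order.
beetle = TaxonLabel(("Animalia", "Arthropoda", "Insecta", "Coleoptera"))
print(beetle.deepest_rank)
print(beetle.as_dict())
```

Storing the whole path, rather than only the deepest name, means an image labeled just to order is still a perfectly valid label at every rank above it.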
I imagine an interface where you load a folder and it groups the images by guessed similarity (as you can see, many pics will be nearly identical, since they are multiple photos of the same insect that hadn’t moved much). It then pops up an individual pic, and you have macros that let you quickly choose the class, order, family, etc… as far down as you can, or give it a non-taxonomic label (smudge, dirt, wrong detection). It would probably also keep a list of the most recent classifications, so you could rapidly reuse an ID that pops up over and over.
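The "group by guessed similarity" step could be sketched roughly like this. I am assuming each image has already been reduced to an integer perceptual hash by some other step (not shown here), so that near-identical photos end up with hashes that differ in only a few bits. The function names, the example filenames, and the `max_distance` threshold are all made up for illustration, and the greedy grouping is just one simple option.

```python
from typing import Dict, List


def hamming(a: int, b: int) -> int:
    """Number of bit positions where the two hashes differ."""
    return bin(a ^ b).count("1")


def group_by_hash(hashes: Dict[str, int], max_distance: int = 4) -> List[List[str]]:
    """Greedily group images whose hash is close to a group's first member."""
    groups: List[List[str]] = []
    for name in sorted(hashes):
        for group in groups:
            # Compare against the first image in the group only.
            if hamming(hashes[name], hashes[group[0]]) <= max_distance:
                group.append(name)
                break
        else:
            # No existing group was close enough, so start a new one.
            groups.append([name])
    return groups


# Hypothetical 16-bit hashes: two near-identical moth shots and one beetle.
example = {
    "moth_001.jpg": 0b1111000011110000,
    "moth_002.jpg": 0b1111000011110001,
    "beetle_001.jpg": 0b0000111100001111,
}
groups = group_by_hash(example)
print(groups)
```

With groups like these, one keystroke could apply the same label to every photo of an insect that barely moved between shots.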
Another important thing is that this labeling system should work iteratively. That is, someone not that taxonomically talented, like me, should be able to go through a bunch of images and perhaps just group them by class or order. Then a human or robot could go through and try to narrow those classifications down further to the family or genus level, and a final “expert” could ID even further or just confirm the IDs.
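Here is a rough sketch of how that iterative narrowing might be recorded, under my own assumptions: the `ImageRecord` class, its `refine` method, and the annotator names are hypothetical, and for brevity the label path starts at class rather than kingdom. Each pass may only extend (or confirm) the existing label, and the record remembers who said what.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ImageRecord:
    """One cropped insect image plus its label history."""

    path: str
    names: Tuple[str, ...] = ()  # current label path: class, order, family, ...
    history: List[Tuple[str, Tuple[str, ...]]] = field(default_factory=list)

    def refine(self, annotator: str, names: Tuple[str, ...]) -> None:
        """Accept a label only if it extends or confirms the current one."""
        if names[: len(self.names)] != self.names:
            raise ValueError(f"{names} conflicts with existing label {self.names}")
        self.history.append((annotator, names))
        self.names = names


record = ImageRecord("crop_0001.jpg")
record.refine("novice", ("Insecta",))
record.refine("model", ("Insecta", "Lepidoptera"))
record.refine("expert", ("Insecta", "Lepidoptera", "Erebidae"))
print(record.names)
print(record.history)
```

This sketch deliberately rejects conflicting labels. A real tool would probably also need a separate path for an expert to overrule an earlier mistake, which is not shown here.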
It feels like there should be something out there that lets you do this, but I have been asking around and haven’t found anything. Most labeling software is built for things like YOLO, where you are trying to put a basic flat ID on a thing at a specific location within a bigger image. Instead, we want a hierarchical label on a whole image (no need to draw rectangles!).
But it’s looking like we might have to roll our own software!?!?