By Andre Poremski
LepSnap is getting closer to launching the iOS version (Android in August). Fieldguide is using the “Inception v3” network, which is currently the best performer on the ImageNet challenge and the one generally recommended in the TensorFlow tutorials and by Grant Van Horn at Visipedia. This network was built for and trained on the ImageNet data; the difference here is that Fieldguide is retraining it on their own dataset. Retraining initializes the network weights to those found after training on ImageNet and then continues to adjust them to give better results on our own data. This is much more efficient than training completely from scratch, which would require more than 1,000 images per category and months of GPU time.
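The update doesn’t show Fieldguide’s actual training code, but the general transfer-learning idea looks roughly like the sketch below in TensorFlow/Keras: load Inception v3 with ImageNet-trained weights, drop the original 1000-way head, and attach a new head sized for the lep categories. The class count, image size constant, and optimizer settings here are placeholders, not Fieldguide’s real configuration.

```python
import tensorflow as tf

NUM_CLASSES = 10_000      # placeholder; the real number of retrained classes differs
IMAGE_SIZE = (299, 299)   # Inception v3's expected input resolution

# Start from Inception v3 with ImageNet-trained weights, dropping the
# original 1000-class ImageNet classification head.
base = tf.keras.applications.InceptionV3(
    include_top=False,
    weights="imagenet",
    pooling="avg",
    input_shape=IMAGE_SIZE + (3,),
)

# Attach a new classification head for the Lepidoptera categories, then
# continue adjusting all weights on the new dataset (fine-tuning) rather
# than training from randomly initialized weights.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-4),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```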
Normally this network has 1,000 output classes, but by retraining it on Fieldguide’s own dataset of ~550,000 lep images (~95% of which are field images), they can increase the number of classes; >100,000 classes are possible. However, the number of images (training samples) per class matters: each class must have at least 20 images, and >100 is recommended. To meet that bar, the classifier is trained at the Species level only for those categories that contain sufficient samples; for those that fall short, LepSnap will stop at the Genus level.
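In other words, each image trains under its species label only if that species clears the minimum sample count, and otherwise falls back to its genus. A minimal sketch of that fallback logic, with a hypothetical record format (the real dataset layout isn’t described in this update):

```python
from collections import Counter

MIN_IMAGES_PER_CLASS = 20  # hard floor; >100 images per class is preferred

def assign_training_label(record, species_counts):
    """Return the label a record trains under: its species if that species
    has enough images, otherwise fall back to its genus."""
    if species_counts[record["species"]] >= MIN_IMAGES_PER_CLASS:
        return record["species"]
    return record["genus"]

# Hypothetical records; the real data comes from Fieldguide's own image set.
records = [
    {"species": "Danaus plexippus", "genus": "Danaus"},
    # ...
]
species_counts = Counter(r["species"] for r in records)
labels = [assign_training_label(r, species_counts) for r in records]
```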
The training process requires 10% of the images to be held out for validation during training and another 10% for testing after training, so only 80% are actually used to train the network. Training and validation run side by side, and training is stopped once the validation accuracy plateaus. Currently it takes ~23 hours to reach a validation accuracy of 86%. LepSnap will retrain the network once per week, taking into account new and reclassified images.
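As a rough illustration of the 80/10/10 split and the stop-on-plateau rule (the exact mechanics Fieldguide uses aren’t described here, and the names below are placeholders):

```python
import random
import tensorflow as tf

def split_dataset(image_paths, seed=42):
    """Shuffle and split into ~80% train / ~10% validation / ~10% test,
    mirroring the proportions described above."""
    rng = random.Random(seed)
    paths = list(image_paths)
    rng.shuffle(paths)
    n_val = len(paths) // 10
    n_test = len(paths) // 10
    return (
        paths[n_val + n_test:],        # train (~80%)
        paths[:n_val],                 # validation (~10%)
        paths[n_val:n_val + n_test],   # test (~10%)
    )

# Stop training once validation accuracy stops improving (plateaus).
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_accuracy", patience=5, restore_best_weights=True)
```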
There are three remaining tasks the Fieldguide team is working on concurrently to ready LepSnap for launch:
1) Fieldguide is hooking this new neural network up to the backend API so the CV classifier can start running directly on LepSnap. One issue is that, because the classifier now outputs a category id rather than an image, LepSnap can’t show the most similar image inside the category as the category’s “cover image” (the image used for comparison between the query image and each CV suggestion). We are solving this problem with task #2.
2) Fieldguide is training a secondary neural network that focuses exclusively on extracting image similarity within classes. This network will be used to select, in each class/ID recommendation, the category cover image that is the best visual match to the query image (the image seeking identification). A sketch of this cover-image selection follows the list below.
3) The team is in the process of importing a large quantity of new training data (~300K more moth/butterfly images, including all LepNet images). After all available images have been used to train the neural net, Fieldguide will generate a report listing every species with insufficient data for a species-level classifier (i.e., where classification stops at the Genus level). That will provide us all with a “hit list” of species to prioritize in our digitization efforts.
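The details of the similarity network in task #2 aren’t given in this update, but the downstream step it enables, picking a cover image, amounts to comparing the query image’s feature embedding against the embeddings of the images in the predicted class and taking the closest one. A minimal sketch, assuming embeddings have already been computed and using cosine similarity (all names here are placeholders):

```python
import numpy as np

def pick_cover_image(query_embedding, class_embeddings, class_image_ids):
    """Return the id of the image in the predicted class whose embedding
    is closest (by cosine similarity) to the query image's embedding.

    query_embedding:  1-D feature vector for the query image.
    class_embeddings: 2-D array, one row per image in the class.
    class_image_ids:  ids aligned with the rows of class_embeddings.
    """
    q = query_embedding / np.linalg.norm(query_embedding)
    e = class_embeddings / np.linalg.norm(class_embeddings, axis=1, keepdims=True)
    similarities = e @ q                      # cosine similarity per class image
    return class_image_ids[int(np.argmax(similarities))]
```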