Manta Matcher not segregating by coloration

AFlam · December 15, 2020, 9:41am

In which Wildbook did the issue occur? Manta Matcher

What operating system were you using? (eg. MacOS 10.15.3) MacOS 10.15.6

What web browser were you using? (eg. Chrome 79) Chrome

What is your role on the site? (admin, researcher, etc) admin

What happened? Matches using the PIE algorithm are not segregating by coloration. I searched for a normally colored manta, and a melanistic manta showed up in the results (both mantas were correctly labeled in their encounters). I then tested this with the melanistic manta, and normally colored mantas appeared.

What did you expect to happen? Manta matches should be segregating by coloration (normal and leucistic should appear together as leucism can be difficult to determine, but melanism is obvious).

What are some steps we could take to reproduce the issue? Run a search on any melanistic manta to see normally colored mantas in the results (there should only be melanistic mantas). This is the result for a search I ran on TH0283B:

https://www.mantamatcher.org/iaResults.jsp?taskId=c504c23d-b75e-48b6-bec6-3eeed5081d92


jason · December 15, 2020, 2:59pm

Hi Anna,

Interesting point.

PIE is a deep learning algorithm, and as such, it learns on its own how to extract individual patterning from the training data it is given. It may be focusing on a smaller part of the image than expected and on a different set of features than anticipated. In seeking to maximize differences among individuals, it may have learned to ignore broader body coloration (e.g., because there were too many or too few melanistic mantas in the training data) and to focus on smaller patches of the body as a means of telling individuals apart.

In the end, PIE training produces an uninspectable machine learning model based on how it learned to maximize accuracy in a training set versus test and validation sets of matched IDs. We can’t really look at how much or how little melanism plays a role.

In a future retraining (we should work with you to define how often we retrain PIE and on what subset of extremely clean data), it may improve results to have separate PIE variants for obvious coloration differences, if we can rely on the user (or another ML step) to make that decision for us. That may or may not improve accuracy overall.
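To make the idea of relying on the user-entered coloration a bit more concrete, here is a minimal sketch of partitioning match candidates by coloration label before ranking them. This is not how PIE or Wildbook currently work; `Candidate`, `COLORATION_GROUP`, and `filter_by_coloration` are hypothetical names, and the sketch assumes a higher score means a better match. Normal and leucistic share a group, since leucism can be hard to determine, while melanistic is kept separate.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical grouping: normal and leucistic are pooled, melanistic is separate.
COLORATION_GROUP = {
    "normal": "light",
    "leucistic": "light",
    "melanistic": "dark",
}


@dataclass
class Candidate:
    encounter_id: str
    coloration: Optional[str]  # user-entered label; None if not set
    score: float               # assumption: higher means a better match


def filter_by_coloration(
    query_coloration: Optional[str], candidates: List[Candidate]
) -> List[Candidate]:
    """Keep candidates in the query's coloration group, ranked best first."""
    query_group = COLORATION_GROUP.get(query_coloration)
    if query_group is None:
        # Unlabeled or unrecognized query: do not filter, only rank.
        kept = list(candidates)
    else:
        kept = []
        for cand in candidates:
            group = COLORATION_GROUP.get(cand.coloration)
            # Unlabeled candidates are kept so a missing label never hides a match.
            if group is None or group == query_group:
                kept.append(cand)
    return sorted(kept, key=lambda c: c.score, reverse=True)
```

Keeping unlabeled candidates is a deliberate choice in this sketch: a filter like this is only as reliable as the labels entered on each encounter, so it errs toward showing too many results rather than hiding a possible match.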

It is not clear in the example you provided that PIE should have found a match. Are you seeing PIE match individuals where there is a match?

Thanks,
Jason

AFlam · December 15, 2020, 3:17pm

Ok, good to know. I was expecting it to segregate for that parameter based on the user-defined data.

For those two mantas I mentioned, they each only have one sighting so there wouldn’t have been a match (but I was impressed by the pattern similarity in the results – it looks better than what would show up with the old algorithm).

I have been finding matches with both PIE and HotSpotter, often with some fairly iffy photos, so they’re performing well based upon my initial (limited, due to lack of diving/IDs) tests.