Am I importing data the right way?

Stjepan · October 22, 2021, 9:12pm

What Wildbook are you working in?
Whiskerbook

Can you describe what the issue is you’re experiencing?

I am not sure whether I am using the program the right way. I have two questions.

Can you provide steps on how to reproduce what you’re experiencing?

1. I imported about 150 photos, of varying quality and angle, of known lynxes (the data I already had, around 30 lynxes). This is batch A1. Then I imported around 300 unidentified photos of lynxes (with time and location ID). This is batch A2. Is it better from a machine learning standpoint to import fewer pictures?
I tested both of these batches (A1, A2) with 28 pictures in which we know which lynxes appear, but which we entered into the program as unidentified.
Can the unidentified photos (A2) reduce the chance of a correct identification, because there are more unidentified photos than identified photos?

That said, I am fully aware that some of the wrong identifications result from not having pictures of the lynxes from all angles.

Can we even compare the success of identification across different animals? Maybe in my case with other spotted animals, but I’m not sure.

Thanks,
Stjepan

jason · November 3, 2021, 7:55pm

Hi @Stjepan

Is it better from a machine learning standpoint to import fewer pictures?

No, more is better.

Can the unidentified photos (A2) reduce the chance of a correct identification, because there are more unidentified photos than identified photos?

Better matches of the same individual in the unidentified batch (A2) can rank above poorer photos from the identified batch (A1), but generally the correct match from A1 should appear in the results set as well, assuming you have an identified photo of the same side in A1 as A2 and both are of good quality.

Can we even compare the success of identification across different animals? Maybe in my case with other spotted animals, but I’m not sure.

With enough data for identified individuals sighted more than once, we can run a performance analysis of HotSpotter on your species (Eurasian lynx). We call this a top-k analysis, where k is the rank of the correct match in the list of potential matches. It looks something like this (but this is for sperm whales):

[Image: example top-k analysis for sperm whales]
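
As a rough illustration of what a top-k number means, here is a minimal Python sketch. It is not the analysis the machine learning staff runs; the function name `top_k_accuracy` and the example ranks are made up for illustration. It assumes that for each test photo you already know the rank at which the correct individual appeared in the match list (or `None` if it did not appear at all).

```python
from typing import Dict, Iterable, Optional, Sequence


def top_k_accuracy(
    ranks: Sequence[Optional[int]],
    ks: Iterable[int] = (1, 5, 10),
) -> Dict[int, float]:
    """Return, for each k, the fraction of queries whose correct match
    was ranked k or better (rank 1 is the best possible rank)."""
    if not ranks:
        raise ValueError("need at least one query rank")
    total = len(ranks)
    return {
        k: sum(1 for r in ranks if r is not None and r <= k) / total
        for k in ks
    }


# Hypothetical ranks of the correct match for 8 test photos.
# None means the correct individual never appeared in the results.
example_ranks = [1, 1, 3, 2, None, 1, 7, 12]

accuracy = top_k_accuracy(example_ranks)
for k, value in accuracy.items():
    print(f"top-{k}: {value:.3f}")
```

With these made-up ranks, the correct match is the first result for 3 of 8 photos, within the first 5 results for 5 of 8, and within the first 10 for 6 of 8.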

Does this help answer your question?

Thanks,
Jason

Stjepan · November 8, 2021, 11:46pm

Hi @jason
Thank you for your reply. You helped a lot.
Your answer also prompted more questions:

Is there a recommended minimum amount of data required for a reliable top-k analysis?

If I understood correctly, this top-k analysis cannot be run with a command through “Wildbook”?

Thanks,
Stjepan

jason · November 9, 2021, 12:23am

Correct, a top-k analysis is done by our machine learning staff.

Generally, we like to have over 100 individuals with 2 or more photos of at least one side before doing a top-k analysis.
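
If you want to check your own catalog against that rule of thumb, it could be sketched in Python as below. This is only an illustration, not a Wildbook feature: the function name, the `(individual_id, side)` record layout, and the sample data are all hypothetical, and it assumes the guideline means "more than 100 individuals, each with at least 2 photos of the same side".

```python
from collections import Counter
from typing import Iterable, Tuple


def count_eligible_individuals(
    photos: Iterable[Tuple[str, str]],
    min_photos_per_side: int = 2,
) -> int:
    """Count individuals that have at least `min_photos_per_side`
    photos of at least one side (for example 'left' or 'right')."""
    # Number of photos for each (individual_id, side) pair.
    per_side = Counter(photos)
    eligible = {
        individual
        for (individual, _side), n in per_side.items()
        if n >= min_photos_per_side
    }
    return len(eligible)


# Hypothetical records: (individual_id, side shown in the photo).
sample_photos = [
    ("lynx_01", "left"), ("lynx_01", "left"), ("lynx_01", "right"),
    ("lynx_02", "left"), ("lynx_02", "right"),
    ("lynx_03", "right"), ("lynx_03", "right"), ("lynx_03", "right"),
]

n_eligible = count_eligible_individuals(sample_photos)
ready_for_top_k = n_eligible > 100  # "over 100 individuals"
print(n_eligible, ready_for_top_k)
```

In this sample, lynx_01 and lynx_03 qualify, while lynx_02 does not because it has only one photo of each side.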

Stjepan · November 11, 2021, 12:50am

Hi @jason
Thank you for your fast reply.