Code Table Request - New attribute: individual count #4032
Comments
@acdoll @sharpphyl you may want to weigh in. |
No real objections, but the documentation would need to be clear on what this is (some random thing that's never going to get updated?!) and what it can do (within Arctos: nothing that I can see). |
We are definitely in favor of this.
Currently, the number of individual organisms in a lot is captured in 'lot count' - this is not passed on to the aggregators (nor should it be; per documentation lot count can describe the number of vertebrae in a box). E.g., https://arctos.database.museum/guid/DMNS:Inv:10020 has two shells in the lot (i.e. 2 individuals). But the GBIF record only reports 1 individual: |
I agree it's a useful concept, I just don't think this is a suitable place for it.
and/or
I don't think either one of those scenarios is approachable by itself, much less in the combinations that would come to exist in an active collection. I think this would be much better as a catalog record attribute, even if that's not fully capable of dealing with the data in some fringe cases. (And it's pretty easy to avoid those situations if this kind of information is important.) |
This might get used here: #4033. Given that events end up as occurrences, I think this makes sense here. It is either that or as part of "specimen event". It is NOT in any way related to what parts are currently in, or have been in, the collection.
No because some records include actual distinct events that may or may not be about the same number of individuals.
If you have 20 lots with a different number of individuals from the same event then they all participated in the event and adding one count of individuals that is the sum of all the lot counts to that event should suffice? |
The thing is - no one HAS to use this and if it isn't there, we can just pass "1" as a default to dwc:individualCount. That seems potentially less worse than what we are doing now? |
I'm not sure that anyone who's dumping stuff into a lot over time is going to much care about this....
"We don't have that information" is kinda always a defensible position. "... and so we've made something up!", not so much. |
But we are making stuff up now! |
Andy has already described the problem for our collection. We do not have multiple collecting events in one record, so I can't speak to that. The difference between one individual and two individuals that Andy pointed out could be meaningful to a researcher. A stronger case can be made for micromollusks which can occur in large numbers which could be important to assess the health of the population, etc. DMNS:Inv:29549 of Caecum bipartitum has 276 shells (in a tiny gel cap). GBIF shows one individual. As long as the data flows to GBIF as "Individual Count" it doesn't matter to me where I put the number of specimens in the Arctos catalog record. We support Teresa's recommendation to include DWC Individual Count so that the aggregator records reflect the number of individuals found at that collecting event. |
@Jegelewicz Just checking when the Code Table Management team will meet to discuss this. I don't want this issue to drift into oblivion. |
That is not reflective of how the data are structured. |
@dustymc is there a solution you can suggest? We do need this resolved. |
|
Conceptually, count doesn't belong in the collecting event. I agree with Dusty that it is an attribute of the cataloged record. If it's not getting passed on in the DwCA, then that's a mapping issue, not a CT or new thing for collecting event (which is location + date/time) |
We don't actually record this in a meaningful way anywhere, "part lot count" is not a usable value since we may have 3 parts from a single individual in a given catalog record.
Probably not - since multiple taxa can share a collection event, but it also does not belong as part of the catalog record either. The individual count expected at the aggregators is "The number of individuals present at the time of the Occurrence." What we are passing as "occurrences" are actually "specimen" events (please see #4036 because our terminology is all over the place and is also problematic). As discussed recently, using "specimen" events as an occurrence is problematic because we end up reporting two occurrences when there is only one. Here is an example: https://arctos.database.museum/guid/DMNS:Mamm:12344 BUT they are passed to the aggregators as separate events/individuals https://www.gbif.org/occurrence/1145096812 Careful consideration of associated occurrences and organism ID will suss this out, but it is a shame that we pass different organism IDs for each of these records. Even if we cleaned up our act and got them into the same collecting event, we would still be sending conflicting information. Anyhoo. It is probably true that we have no good way to say how many individuals of a particular taxon took part in any given OCCURRENCE (collecting or observation event). Ideas are welcome because sending 1 when there are 276 is a bit misleading. |
Correct - I magic it (poorly, probably) for some special circumstances, and there's some legacy not-quite-data from previous attempts of that hanging around. If we want to pass something meaningful on then we need to record it. (And I can magic - probably still poorly! - the initial values if needed.)
No, we are splitting catalog records at collecting events in an attempt to magick Occurrences out of the aether. What we are passing as Occurrences does not exist in Arctos; that's just not what gets cataloged. |
Mostly - but I think some records with observation type events are pretty close.
I think we are splitting them at "specimen" events - thus the seid? Honestly the quoted statement is true for all physical collections in the data aggregators, but after looking at this, I do think there are some things we could be doing better. So I guess I can go along with making this a collection object attribute even though it isn't really going to solve the whole problem. See updated request. |
Most are.
Same thing from the perspective of a single catalog record.
Always.
Nope, there are some ragged edges, but I think it does what the collections who seem to care about this want done without adding too much complexity or being too hard to understand in a decade or so.
That doesn't seem quite right, or complete, or something, but I'm struggling to come up with anything better. @sharpphyl help?? |
I fully support adding this as a collection object attribute. Is there some way we can represent count = unknown in a way that GBIF would ingest correctly? |
@dustymc how does "INDIVIDUALCOUNT" get calculated?
|
This should be caching properly, and filtering out to things that use the cache, now. Weird data - eg, multiple determinations - will break the cache, so there's a status on /guid/ pages in next release. The cache is just pulling the attribute values, so using anything other than 'individuals' for units, or any non-integer value with any units, will also do something interestingly fatal. There is no default; not providing this attribute will result in individualcount=NULL being sent out with DWC. I can help bulkload initial data if necessary, just let me know how to calculate this. @sharpphyl your collection had nothing, if you'll let me know how to get the initial values I can magic them in. For collection_cde=Ento collections, the old code was using @mlbowser For collection_cde=Fish collections, the old code was using
@ebraker |
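The constraints described in this comment (units must be 'individuals', the value must be an integer, and a missing attribute goes out as NULL) can be illustrated with a small check. This is only a sketch in Python, not the Arctos cache code: the function name `dwc_individual_count` and its arguments are hypothetical, and raising an error on bad input is an assumption standing in for "something interestingly fatal".

```python
from typing import Optional


def dwc_individual_count(value: Optional[str], units: Optional[str]) -> Optional[int]:
    """Return the integer to send as dwc:individualCount, or None for NULL.

    Hypothetical helper that mirrors the rules stated in the thread; it is
    not the actual Arctos implementation.
    """
    # No attribute recorded: there is no default, so NULL is sent.
    if value is None:
        return None
    # Only the unit 'individuals' is meaningful for this attribute.
    if units != "individuals":
        raise ValueError(f"unsupported units: {units!r}")
    # The value must be a whole number; '2.5' or 'many' is rejected.
    text = value.strip()
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"non-integer value: {value!r}")
    return int(text)
```

A record with value "276" and units "individuals" yields 276, and a record without the attribute yields None rather than a made-up 1.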
The number of individuals in a catalog record for DMNS:Inv is in the field Qty under Parts. If a record has both a shell and an operculum, each part will appear separately but the number of individuals doesn't increase. Thanks for magicing them in. Is this where we add the individual count attribute during data entry? I think we still add the count in the Qty field too so it shows as a "shell" or other part. Does that mess up anything? |
@sharpphyl I think that means you want sum(lot_count): 1 in your first screenshot, 2 in the second? You should definitely continue to provide lot count - it's a completely different thing (and much more important, in my view). Yep that's one place to edit Attributes. @Jegelewicz the frontmatter on the parts doc page seems to be mangled and it's claiming you edited - fix? |
Thanks all for updating this. We don't have a good inventory yet of our specimens, it's all legacy numbers at the moment. I will keep this in mind for when we do an inventory! |
@sharpphyl some data for your review:
Let me know if it is as expected. |
No, both of these records have only one organism. The first has only the shell and the second has both the shell and its operculum. There may be a few records where these aren't the same Qty, but I can adjust them manually. So if they are the same, that's the number of individuals in the record. I looked at your csv and found specimens (e.g. https://arctos.database.museum/guid/DMNS:Inv:25570) that show 2 organisms where there is only one. Perhaps we should only count the number of shells if there is both a shell and an operculum and they have the same Qty. If they are different, we would use the larger quantity as the number of organisms. I checked a few records where the part name is exoskeleton, test or whole organism and I didn't find any issues. |
@sharpphyl I can't quite figure out how to interpret that. Maybe just max (rather than sum) lotcount works for a first pass? That seems to work for the few examples so far. Or below are your unique part name combos - maybe we can set this up as when partaggregate='shell|whole organism' then do_some_thing ?? Note that these are determinations, you can adjust them as necessary, and both bulk loaders and unloaders are available.
|
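To make the sum-versus-max question concrete, here is a small Python sketch comparing the two first-pass calculations on the kind of record discussed above (one organism cataloged as a shell plus its operculum). The function names and the `(part name, Qty)` tuples are hypothetical illustrations, not Arctos queries, and both functions assume the record has at least one part.

```python
from typing import List, Tuple

Part = Tuple[str, int]  # (part name, Qty); hypothetical representation


def count_by_sum(parts: List[Part]) -> int:
    """First-pass guess: add up every part quantity."""
    return sum(qty for _, qty in parts)


def count_by_max(parts: List[Part]) -> int:
    """Alternative guess: take the largest single part quantity."""
    return max(qty for _, qty in parts)


# One organism recorded as two parts: summing counts it twice.
shell_and_operculum = [("shell", 1), ("operculum", 1)]
double_counted = count_by_sum(shell_and_operculum)
single_counted = count_by_max(shell_and_operculum)
```

For this record, summing gives 2 while max gives 1, which matches the complaint about the csv. Max would undercount a record whose parts are different individuals (for example shell|whole organism), so it is only a first pass.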
Give me a pointer so I can figure out where to go look? |
There's no sidenav thingee |
No sidenav thingee where? |
Let's see if this helps.
Rule 1 - if there is only one part, use the value in Part Qty
shell - use Qty
operculum|shell - use the shell Qty only
egg case|operculum|shell - use the shell Qty only - I only found one record for this - https://arctos.database.museum/guid/DMNS:Inv:22493 Are there others?
shell|whole organism - sum the Qtys |
Apologies for not being able to follow this more closely and coming in with
a stupid question, but I thought that individual count could be a field to
use to record all the individuals in a lot, regardless of number of parts?
And ideally could be used for other taxa, eg fish, tadpoles, parasites?
…On Sat, Feb 19, 2022, 8:51 AM Phyllis Sharp ***@***.***> wrote:
Let's see if this helps.
Rule 1 - if there is only one part, use the value in Part Qty
Rule 2 - if there is a shell and an operculum, use the value in Part Qty
for "shell" as the default. Do not add the operculum Qty.
Rule 3 - sum certain part Qty values as listed below
shell - use Qty
test - use Qty
exoskeleton - use Qty
egg case - use Qty
egg - I changed this to egg case
whole organism - use Qty
operculum - use Qty
operculum|shell - use the shell Qty only
operculum|shell|shell - sum the shell Qtys only
operculum|shell|whole organism - sum the shell and whole organism Qtys only
egg case|operculum|shell - sum the egg case and shell Qtys only
shell|whole organism - sum the Qtys
shell|shell - sum the Qtys
test|whole organism - sum the Qtys
shell|test - sum the Qtys
exoskeleton|shell - sum the Qtys
egg case|shell - sum the Qtys
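The rule list above can be read as one general rule: sum the quantities of every part except the operculum, and use the operculum quantity only when it is the sole part. The Python sketch below encodes that reading. It is an interpretation, not the code that was run against the DMNS:Inv data: the function name, the `(part name, Qty)` tuples, and the choice to reject unknown part names are all assumptions. It follows the emailed list; note that the edited version of the comment treats egg case|operculum|shell as "shell Qty only" instead.

```python
from typing import Iterable, Tuple

# Part names that appear in the rule list and count toward individuals.
COUNTED_PARTS = {"shell", "test", "exoskeleton", "egg case", "whole organism"}
OPERCULUM = "operculum"


def individual_count(parts: Iterable[Tuple[str, int]]) -> int:
    """Estimate individuals in one catalog record from (part name, Qty) pairs."""
    parts = list(parts)
    if not parts:
        raise ValueError("record has no parts")
    # Assumption: any part name outside the list needs a human decision.
    unknown = {name for name, _ in parts} - COUNTED_PARTS - {OPERCULUM}
    if unknown:
        raise ValueError(f"no rule for part(s): {sorted(unknown)}")
    counted = [qty for name, qty in parts if name in COUNTED_PARTS]
    if counted:
        # Opercula are ignored whenever any other part is present.
        return sum(counted)
    # Only opercula are present: use their Qty (summing several is an assumption).
    return sum(qty for _, qty in parts)
```

Records that do not fit, such as a shell and operculum with deliberately different quantities, would still need manual correction after a bulk load.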
|
@sharpphyl how's this? |
I corrected the organism count on five odd records that didn't fit my "rules." Highlighted in yellow. I also added 10 records at the very bottom uploaded since you ran this report. If it looks ok to you, let's do it. temp_dmnsinvic-2 - PMS edits.csv Thanks for making magic. |
That's exactly what it is. |
Hi Teresa,
Attached is the mammal catalog. I am sending it to you because of the
misnaming of ACUNHC 1.
All the best
Tom
…On Mon, Feb 21, 2022 at 10:46 AM Teresa Mayfield-Meyer < ***@***.***> wrote:
I thought that individual count could be a field to use to record all the
individuals in a lot, regardless of number of parts?
And ideally could be used for other taxa, eg fish, tadpoles, parasites?
That's exactly what it is.
|
Goal
Accurately describe the number of individuals that participated in an occurrence per dwc:individualCount in order to pass appropriate information to aggregators.
Context
#3908 (comment)
Table
https://arctos.database.museum/info/ctDocumentation.cfm?table=ctattribute_type
Value
individual count
Definition
The number of individuals represented by this catalog record.
Attribute data type
number+units
Attribute value
integers
Attribute units
individuals
Priority
[ Please choose a priority-label to the right. ]