-1 coordinate precision should not be interpreted as 1 #9

Closed
cgendreau opened this issue Feb 7, 2017 · 7 comments

Comments

@cgendreau
Contributor

-1 coordinate precision should not be interpreted as 1

Should we not simply discard, rather than alter that?
Looking at the record diagnostic view, it seems "-1" is used for NULL.


fbitem-occurrence1135583752
Reported by: @timrobertson100
System: Chrome 55.0.2883 / Mac OS X 10.11.5
Referer: https://demo.gbif.org/occurrence/1135583752
Window size: width 1721 - height 1099
API log
Site log
datasetKey: 38b4c89f-584c-41bb-bd8f-cd1def33e92f
publishingOrgKey: b8323864-602a-4a7d-9127-bb903054e97d

@cgendreau
Contributor Author

The reason is that we take the absolute value and check if it's between 0 and 45.
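As a rough illustration of the behaviour described above, here is a minimal Python sketch. It is not the project's actual parser: the function name `interpret_precision_current`, the inclusive bounds, and the handling of blank or non-numeric input are all assumptions made for the example. Only the "absolute value, then check between 0 and 45" step comes from the comment.

```python
from typing import Optional

# Upper bound mentioned in the thread. Treating both bounds as inclusive is an assumption.
MAX_PRECISION = 45.0


def interpret_precision_current(raw: Optional[str]) -> Optional[float]:
    """Hypothetical re-creation of the described logic: abs(), then a range check."""
    if raw is None or not raw.strip():
        return None
    try:
        # The sign is discarded here, which is how "-1" ends up as 1.
        value = abs(float(raw))
    except ValueError:
        return None
    # NaN and infinity fail this comparison and are dropped.
    return value if 0.0 <= value <= MAX_PRECISION else None
```

Under these assumptions, `interpret_precision_current("-1")` returns `1.0`, which is the alteration this issue objects to.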

@timrobertson100
Member

That seems like a flawed algorithm to me - are there examples suggesting any negative value is trustworthy? Would it not be better to err on the side of caution i.e. omit suspicious values rather than assert incorrect ones?

@cgendreau
Contributor Author

Are negative values trustworthy? I don't know, maybe not. I would say it is probably handled that way in the sense of a ± "precision":
1 or -1 would both mean 1 decimal degree of precision.

@timrobertson100
Member

The output of hive> SELECT v_coordinateprecision, count(*) FROM prod_b.occurrence_hdfs GROUP BY v_coordinateprecision; is attached. There is a lot of noise in this field, and 41,523,350 records have -1.

I am not entirely sure what to do now, other than suggest the DwC documentation needs revision (CC @tucotuco, who might be interested in this issue).
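For readers without access to the Hive table, the same kind of tally can be sketched in Python over any list of verbatim values. The function name `tally_verbatim` and the sample values are hypothetical and purely illustrative; they are not taken from the attached output.

```python
from collections import Counter
from typing import Iterable, List, Optional, Tuple


def tally_verbatim(values: Iterable[Optional[str]]) -> List[Tuple[Optional[str], int]]:
    """Count each distinct verbatim value, most frequent first (like GROUP BY + count)."""
    return Counter(values).most_common()


# Hypothetical sample values, for illustration only.
sample = ["-1", "0.0001", "-1", None, "1", "-1", "0.0001"]
for value, count in tally_verbatim(sample):
    print(repr(value), count)
```

Keeping the values as strings mirrors the verbatim field, so noise such as text or odd formatting stays visible in the counts.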

v_coordinateprecision.txt

@tucotuco

coordinatePrecision should be null or greater than zero, with only very few viable values. Anything else suggests a misunderstanding of the concept.
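The "null or greater than zero" rule could be sketched as follows. This is a hedged illustration, not the fix that was committed: the function name `interpret_precision_strict` is hypothetical, and the sketch does not try to encode which specific values count as viable, since the comment does not list them.

```python
import math
from typing import Optional


def interpret_precision_strict(raw: Optional[str]) -> Optional[float]:
    """Keep a value only if it is a finite number greater than zero; otherwise discard it."""
    if raw is None or not raw.strip():
        return None
    try:
        value = float(raw)
    except ValueError:
        return None
    # Negative values, zero, NaN and infinity are discarded rather than altered.
    if not math.isfinite(value) or value <= 0:
        return None
    return value
```

With this approach `interpret_precision_strict("-1")` returns `None`, so a placeholder such as -1 is dropped instead of being reported as 1.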

Does this help?

tdwg/dwc-qa#23

@cgendreau
Contributor Author

Yes, thanks @tucotuco.
I think it would be worth updating the definition and/or at least adding a reference to the very rich explanation included in tdwg/dwc-qa#23

@tucotuco

Interesting suggestion. The practical issue with changing Darwin Core term definitions is that any change must pass through the onerous ratification process. Because of this, a) definitions should be engineered to persist, and b) content such as examples and comments should not be part of the definition that needs to be ratified.

There is an open proposal to remove recommendations from the Darwin Core term definitions and put them into the comments. I don't see it documented, but the idea was also to have the comment separate from the normative definition, making it much easier to change with no ratification process. All of this awaits the ratification of the TDWG Vocabulary Management standard.

There is also an open issue to replace the GBIF Media Wiki method of supplementary material management. I believe the Darwin Core Question and Answer wiki to be a viable solution for that. I have opened an issue to recover this documentation to the Darwin Core Question and Answer wiki.

cgendreau pushed a commit that referenced this issue Mar 16, 2017