This repository has been archived by the owner on Mar 15, 2024. It is now read-only.

Features that are rarely created in the real world #131

Open
bkowshik opened this issue Mar 30, 2017 · 10 comments

Comments

@bkowshik
Contributor

Ref: #112


Some features are rarely created in the real world. For example, it is not every day that a new airport is constructed. So, if these features are rare in the real world, shouldn't they also be rare on OpenStreetMap? Should we flag these rare features for manual review?

screen shot 2017-03-30 at 12 43 04 pm

A harmful aeroway=aerodrome created by a new user in Georgetown

What other features are rarely created in the real world? 💭


cc: @planemad @manoharuss @geohacker @amishas157

@bkowshik
Contributor Author

A railway track from nowhere:

screen shot 2017-04-11 at 12 11 06 pm


cc: @srividyacb

@amishas157
Contributor

amishas157 commented Apr 26, 2017

The idea is wonderful @bkowshik 👌. Yes, features that are rare in the real world should be rare on OSM too, not created every now and then. And when such features are created, we should definitely 👀 them.
Let's take this ahead. I have a few thoughts on how we can go about it.

What other features are rarely created in the real world? 💭

Following is the list of rare and critical features present in OSM that I could think of.

  • Admin boundaries
  • Ocean
  • Continents
  • Countries
  • Mountains
  • Airports

Feel free to add anything that would make sense for the compare function.

Also, I have a couple of questions that I would like to explore alongside building the compare function. They are purely to understand the history of these features and are not a blocker or prerequisite for the compare function.

  • Frequency of creation of such features (version 1 only) over time
  • Geographical distribution of these features
  • Number of these features created by new mappers (a little difficult to sanity check)
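
The three questions above could be explored with a small aggregation like the one below. This is only a sketch: the record fields (`version`, `timestamp`, `lon`, `lat`, `user_changesets`), the 10-changeset cutoff for a "new mapper", and the 10-degree grid are all assumptions made for illustration, not something the thread specifies.

```python
import math
from collections import Counter


def creation_stats(records, new_mapper_max_changesets=10, cell_deg=10):
    """Summarise newly created (version 1) features.

    `records` is a hypothetical list of dicts with the keys
    version, timestamp (ISO 8601 string), lon, lat and user_changesets.
    """
    per_month = Counter()   # "YYYY-MM" -> number of creations
    per_cell = Counter()    # (lon, lat) of grid cell corner -> number of creations
    by_new_mappers = 0      # creations by users at or below the changeset cutoff

    for r in records:
        if r["version"] != 1:
            continue  # only count feature creation, not later edits
        per_month[r["timestamp"][:7]] += 1
        cell = (
            math.floor(r["lon"] / cell_deg) * cell_deg,
            math.floor(r["lat"] / cell_deg) * cell_deg,
        )
        per_cell[cell] += 1
        if r["user_changesets"] <= new_mapper_max_changesets:
            by_new_mappers += 1

    return per_month, per_cell, by_new_mappers
```

Counting per month and per coarse grid cell keeps the output small enough to eyeball before deciding whether a finer breakdown is worth it.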

Next actions:

  • Find all OSM tags and values corresponding to the features in the above list, and write a compare function that detects whenever a new feature from this list is created.
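
A rough Python sketch of what such a compare function might look like. It is not the project's actual implementation: the tag list is a guess at the OSM tags behind the features listed above, and the feature layout (tags stored in `properties` next to an `osm:version` key) is an assumption made for the example.

```python
# Hypothetical tag list; the thread leaves the exact tags to be worked out.
RARE_TAGS = {
    ("aeroway", "aerodrome"),
    ("boundary", "administrative"),
    ("natural", "mountain_range"),
    ("place", "ocean"),
    ("place", "continent"),
    ("place", "country"),
}


def rare_feature_created(new_version):
    """Return the matching (key, value) tags if a rare feature was newly created.

    `new_version` is an assumed GeoJSON-like dict, or None for a deletion.
    An empty list means "nothing to flag".
    """
    if not new_version:
        return []  # deleted feature: out of scope
    props = new_version.get("properties", {})
    if props.get("osm:version") != 1:
        return []  # modification of an existing feature: out of scope
    return sorted(
        (key, value)
        for key, value in props.items()
        if isinstance(value, str) and (key, value) in RARE_TAGS
    )


airport = {"properties": {"osm:version": 1, "aeroway": "aerodrome", "name": "Test"}}
edited = {"properties": {"osm:version": 2, "aeroway": "aerodrome"}}
print(rare_feature_created(airport))  # [('aeroway', 'aerodrome')]
print(rare_feature_created(edited))   # []
```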

@amishas157
Contributor

WIP: #157

@willemarcel
Collaborator

It's common for Maps.me users to map their home as a castle, for example: https://osmcha.mapbox.com/48163958/
I think we could add castles to this list of uncommon features.

@poornibadrinath
Contributor

I took a look at a few changesets today, and the pattern I noticed was that the changesets currently flagged all involve mountain tags. A few things I had in mind while going through these changesets:

Point 1:
Are we going to flag every detail being added?
A few changesets have name tags being added. Would it be valuable to flag them?

Point 2:
Is flagging mountain ranges important? Many times, mappers are adding ranges that are not yet mapped. The ones I have noticed so far have been like that. Can we tweak it a bit to generate less noise?

Point 3:
If the mountain keys are removed, it is very rare to get any changesets regarding rare and critical features. Are there any other features that are really critical and rarely added that should be included in this list?

As Wille suggested, random castles being added to the map have a higher priority. Also, are viewpoints and monuments important? Right now they are tagged by the maps.me filter, but would it make sense to ramp it up a bit? Are they critical and rarely added?

Question:
Changes to admin boundaries and oceans occur regularly. When someone adds a new node that touches an admin boundary, or when someone modifies the coastline, it gets flagged. What are the specific tags that are a priority for admin boundaries and oceans?

Also, one feature to look out for is beaches. There are lots of cases where the beach tag is added randomly. Will that tag qualify as something rare?

I am going to continue looking out for the changesets of this comparator, so that we get a solid idea of how we want to improve this. Will add notes on the findings as and when they come up.

/cc: @manoharuss @amishas157 @bkowshik

@bsrinivasa

Changes to admin boundaries and oceans occur regularly. When someone adds a new node that touches an admin boundary, or when someone modifies the coastline, it gets flagged. What are the specific tags that are a priority for admin boundaries and oceans?

@poornibadrinath
We don't update either of them (admin boundaries or coastlines) in Mapbox Streets regularly. So we would guess this is not a priority, and we should stop detecting them and flagging them as suspicious. Is this correct, @manoharuss?

@poornibadrinath
Contributor

@bsrinivasa agreed. Since they are not updated regularly on Mapbox Streets, and since changes to them occur frequently on OSM, flagging them as rare and critical features would make no sense. Until there is some value in finding changesets regarding boundaries and oceans, we can remove them from the suspicious list.
/cc: @manoharuss

@krishnanammala
Contributor

We don't update either of them (admin boundaries or coastlines) in Mapbox Streets regularly. So we would guess this is not a priority, and we should stop detecting them and flagging them as suspicious. Is this correct, @manoharuss?

Yes, I agree. We are not worried about admin boundaries and coastlines, as they don't show up on Streets. That's the reason we have removed them from the straight detector too.

A few changesets have name tags being added. Would it be valuable to flag them?

This is similar to the case above ^^. Refer to ticket https://github.com/mapbox/osm-quarantine/issues/256

cc @poornibadrinath @manoharuss

@amishas157
Contributor

Thanks a lot @poornibadrinath for digging into this.
A couple of queries, and answers to your questions:

Are we going to flag every detail being added?
A few changesets have name tags being added. Would it be valuable to flag them?

This compare function is meant to catch newly created features, not deletions or modifications of existing ones. Therefore, any feature that satisfies these filters and is newly created will be caught.

Is flagging mountain ranges important? Many times, mappers are adding ranges that are not yet mapped. The ones I have noticed so far have been like that. Can we tweak it a bit to generate less noise?

Mountains became part of this compare function for two main reasons:

  • It is rare for an important mountain range (for example, the Himalayas) to be newly added in OSM.
  • They do affect rendering on maps.

But yes, the biggest challenge is to distinguish important mountain ranges from small peaks based on OSM tags. Maybe we can fetch the list of important mountain ranges from Wikipedia and flag them as they are added in OSM. This would be very similar to our osm-landmarks workflow.
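
A minimal sketch of the list-based idea, assuming the important ranges are available as a plain set of names. The sample names, the `natural=mountain_range` tag, and the function name are illustrative; a real list would be curated (for example, derived from Wikipedia) rather than hardcoded.

```python
# Hypothetical sample; a real list would be loaded from a curated source.
IMPORTANT_RANGES = {"himalayas", "andes", "alps", "rocky mountains"}


def is_important_range(properties):
    """Return True if the tags describe a mountain range on the curated list.

    Checks `name` and any localized `name:*` tag, ignoring case and
    surrounding whitespace.
    """
    if properties.get("natural") != "mountain_range":
        return False
    names = [
        value
        for key, value in properties.items()
        if key == "name" or key.startswith("name:")
    ]
    return any(
        isinstance(name, str) and name.strip().casefold() in IMPORTANT_RANGES
        for name in names
    )
```

Exact name matching is deliberately strict here. It misses spelling variants, so matching on a stable identifier would probably be more robust if one is available on the feature.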

If the mountain keys are removed, it is very rare to get any changesets regarding rare and critical features. Are there any other features that are really critical and rarely added that should be included in this list?

Yes, we need to figure out those features. Happy to extend this list if anything comes up that looks like it satisfies the objective of this compare function.

As Wille suggested, random castles being added to the map have a higher priority. Also, are viewpoints and monuments important? Right now they are tagged by the maps.me filter, but would it make sense to ramp it up a bit? Are they critical and rarely added?

Yes, I agree that random castles are being added to the map. A castle is a rare kind of feature, but not a critical one, so it does not seem to fall within the scope of this compare function. But we can definitely take a stab at it in a separate compare function.

Question:
Changes to admin boundaries and oceans occur regularly. When someone adds a new node that touches an admin boundary, or when someone modifies the coastline, it gets flagged. What are the specific tags that are a priority for admin boundaries and oceans?

I would like to clarify here that this compare function won't flag an admin boundary or ocean that is modified or deleted. It will only flag one when it is created (version = 1).

Also, one feature to look out for is beaches. There are lots of cases where the beach tag is added randomly. Will that tag qualify as something rare?

Yes, that tag would qualify as both rare and critical. But the problem is the same as with mountains: there is no clear distinction between important and not-so-important beaches. There are two options we can look at:

  • Get a list of important beaches from Wikipedia and check when any of them is added.
  • Per a voice chat with @poornibadrinath, I got to know that people add the beach tag to small water bodies that are not even near the sea. To catch that, we can think of context-based detection: as a beach is supposed to be added only near the sea, maybe we can look at the features near it and decide its level of suspiciousness based on that.
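
The context-based option could start from something as simple as a distance check against nearby coastline points. The sketch below assumes the coastline is available as a list of (lon, lat) points and uses an arbitrary 5 km threshold; both are assumptions for illustration, and the function names are made up.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def beach_is_suspicious(beach_lonlat, coastline_points, max_km=5.0):
    """Flag a beach whose nearest known coastline point is farther than max_km.

    With no coastline points at all, the beach is treated as suspicious.
    """
    lon, lat = beach_lonlat
    nearest = min(
        (haversine_km(lon, lat, clon, clat) for clon, clat in coastline_points),
        default=math.inf,
    )
    return nearest > max_km
```

A single yes/no threshold is the crudest form of "level of suspiciousness". Returning the distance itself would let reviewers sort flagged beaches by how far inland they are.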

I am going to continue looking out for the changesets of this comparator, so that we get a solid idea of how we want to improve this. Will add notes on the findings as and when they come up.

🙇 Let me know if anything above is unclear. ✌️

cc @bkowshik @manoharuss

@poornibadrinath
Contributor

poornibadrinath commented May 26, 2017

This is super detailed! @amishas157, thanks for the explanation. Next actions here would be:

  • To find a way to differentiate between mountain ranges that are super important and some minor hills tagged as mountain_ranges.

Maybe we can fetch the list of important mountain ranges from Wikipedia and flag them as they are added in OSM. This would be very similar to our osm-landmarks workflow.

This looks like a good place to start.

  • To differentiate between beaches in a similar way ^^
  • To think of other features that come under the scope of this feature detector.

Will be happy to help you on this :)
