Skip to content
This repository was archived by the owner on Mar 27, 2020. It is now read-only.

superposed nodes when importing Polygons/MultiPolygons and closed LineStrings, and area=yes/no OSM tags #12

Open
verdy-p opened this issue Oct 12, 2017 · 2 comments
Labels
Hacktoberfest https://hacktoberfest.digitalocean.com

Comments

@verdy-p
Copy link

verdy-p commented Oct 12, 2017

GeoJSON does not detail nodes in the "coordinates" array embedded in geometries for Polygons (single area possibly with holes)/MultiPolygons(simple array of Polygons), or LineStrings (possibly closed)/MultiLineStrings(simple array of LineStrings).

The import tool currently creates a single new OSM node for each coordinate pair, including for the first and last [x,y] pair of every ring member of a Polygon (which are implicitly closed), or of any polygon member of a MultiPolygon, when they are the same (this is not a requirement for rings of Polygons/MultiPolygons as they are necessarily closed to represent surfaces, and if the last coordinates pair in a ring is not the same as the first one, the rings MUST still be closed as it it was there by as if the missing last pair was present).

Because coordinates pairs are compeltely untagged in GeoJSON geometries (except when they are in a feature whose geometry is only a single node), it makes no sense to not merge the duplicate nodes coming from the same geometry object (or the same "coordinates": properties of any imported "Feature" type).

Please merge these nodes, and make sure that ring members in Polygon (first ring for the outer border, 2nd and following rings for the optional inner borders) and MultiPolygons are effectively closed:

This closure is implicit in GeoJSON, where a triangular polygon like this one is still valid and closed:

{"type":"Polygon","coordinates":[
  [[x0,y0], [x1,y1], [x2,y2]]
]}

and equivalent to:

{"type":"Polygon","coordinates":[
  [[x0,y0], [x1,y1], [x2,y2], [x0,y0]]
]}

which should still be imported as 3 OSM nodes (not 4) and a OSM closed way connecting the 3 nodes.

As well this tetragonal area with an inner triangular hole (sharing a common node between outer and inner rings):

[x3,y3]                       [x2,y2]
       +---------------------+
        \                   /
         \[x4,y4]  [x5,y5] /
          \   +---+       /
           \  |  /       /
            \ | /       /
             \|/       /
              +-------+
       [x0,y0]         [x1,y1]

represented in GeoJSON with this geometry:

{"type":"Polygon","coordinates":[
  [[x0,y0], [x1,y1], [x2,y2], [x3,y3]],
  [[x4,y4], [x5,y5], [x0,y0]]
]}

or equivalently as:

{"type":"Polygon","coordinates":[
  [[x0,y0], [x1,y1], [x2,y2], [x3,y3], [x0,y0]],
  [[x4,y4], [x5,y5], [x0,y0], [x4,y4]]
]}

which should be imported as 6 OSM nodes (not 9 because [x0,y0] occurs 3 times, and [x4,y4] occurs 2 times), two OSM ways (one for each ring), and a OSM relation with "type=multipolygon" whose members will be the two rings, the first one with "outer" role, the second with "inner" role.

In contrast, the following "similar" tetragonal and triangular areas:

[x3,y3]                       [x2,y2]
       +---------------------+
        \                   /
         \                 /
          \               /
           \             /
            \           /
             \         /
              +-------+
       [x0,y0]|\       [x1,y1]
              | \
              |  \
              +---+
          [x4,y4]  [x5,y5]

represented in GeoJSON as:

{"type":"MultiPolygon","coordinates":[
  [
    [[x0,y0], [x1,y1], [x2,y2], [x3,y3]]
  ],[
    [[x5,y5], [x0,y0], [x4,y4]]
  ]
]}

or equivalently as:

{"type":"MultiPolygon","coordinates":[
  [
    [[x0,y0], [x1,y1], [x2,y2], [x3,y3], [x0,y0]]
  ],[
    [[x5,y5], [x0,y0], [x4,y4], [x5,y5]]
  ]
]}

will be imported as two closed ways, but no OSM "multipolygon" relation, unless they are part of a "Feature" with properties (in which case both closed ways will be members with "outer" role), and there will also be 6 OSM nodes only (not 9): the generated nodes should also be merged as they are part of the same geometry object (and necessarily have the same tags in the generated OSM feature, but OSM nodes themselves for this geometry will have no tags at all).

Notes: in strict GeoJSON, the outer rings should be traced anticlockwise and the inner rings should be clockwise. As well rings should not intersect except on isolated nodes with supplied coordinate pairs; outer rings should not overlap any non-zero surface; inner rings should be fully contained in the outer ring which is the first member of the polygon, and inner rings should not overlap themselves any non-zero surface. It is still valid (but not recommanded in GeoJSON) for a Multipolygon to have shared borders (but it is permitted if they are part of separate geometries in distinct Features with possibly separate GeoJSON properties, i.e. separate OSM tags). Such strict validation constraints do not need to be enforced when importing GeoJSON, but the JOSM validator may detect these. These constraints should be checked when exporting/saving from JOSM to a new GeoJSON, because GeoJSON applications may depend on these constraints for correct handling or rendering.


Also for Polygon/MultiPolygon, the imported OSM feature should also have tag "area=yes" as polygons in GeoJSON are necessarily representing surfaces, not just their border. This is different from GeoJSON "LineString/MultiLinestring", where the implicit OSM tag should be "area=no".

This would allow saving/exporting from JOSM to GeoJSON with the correct GeoJSON type: currently the imported Polygons/MultiPolygons will be saved incorrectly as "Linestring" or "MultiLineStrings" (the export does not figure out correctly that OSM closed ways or relations represent only borders or the surface, the surface interpretation should be assumed for OSM relations with "type=multipolygon" or "type=boundary", or for OSM closed ways with "landuse=" or "natural=" or "building=" or "water=", unless they have "area=no", or for any closed way with "area=yes").

When exporting to a geojson, the OSM tag "area=yes/no" should be discarded from the properties of the exported feature, it will just be used to select if a [Multi]LineString or [Multi]Polygon will be generated.

@verdy-p verdy-p changed the title superposed nodes when importing Polygons/MultiPolygons and closed LineStrings superposed nodes when importing Polygons/MultiPolygons and closed LineStrings, and area=yes/no OSM tags Oct 12, 2017
floscher added a commit that referenced this issue Nov 15, 2018
Even if the first and last node are not identical, the resulting OSM ways will loop back to the start node when importing from GeoJSON.

This should solve most of #12 (#12). Polygons and Multipolygons are now always imported as closed ways, LineStrings only if first and last node have the same coordinate.
@floscher
Copy link
Member

@verdy-p Thank you for raising this issue. It has been a while, but starting with 16038bd , part of your points should be solved.

Polygons and Multipolygons will now always be imported as closed ways. LineStrings will only be imported as closed ways if first and last node have the same coordinate.

@verdy-p
Copy link
Author

verdy-p commented Nov 15, 2018

Ok, now there remains to solve the correct interpretation of imported rings, to determine if it generates simple OSM closed ways (rings), or create them as members of a single OSM multipolygon relation: this is not so simple, because you also need to determine their OSM role ("inner" or "outer") by checking which rings are enclosing others.

The import could fail if they intersect and the common surface is not 0% or 100% (such geometry is normally invalid also in GeoJSON).

But it should NEVER fail if the intersection between pairs of rings represents a surface of 0% or 100% of any ring, even if there are common/superposed nodes, or common/superposed line segments. If there are common segments, the import in OSM should try splitting them on identical nodes, and then remove duplicate segments, before trying to determine the new rings, and then their "inner" or "outer" roles.

The import would be safer if simply the GeoJSON was just parsed as a collection of independant segments, and then a loop will detect if they intersect somewhere else than just their two end-points: if there's such intersection, these segments would be split on the computed intersection points (non colinear case), or on the existing endpoints of the other segment.

Then a second pass will eliminate all pairs of segments (defined as unordered set of two points, which can be normalized by creating an ordered pair, where ordering is the point index in a separate map of points)

Basically:

  • the algorithm first starts by creating an empty PointMap of (x,y) to integer and an PointArray of (x,y): each new point detected is first looked up in the PointMap; if it is found then it uses the point number returned for the segment endpoint. Else it adds a new (x,y) pair in the PointArray, and stores its integer position in the Map.
  • it also creates an initial SegmentMap of (pointNumber a, pointNumber b) to integer and a SegmentArray of (pointNumber a, pointNumber b): a and b are determined from the previous PointMap, but then ordered so that a < b (if a==b) then you don't need to store any segment.
  • For each GeoJSON polygon parsed, it uses composes it into the SegmentMap (and its associated SegmentArray): each time a segment to add is present (only compared point numbers), it is removed, else it is added. At end, all segments are different.
  • Then it will look for all pairs of segments in the SegmentArray to detect if they intersect using the (x,y) coordinates of points: if they are colinear they will be split only on endpoints and no new point will need to be added in the PointMap and array, but if they are not colinear, you need to determine the intersection point of straight lines passing through each segment, and see if it falls at the in the middle of one of the two segments, then it will split that segment into two parts, by first adding a new point (using the PointMap to determine a point number to use in the PointArray) if needed. Each new segment if checked in the SegmentMap to see if it must be added, or removed.
  • At this step, the SegmentMap and PointMap are no longer needed, only the SegmentArray and PointArray are useful.
  • Then it will start looking up for closed rings by forming chains of segments (in SegmentArray) to form chains, using a priority order when there ares several more than two segments sharing the same point number, if there's a even number of segments sharing the same node (as start or end node).

To do that it will be useful to first a new PointSegmentsMap from a point number (in PointArray) to a vector of segment numbers (in SegmentArray); this will allow to detect points that have "multiplicities" (size of their vector) higher than 2 which are the only ones that require special checks:

  • All nodes with multiplicity equal to 1 are start or end points of a future non-closed polyline.
  • All nodes with multiplicty equal to 2 are intermediate nodes of a future non-closed polyline or closed ring.
  • All nodes with higher multiplicity are at intersection of two future polylines or closed ring: you need to sort the vectors found in PointVectorsMap for these node by direction so they all rotate in the same angular direction (you don't need to compute actual angles, this can be done arithmetically by computing the tangent and cotangent of actual angles, by just using arithmetic on their coordinates and even without using any division; this can be done in the compare(segmentnumber a, segmentnumber b) function used by a simple quick sort of PointSegmentsMap[pointnumber].

Then create a new empty LinesArray (you don't know which polyline will be a closed ring before they are completed) and a new RingsArray. the by just scanning points in PointArray from top to bottom, taking and removing only one segment (e.g. the last one) from its associated PointSegmentsMap[n] vector and trying to connect it to the first or last node of existing lines in LinesArray.

If when you detect that the polyline is closed by the additional segment, remove the polyline from the set of Polylines, and add it to a separate RingsArray, otherwise add a point to the existing polyline.

At this step you may want to simplify the geometry of polylines and rings by merging segments that are colinear and in the same direction. Finally in the simplified array of rings, if this caused any ring to become completely "flat" (i.e. only two segments remaining in it), this ring can be eliminated from the set of rings.

At this step you have two sets: LinesArray and RingsArray. This is almost finished: one will become a set of OSM polylines (which can be put in a separate OSM Multipolygon relation with area=no, or just a single unclosed way if there's only one), the other will become a a single OSM Multipolygon relation with area=yes (or a single closed way, if there's only one).

But you need to determine the roles of each ring in the second OSM relation (area=yes): rings have to be first oriented (e.g. anticlockwise) so they become comparable (note: all these rings are such that each pair have now 0% or 100% surface intersection) : you need to sort the rings so that if ring A is covered at 100% by ring B, then A will be sorted after B, and pairs of rings that have 0% intersection are sorted in random order: you just need to create a map of ring numbers to other rings which are covering 100% of their surface (note: coverage does not take into account nodes or segments, only what is inside): you may just keep a counter of how many other rings are intersecting at 100%, because ONLY the even/odd parity of this number determines if this is an "outer" role (even counter) or "inner" (odd counter).

But you may also want to order these rings so that the outermost ones (with "inner" or outer" role in OSM) will always come before the others contained in them (with "inner" or "outer" roles): this ordering is equivalent to sorting the set of rings by decreasing counter values (instead of just keeping the even/odd info): the counter values are not unique (so this is a partial sort), but the relative order of rings with identical counters does not matter, as they cannot have one containing 100% of the other.

This algorithm can then be used to import arbitrary GeoJSON geometry(polylines, multipolylines, polygons, multipolygons), including invalid ones, it will generate at most:

  • 1 OSM unclosed way, or 1 OSM multipolygon relation with area=no containing only unclosed ways with empty roles (and a minimum number of unclosed ways: those can be interconencted will always be interconnected as a single unclosed way, none of them have common starting or ending node)
  • 1 OSM closed way, or 1 OSM multipolygon relation with area=yes containing only closed ways with roles "inner" or "outer" (each pair of them are intersecting only a 0% or 100% common surface, none of them have common segments, they can only have common isolated nodes)

You may choose to merge the two multipolygons into a single one without specifying the area="yes/no" attribute, given that ways are distringuished (by their closed or unclosed geometry, or by the role "inner" or "outer" assigned to those that are closed). In that case it will remain only 1 OSM parent object containing everything, with always a valid geometry and then you can map it as a single feature in OSM when it was a single Feature in the GeoJSON !

The same algorithm can be used to perform the reverse transform (from OSM to GeoJSON, but you need an additional step of ordering segments in rings in "anticlockwise" or "clockwise" or direction depending if they are determined to be respectively "outer" or "inner".

The same algorithm can also be used to transform an invalid GeoJSON geometry into a valid one, or an invalid OSM geometry into a valid one: the algorithm performs all the necessary corrections (using rules similar to those used in the SVG property "fill-rule: evenodd" for filling SVG paths)

The same algorithm can also be used in the OSM editor itself to simplify the geometry of any OSM relation containing ways with roles "outer" or "inner", or tagged as "area=yes", and reordering their members, then merging successive ways part of the reordered rings that have the same attributes.

  • It can become a UI action in the JOSM relation editor (more powerful than the existing "sort" button).
  • If the algorithm is implemented to work only on OSM relations, then the simplest way to import a GeoJSON to OSM is to just perform a basic mapping of GeoJSON geometries to a new OSM multipolygon, and then apply the algorithm on it: if there remains only one way in it, keep only that way and you don't need the relation. Then, just import properties of each GeoJSON features, as OSM tags of that single remaining OSM object.

@don-vip don-vip added the Hacktoberfest https://hacktoberfest.digitalocean.com label Oct 5, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Hacktoberfest https://hacktoberfest.digitalocean.com
Projects
None yet
Development

No branches or pull requests

3 participants