Complete Stream leads to data loss #127

Open
j-bbr opened this issue May 8, 2021 · 3 comments

@j-bbr

j-bbr commented May 8, 2021

I am reading from an OSM PBF file with a filter. Calling ToComplete() leaves only nodes: all ways and relations that are present in the non-materialized stream are removed. Maybe related to #123. I can provide a repro project in case you're interested.

@xivk
Contributor

xivk commented May 10, 2021

Can you still reproduce this, for example in a unit test?

@j-bbr
Author

j-bbr commented May 11, 2021

Here is a test (all playgrounds in Berlin) that fails:

Test Output
----Normal----
OsmSharp.Node 1021
OsmSharp.Way 3031
OsmSharp.Relation 25
----Complete----
OsmSharp.Node 1021

So it seems that no completed ways or relations are returned at all.

using System.Collections.Generic;
using System.Collections.Immutable;
using System.IO;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;
using NUnit.Framework;
using OsmSharp.Streams;
using OsmSharp.Tags;

public class OsmSharpTest
{
    // Downloads the Berlin extract to a temporary file and returns its path.
    public async Task<string> DownloadedFile()
    {
        using var httpClient = new HttpClient();
        await using var downloadStream = await httpClient.GetStreamAsync("https://download.geofabrik.de/europe/germany/berlin-latest.osm.pbf");
        var filePath = Path.GetTempFileName();
        await using var fileStream = File.OpenWrite(filePath);
        await downloadStream.CopyToAsync(fileStream);
        return filePath;
    }

    // Reads all objects tagged leisure=playground, optionally passing them through ToComplete().
    private IEnumerable<object> GetItemsFromFile(string path, bool complete)
    {
        using var source = new PBFOsmStreamSource(new FileInfo(path).OpenRead());

        var includeTags = new[] { new Tag("leisure", "playground") }.ToImmutableHashSet();
        var filteredItems = source
            .Where(item => item.Tags.Any(tag => includeTags.Contains(tag)));
        // Materialize before the source is disposed.
        return complete ? filteredItems.ToComplete().ToArray() : filteredItems.ToArray();
    }

    // Writes one line per runtime type with the number of objects of that type.
    private async Task LogTypeCounts(IEnumerable<object> objects)
    {
        foreach (var type in objects.GroupBy(o => o.GetType()))
            await TestContext.Progress.WriteLineAsync($"{type.Key} {type.Count()}");
    }

    [Test]
    public async Task Complete_Read_Test()
    {
        var downloadedPath = await DownloadedFile();
        var normalObjects = GetItemsFromFile(downloadedPath, false);
        var completeObjects = GetItemsFromFile(downloadedPath, true);
        File.Delete(downloadedPath);
        await TestContext.Progress.WriteLineAsync("----Normal----");
        await LogTypeCounts(normalObjects);
        await TestContext.Progress.WriteLineAsync("----Complete----");
        await LogTypeCounts(completeObjects);
        // Fails: the completed stream contains only the nodes.
        Assert.AreEqual(normalObjects.Count(), completeObjects.Count());
    }
}

@j-bbr
Author

j-bbr commented May 25, 2021

Any idea on what could be the issue here?
