PBF error: illegal blob size #275

Open
mfathi91 opened this issue Jun 25, 2024 · 5 comments

@mfathi91

mfathi91 commented Jun 25, 2024

What version of osmium-tool are you using?

osmium version 1.16.0 (v1.16.0)
libosmium version 2.20.0

What operating system version are you using?

NAME="Ubuntu"
VERSION="20.04.6 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.6 LTS"
VERSION_ID="20.04"

Tell us something about your system

What did you do exactly?

My intention is to extract Delaware out of USA. These are the steps that I take to achieve this:

  1. Get boundaries of USA: osmium tags-filter --output USA.boundary.pbf --overwrite --no-progress USA.osm.pbf r/ISO3166-1:alpha3=USA
  2. Get boundaries of Delaware out of USA admin boundaries file: osmium tags-filter --output US-DE.boundary.pbf --overwrite --no-progress USA.boundary.pbf r/ISO3166-2=US-DE
  3. Create a config.json file for the extract command: config.json
  4. Extract Delaware out of USA using its boundaries: osmium extract --config config.json --option complete-partial-relations=65 --strategy smart --verbose --overwrite --no-progress USA.osm.pbf

These commands go through successfully and create the intended Delaware file, but the Delaware file seems to be corrupted. When I run osmium fileinfo -e Delaware.osm.pbf, it crashes showing the following error:

PBF error: illegal blob size
File:
  Name: Delaware.osm.pbf
  Format: PBF
  Compression: none
  Size: 650160326
Header:
  Bounding boxes:
  With history: no
  Options:
    generator=osmium/1.16.0
    pbf_dense_nodes=true
    pbf_optional_feature_0=Sort.Type_then_ID
    sorting=Type_then_ID

What did you expect to happen?

Running osmium fileinfo -e Delaware.osm.pbf should not crash saying PBF error: illegal blob size.

What did you do to try analyzing the problem?

Initially I suspected that the input data was corrupted, so I ran the following commands on USA.osm.pbf:

  • osmium fileinfo -e USA.osm.pbf
  • osmium check-refs USA.osm.pbf
    Both commands print the statistics correctly and report no referential integrity issues. It is also worth noting that I am able to cut all the other US states, and all of them pass the fileinfo and check-refs commands.

To summarize, my input PBF is valid (both commands above succeed on it), but the output PBF produced by extract is corrupted (neither command completes on it). Given this, would you agree that there might be an issue in osmium-tool? Do you need me to send you some pieces of the data?

@joto
Member

joto commented Jun 25, 2024

I cannot reproduce this. I used the USA file from Geofabrik and followed the steps you describe, and the resulting file is fine.

I had to repair the config file you provided; it does not work otherwise. Are you sure the osmium extract command even ran? Maybe the broken Delaware.osm.pbf is left over from an earlier attempt?

The other thing: The Delaware.osm.pbf you seem to have is rather large, 650 MB. Mine is only 19 MB.

@mfathi91
Author

mfathi91 commented Jun 25, 2024

The input PBF file that I have for USA is indeed enriched with other data sources. That's why it's bigger. Would it be ok for you if I upload a piece of USA that I'm working with somewhere, and send it to you to reproduce the issue? Shall I send the link privately to jochen@topf.org ? Otherwise I'd violate my company's data privacy.

Also what was the repair that you did on the config file?

> Are you sure the osmium extract command even ran? Maybe the broken Delaware.osm.pbf is left over from an earlier attempt?

Yes, I'm pretty sure that osmium extract ran, since I see the output of osmium about extract.

@joto
Member

joto commented Jun 25, 2024

Ah, you should have mentioned that you are working with proprietary data. If you want support for that, please contact me by email and I'll send you my consulting rates.

@mfathi91
Author

mfathi91 commented Jun 26, 2024

@joto thank you for your answer.

> If you want support for that, please contact me by email and I'll send you my consulting rates.

OK. If we decide that we definitely must send our proprietary data to you, I'll pass your consulting rates on to my manager.

My suspicion concerns the can_add function in libosmium. Please see here:
https://github.com/osmcode/libosmium/blob/f88048769c13210ca81efca17668dc57ea64c632/include/osmium/io/detail/pbf_output_format.hpp#L362

The line return size() < max_used_blob_size; checks whether the current size of the blob is below the maximum allowed blob size. The entity is then added to the blob without checking whether the entity itself is large enough to push the blob over the limit.

In other words, we don't check size() + sizeOfEntityToBeAdded < max_used_blob_size.

Is my understanding of the can_add function correct?
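
The concern above can be illustrated with a toy model. The sketch below is not libosmium code: ToyBlock, can_add_current_only, can_add_with, and largest_block are hypothetical names, and the limit of 100 bytes is made up. It only shows how a check on the current size alone lets a block end up larger than the limit, while a check that includes the incoming entity's size does not. Whether libosmium is actually affected depends on how much headroom max_used_blob_size leaves below the hard limit, which this sketch does not model.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a block that is being filled. NOT libosmium's class.
struct ToyBlock {
    std::size_t limit;  // plays the role of max_used_blob_size
    std::size_t used;   // plays the role of size()

    explicit ToyBlock(std::size_t l) : limit(l), used(0) {}

    // Check in the style quoted above: looks only at the current fill level.
    bool can_add_current_only() const noexcept {
        return used < limit;
    }

    // Size-aware variant: also accounts for the entity about to be added.
    bool can_add_with(std::size_t entity_size) const noexcept {
        return used + entity_size < limit;
    }
};

// Fills blocks with entities of the given sizes (hypothetical helper) and
// returns the size of the largest block that was produced.
std::size_t largest_block(const std::vector<std::size_t>& entity_sizes,
                          std::size_t limit,
                          bool size_aware) {
    assert(limit > 0);
    ToyBlock block(limit);
    std::size_t largest = 0;
    for (const std::size_t s : entity_sizes) {
        const bool ok = size_aware ? block.can_add_with(s)
                                   : block.can_add_current_only();
        if (!ok) {
            // "Flush" the current block and start a new, empty one.
            largest = std::max(largest, block.used);
            block.used = 0;
        }
        // A single entity that is itself as large as the limit still
        // overshoots in both variants; the toy does not handle that case.
        block.used += s;
    }
    return std::max(largest, block.used);
}
```

With entity sizes {40, 40, 40} and a limit of 100, the current-size-only check produces a single block of 120 bytes, while the size-aware check produces a largest block of 80 bytes.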

@mfathi91
Author

Side note: as a workaround, I changed max_entities_per_block to 1000 (instead of the default 8000) and recompiled osmium and libosmium. The PBFs that osmium extract creates with this custom build are valid (no more illegal blob size errors).

Would you agree that this is a sign that the can_add function does not always return the correct value, as I described above?
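
One way to see why a lower entity cap could hide such a problem is a back-of-the-envelope bound. This sketch assumes the byte-size check never triggers, so that only max_entities_per_block limits a block. The average of 5000 bytes per entity is a made-up figure for tag-heavy data, and the function names are hypothetical. The 32 MiB figure is the hard upper bound on an uncompressed blob in the OSM PBF format description.

```cpp
#include <cstdint>

// Hard upper bound on an uncompressed blob according to the OSM PBF
// format description (32 MiB).
constexpr std::uint64_t pbf_hard_blob_limit = 32ULL * 1024ULL * 1024ULL;

// Bytes in a block that holds `max_entities` entities of
// `avg_entity_bytes` each (hypothetical helper, toy model only).
constexpr std::uint64_t block_bytes_at_cap(std::uint64_t max_entities,
                                           std::uint64_t avg_entity_bytes) noexcept {
    return max_entities * avg_entity_bytes;
}

// True if such a block stays below the hard limit.
constexpr bool fits_hard_limit(std::uint64_t max_entities,
                               std::uint64_t avg_entity_bytes) noexcept {
    return block_bytes_at_cap(max_entities, avg_entity_bytes) < pbf_hard_blob_limit;
}

// 8000 entities * 5000 bytes = 40,000,000 bytes, above 32 MiB (33,554,432).
static_assert(!fits_hard_limit(8000, 5000), "cap of 8000 overshoots in this toy case");
// 1000 entities * 5000 bytes = 5,000,000 bytes, well below 32 MiB.
static_assert(fits_hard_limit(1000, 5000), "cap of 1000 fits in this toy case");
```

Under these assumptions a cap of 1000 would keep blocks small enough even if the size check were ineffective, which is consistent with the workaround but does not prove where the fault lies.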
