Can't reproduce speed/memory benefits in benchmarks #124
Same results for me. This library tends to outperform on small and medium-size JSON; I'd consider the file you're using to be large.
Well, I used the test file from this library, so if that's the wrong size to show an improvement, I don't understand why the README claims one. I just ran a test using the first example JSON file from json.org, which is 552 bytes. Results:
Still slower. Using the servlet example from json.org, 3.5kB:
Using 58kB of country data from http://dumbdata.com/:
So whether the input is under 10 kB, 10-100 kB, 100-1000 kB, or over 1000 kB, I can't get this library to deserialize faster than the stdlib.
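A minimal sketch of how a benchmark like the one described above is often structured, using only the standard library's `testing.Benchmark` so it runs as a plain program. The payload and function names here are illustrative assumptions, not the actual test files from the repo; rerunning with the library under test imported in place of `encoding/json` would produce the comparison column.

```go
package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

// Small stand-in payload for the json.org samples mentioned above
// (illustrative only, not the actual 552-byte file).
var sample = []byte(`{"name":"country","codes":["a","b","c"],"population":1234567}`)

// decodeUntyped mirrors the benchmark style under discussion:
// unmarshaling into an empty interface.
func decodeUntyped() (interface{}, error) {
	var v interface{}
	err := json.Unmarshal(sample, &v)
	return v, err
}

func main() {
	// testing.Benchmark lets us run a benchmark outside `go test`.
	res := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			if _, err := decodeUntyped(); err != nil {
				b.Fatal(err)
			}
		}
	})
	fmt.Printf("%s  %d allocs/op\n", res, res.AllocsPerOp())
}
```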
Thanks for opening this discussion! I took a look at your benchmark code and noticed you are decoding the data into untyped variables (empty interface), which means the unmarshaling routine will create map[string]interface{} values to represent JSON objects and []interface{} values for arrays. These cause a lot of heap allocations and put pressure on the garbage collector. I haven't had time to run your benchmarks myself yet, but I would assume most of the time is spent on memory management because of this, which would explain why the different libraries do not yield different results in those particular tests. You can verify this hypothesis by comparing CPU profiles of each benchmark. I would recommend modifying the benchmarks to marshal/unmarshal into Go struct values so the libraries can make use of type information to optimize the operations.
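The allocation gap described here can be observed directly with `testing.AllocsPerRun` from the standard library. The payload and struct below are illustrative assumptions, not taken from the benchmark repo:

```go
package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

// Illustrative payload; any small JSON object shows the effect.
var data = []byte(`{"id":1,"name":"alpha","tags":["x","y"]}`)

// Typed target: gives the decoder static type information.
type record struct {
	ID   int      `json:"id"`
	Name string   `json:"name"`
	Tags []string `json:"tags"`
}

// allocsUntyped measures decoding into an empty interface, which
// forces map[string]interface{} and boxed values onto the heap.
func allocsUntyped() float64 {
	return testing.AllocsPerRun(1000, func() {
		var v interface{}
		json.Unmarshal(data, &v)
	})
}

// allocsTyped measures decoding into a struct, where fields are
// filled in place and far fewer heap objects are created.
func allocsTyped() float64 {
	return testing.AllocsPerRun(1000, func() {
		var r record
		json.Unmarshal(data, &r)
	})
}

func main() {
	fmt.Println("untyped allocs/op:", allocsUntyped())
	fmt.Println("typed allocs/op:  ", allocsTyped())
}
```

Comparing the two numbers makes the "heap allocations and GC pressure" point concrete: the untyped decode allocates several times more objects per operation.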
Thanks, that makes sense. Using autogenerated structs and the small json.org example:
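A sketch of the struct-based variant of the benchmark, with a hand-written stand-in for an autogenerated struct (the type and payload are assumptions for illustration, not the json.org sample itself):

```go
package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

// Hand-written stand-in for an autogenerated struct; a real run
// would use a struct generated from the json.org sample.
type widget struct {
	Name  string `json:"name"`
	Count int    `json:"count"`
}

var payload = []byte(`{"name":"servlet","count":42}`)

// decodeTyped unmarshals into the struct so the decoder can use
// its type information instead of building generic maps.
func decodeTyped() (widget, error) {
	var w widget
	err := json.Unmarshal(payload, &w)
	return w, err
}

func main() {
	res := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			if _, err := decodeTyped(); err != nil {
				b.Fatal(err)
			}
		}
	})
	fmt.Printf("%s  %d allocs/op\n", res, res.AllocsPerOp())
}
```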
I'm thinking it would be a good idea to put a note about this in the README.
I tried this library as a drop-in stdlib replacement and found that with our data, it was slightly worse than the stdlib in both memory and speed.
So I thought OK, benchmarks are highly dependent on the specific data used; I'll try with the sample data used by this project. To my surprise, the results were even worse: this library seemed to use more memory than the stdlib and to perform more slowly.
Then I noticed your README benchmarks were with Go 1.16.2, so I tried with that. Same outcome.
I feel like I must be doing something really wrong, so I've put together a repo with the code and some of the benchmark stats I got at https://github.com/lpar/segmentio