Can't reproduce speed/memory benefits in benchmarks #124
Same results for me. This library tends to outperform on small and medium-size JSON; I'd consider the file you're using to be large.
Well, I used the test file from this library, so if that's the wrong size to show an improvement, I don't understand why the README claims one. I just ran a test using the first example JSON file from json.org, which is 552 bytes. Results:
Still slower. Using the servlet example from json.org, 3.5kB:
Using 58kB of country data from http://dumbdata.com/:
So whether the input is under 10 kB, 10-100 kB, 100-1000 kB, or over 1000 kB, I can't get this library to deserialize faster than the stdlib.
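A minimal sketch of how a benchmark like the one described above is often structured, using only the standard library's `testing.Benchmark` so it runs as a plain program. The payload and function names here are illustrative assumptions, not the actual test files from the repo; rerunning with the library under test imported in place of `encoding/json` would produce the comparison column.

```go
package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

// Small stand-in payload for the json.org samples mentioned above
// (illustrative only, not the actual 552-byte file).
var sample = []byte(`{"name":"country","codes":["a","b","c"],"population":1234567}`)

// decodeUntyped mirrors the benchmark style under discussion:
// unmarshaling into an empty interface.
func decodeUntyped() (interface{}, error) {
	var v interface{}
	err := json.Unmarshal(sample, &v)
	return v, err
}

func main() {
	// testing.Benchmark lets us run a benchmark outside `go test`.
	res := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			if _, err := decodeUntyped(); err != nil {
				b.Fatal(err)
			}
		}
	})
	fmt.Printf("%s  %d allocs/op\n", res, res.AllocsPerOp())
}
```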
Thanks for opening this discussion! I took a look at your benchmark code and noticed you are decoding the data into untyped variables (empty interface), which means the unmarshaling routine will create map[string]interface{} values to represent JSON objects and []interface{} values for arrays. These cause a lot of heap allocations and put pressure on the garbage collector. I haven't had time to run your benchmarks myself yet, but I would assume most of the time is spent on memory management because of this, which would explain why the different libraries do not yield different results in those particular tests. You can verify this hypothesis by comparing CPU profiles of each benchmark. I would recommend modifying the benchmarks to marshal/unmarshal into Go struct values so the libraries can make use of type information to optimize the operations.
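The allocation gap described here can be observed directly with `testing.AllocsPerRun` from the standard library. The payload and struct below are illustrative assumptions, not taken from the benchmark repo:

```go
package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

// Illustrative payload; any small JSON object shows the effect.
var data = []byte(`{"id":1,"name":"alpha","tags":["x","y"]}`)

// Typed target: gives the decoder static type information.
type record struct {
	ID   int      `json:"id"`
	Name string   `json:"name"`
	Tags []string `json:"tags"`
}

// allocsUntyped measures decoding into an empty interface, which
// forces map[string]interface{} and boxed values onto the heap.
func allocsUntyped() float64 {
	return testing.AllocsPerRun(1000, func() {
		var v interface{}
		json.Unmarshal(data, &v)
	})
}

// allocsTyped measures decoding into a struct, where fields are
// filled in place and far fewer heap objects are created.
func allocsTyped() float64 {
	return testing.AllocsPerRun(1000, func() {
		var r record
		json.Unmarshal(data, &r)
	})
}

func main() {
	fmt.Println("untyped allocs/op:", allocsUntyped())
	fmt.Println("typed allocs/op:  ", allocsTyped())
}
```

Comparing the two numbers makes the "heap allocations and GC pressure" point concrete: the untyped decode allocates several times more objects per operation.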
Thanks, that makes sense. Using autogenerated structs and the small json.org example:
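A sketch of the struct-based variant of the benchmark, with a hand-written stand-in for an autogenerated struct (the type and payload are assumptions for illustration, not the json.org sample itself):

```go
package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

// Hand-written stand-in for an autogenerated struct; a real run
// would use a struct generated from the json.org sample.
type widget struct {
	Name  string `json:"name"`
	Count int    `json:"count"`
}

var payload = []byte(`{"name":"servlet","count":42}`)

// decodeTyped unmarshals into the struct so the decoder can use
// its type information instead of building generic maps.
func decodeTyped() (widget, error) {
	var w widget
	err := json.Unmarshal(payload, &w)
	return w, err
}

func main() {
	res := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			if _, err := decodeTyped(); err != nil {
				b.Fatal(err)
			}
		}
	})
	fmt.Printf("%s  %d allocs/op\n", res, res.AllocsPerOp())
}
```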
I'm thinking it would be a good idea to put a note about this in the README.
I tried this library as a drop-in stdlib replacement and found that with our data, it was slightly worse than the stdlib in both memory and speed.
So I thought OK, benchmarks are highly dependent on the specific data used; I'll try with the sample data used by this project. To my surprise, the results were even worse: this library seemed to use more memory than the stdlib and to perform more slowly.
Then I noticed your README benchmarks were with Go 1.16.2, so I tried with that. Same outcome.
I feel like I must be doing something really wrong, so I've put together a repo with the code and some of the benchmark stats I got at https://github.com/lpar/segmentio