Performance: store path in stack to avoid copying context #438
Codecov Report
Additional details and impacted files

@@            Coverage Diff             @@
##           master     #438      +/-   ##
==========================================
- Coverage   76.02%   75.94%   -0.09%
==========================================
  Files          13       13
  Lines        4692     4701       +9
==========================================
+ Hits         3567     3570       +3
- Misses        866      870       +4
- Partials      259      261       +2
Benchmark

package main

import (
	"bytes"
	"fmt"
	"testing"

	"github.com/goccy/go-yaml"
	goyaml2 "gopkg.in/yaml.v2"
	goyaml3 "gopkg.in/yaml.v3"
)

// prepareSampleData builds a YAML sequence of about `line` lines,
// five lines (one mapping) per list item.
func prepareSampleData(line int) []byte {
	var data bytes.Buffer
	for i := 0; i < line/5; i++ {
		data.WriteString(`- propertyA: valueA
  propertyB: 2
  propertyC: test
  propertyD: example
  propertyE: example2
`)
	}
	return data.Bytes()
}

// sampleData mirrors one list item of the generated YAML.
type sampleData struct {
	PropertyA string `yaml:"propertyA"`
	PropertyB string `yaml:"propertyB"`
	PropertyC string `yaml:"propertyC"`
	PropertyD string `yaml:"propertyD"`
	PropertyE string `yaml:"propertyE"`
}

// BenchmarkGoccyYAML unmarshals the same input with three YAML libraries
// at three input sizes.
func BenchmarkGoccyYAML(b *testing.B) {
	lines := []int{100, 1000, 10000}
	for _, line := range lines {
		sample := prepareSampleData(line)
		b.Run(fmt.Sprintf("%d lines", line), func(b *testing.B) {
			b.Run("github.com/goccy/go-yaml", func(b *testing.B) {
				var datas []sampleData
				for i := 0; i < b.N; i++ {
					err := yaml.Unmarshal(sample, &datas)
					if err != nil {
						b.Fatal(err)
					}
				}
			})
			b.Run("gopkg.in/yaml.v2", func(b *testing.B) {
				var datas []sampleData
				for i := 0; i < b.N; i++ {
					err := goyaml2.Unmarshal(sample, &datas)
					if err != nil {
						b.Fatal(err)
					}
				}
			})
			b.Run("gopkg.in/yaml.v3", func(b *testing.B) {
				var datas []sampleData
				for i := 0; i < b.N; i++ {
					err := goyaml3.Unmarshal(sample, &datas)
					if err != nil {
						b.Fatal(err)
					}
				}
			})
		})
	}
}

To compare before and after the fix, I have to use
Result:
From the results, we can see:
Let me know if you need anything else 🙏
This looks like a good change to me. Comparing with 1.11.3 using a complex ~6MB sample:
And it looks okay compared to the last good version, 1.9.2:
Looks good to us also. See benchmarks in the related issue.
@goccy would you mind having a look at the PR? 🙏 Thank you!
Update our fork with the performance patch from goccy/go-yaml#438.

```
goos: linux
goarch: amd64
pkg: github.com/cerbos/cerbos/internal/parser
cpu: 13th Gen Intel(R) Core(TM) i7-1360P
                 │ parser_before.txt │           parser_after.txt            │
                 │      sec/op       │    sec/op     vs base                 │
Unmarshal/10-16        3399.0µ ± 3%     847.2µ ± 4%   -75.08% (p=0.000 n=10)
Unmarshal/50-16        50.093m ± 7%     3.847m ± 2%   -92.32% (p=0.000 n=10)
Unmarshal/100-16      191.559m ± 6%     7.504m ± 3%   -96.08% (p=0.000 n=10)
Unmarshal/500-16      4674.56m ± 3%     35.09m ± 4%   -99.25% (p=0.000 n=10)
geomean                 111.1m          5.412m        -95.13%

                 │ parser_before.txt │           parser_after.txt            │
                 │        B/s        │     B/s       vs base                 │
Unmarshal/10-16        1.030Mi ± 3%     4.134Mi ± 4%    +301.39% (p=0.000 n=10)
Unmarshal/50-16        341.8Ki ± 6%    4423.8Ki ± 2%   +1194.29% (p=0.000 n=10)
Unmarshal/100-16       175.8Ki ± 6%    4506.8Ki ± 3%   +2463.89% (p=0.000 n=10)
Unmarshal/500-16       39.06Ki ± 0%   4794.92Ki ± 4%  +12175.00% (p=0.000 n=10)
geomean                223.1Ki          4.380Mi        +1910.85%

                 │ parser_before.txt │           parser_after.txt            │
                 │       B/op        │     B/op      vs base                 │
Unmarshal/10-16       4258.8Ki ± 0%     384.6Ki ± 0%  -90.97% (p=0.000 n=10)
Unmarshal/50-16       91.248Mi ± 0%     1.813Mi ± 0%  -98.01% (p=0.000 n=10)
Unmarshal/100-16     358.297Mi ± 0%     3.596Mi ± 0%  -99.00% (p=0.000 n=10)
Unmarshal/500-16     7944.51Mi ± 0%     18.98Mi ± 0%  -99.76% (p=0.000 n=10)
geomean                181.3Mi          2.611Mi       -98.56%

                 │ parser_before.txt │           parser_after.txt            │
                 │     allocs/op     │  allocs/op    vs base                 │
Unmarshal/10-16        10.595k ± 0%      8.978k ± 0%  -15.26% (p=0.000 n=10)
Unmarshal/50-16         50.64k ± 0%      42.93k ± 0%  -15.23% (p=0.000 n=10)
Unmarshal/100-16       100.83k ± 0%      85.38k ± 0%  -15.32% (p=0.000 n=10)
Unmarshal/500-16        505.9k ± 0%      425.6k ± 0%  -15.87% (p=0.000 n=10)
geomean                 72.33k           61.17k       -15.42%
```

Signed-off-by: Charith Ellawala <charith@cerbos.dev>
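For readers unfamiliar with the "vs base" column, the short sketch below shows how such a percentage can be derived from the before and after medians. `percentChange` is a hypothetical helper written for this illustration, not part of benchstat or go-yaml, and the inputs are the rounded medians printed in the commit message, so the last digit may differ slightly from what the tool computes internally.

```go
package main

import "fmt"

// percentChange returns the relative change from before to after, in percent.
// Negative values mean the "after" measurement is smaller.
// Hypothetical helper for illustration only.
func percentChange(before, after float64) float64 {
	return (after - before) / before * 100
}

func main() {
	// Median sec/op for Unmarshal/10-16, both values in microseconds.
	fmt.Printf("Unmarshal/10:  %.2f%%\n", percentChange(3399.0, 847.2))

	// Median sec/op for Unmarshal/500-16, both values in milliseconds.
	fmt.Printf("Unmarshal/500: %.2f%%\n", percentChange(4674.56, 35.09))
}
```

Both operands must be in the same unit before they are compared, which is why each call above uses one unit for both values.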
@goccy can you please take a look at this PR?
@yoelsusanto Thank you for your contribution. Also, sorry for the late reply.
Fixes #325
This PR dramatically improves the performance of the go-yaml library, especially when parsing larger YAML files.
This is the methodology I used to arrive at the current solution:
- Used the `go test` tool to create a CPU profile when doing `parser.ParseBytes`.
- `withChild` and `withIndex` are the two functions that dominated the CPU utilization.
- [...] `parser.copy`. This operation is performed repeatedly when parsing the YAML tokens.
- [...] `context.tokens`.

I realize the implementation could be refactored further to change how the context is used, but I am aiming for a minimal change in this PR. Please let me know if you have any suggestions 🙏. Thank you!
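To make the idea in the PR title concrete, here is a minimal sketch of the general technique: tracking the current path on a stack instead of cloning the context for every child. It is an assumption-laden illustration, not the actual go-yaml code. The types `copyingContext` and `stackContext`, their fields, and the `push`/`pop`/`path` methods are all hypothetical; only the name `withChild` is borrowed from the description above, and its body here is invented.

```go
package main

import (
	"fmt"
	"strings"
)

// copyingContext is a hypothetical parser context that is cloned for every
// child node. It is NOT go-yaml's real context type.
type copyingContext struct {
	path   string
	tokens []string
}

// withChild clones the whole context, including the token slice, so each call
// costs time and memory proportional to the number of tokens.
func (c *copyingContext) withChild(name string) *copyingContext {
	tokens := make([]string, len(c.tokens))
	copy(tokens, c.tokens)
	return &copyingContext{path: c.path + "." + name, tokens: tokens}
}

// stackContext is the hypothetical alternative: one shared context whose
// path is a stack of segments that is pushed and popped during traversal.
type stackContext struct {
	segments []string
	tokens   []string
}

// push enters a child node without copying the tokens.
func (c *stackContext) push(name string) { c.segments = append(c.segments, name) }

// pop leaves the current child node.
func (c *stackContext) pop() { c.segments = c.segments[:len(c.segments)-1] }

// path renders the current path from the stack of segments.
func (c *stackContext) path() string {
	var b strings.Builder
	b.WriteString("$")
	for _, s := range c.segments {
		b.WriteString(".")
		b.WriteString(s)
	}
	return b.String()
}

func main() {
	tokens := []string{"t1", "t2", "t3"}

	// Copying style: the token slice is cloned twice to reach the nested path.
	root := &copyingContext{path: "$", tokens: tokens}
	fmt.Println(root.withChild("a").withChild("b").path)

	// Stack style: the same path is reached without copying any tokens.
	sc := &stackContext{tokens: tokens}
	sc.push("a")
	sc.push("b")
	fmt.Println(sc.path())
	sc.pop()
	fmt.Println(sc.path())
}
```

In the copying style the work per child grows with the size of the input, which is consistent with the profile described above where the copy dominates on large files. In the stack style entering and leaving a child is a slice append and a reslice.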