feat: remove earthly usage for benchmarks #654
Conversation
**Important:** Review skipped — the review was skipped due to path filters.

⛔ Files ignored due to path filters (2). CodeRabbit blocks several paths by default; you can override this behavior by explicitly including those paths in the path filters, and you can disable this status message in the CodeRabbit configuration.

**Walkthrough:** This pull request introduces changes to the performance testing infrastructure, replacing the Earthly-based benchmark workflow with a justfile.
Actionable comments posted: 3
🧹 Nitpick comments (5)
test/performance/justfile (1)

`10-10`: **Consider making timeout configurable.** The 60-minute timeout is hardcoded, which might not be suitable for all benchmark scenarios.

Make it configurable:

```diff
-  -timeout 60m \
+  -timeout {{timeout}} \
```

And add it to the recipe parameters with a default:

```diff
-run bench='.' p='1' benchtime='1s' count='1' ledger-url='' output='./report/benchmark-output.txt':
+run bench='.' p='1' benchtime='1s' count='1' ledger-url='' output='./report/benchmark-output.txt' timeout='60m':
```

test/performance/charts/index.js (1)
`42-50`: **Consider consolidating graph export configuration.** The current implementation repeats similar configuration for different graphs. This could be simplified using a configuration object.

Consider this refactor:

```diff
 const main = () => __awaiter(void 0, void 0, void 0, function* () {
     let buffer = fs.readFileSync('../report/report.json', 'utf-8');
     let reports = JSON.parse(buffer);
-    yield (0, graphs_1.exportTPSGraph)({
-        output: '../report/tps.png',
-    }, reports);
-    yield (0, graphs_1.exportDatabaseStats)('../report/database_connections.png', reports);
-    const ps = ['P99', 'P95', 'P75', 'Avg'];
-    for (let p of ps) {
-        yield (0, graphs_1.exportLatencyGraph)({
-            output: '../report/' + p.toLowerCase() + '.png'
-        }, p, reports);
-    }
+    const graphs = [
+        { type: 'tps', fn: graphs_1.exportTPSGraph, filename: 'tps.png' },
+        { type: 'db', fn: graphs_1.exportDatabaseStats, filename: 'database_connections.png' },
+        ...['P99', 'P95', 'P75', 'Avg'].map(p => ({
+            type: 'latency',
+            fn: graphs_1.exportLatencyGraph,
+            filename: p.toLowerCase() + '.png',
+            metric: p
+        }))
+    ];
+
+    for (const graph of graphs) {
+        yield graph.fn(
+            { output: `../report/${graph.filename}` },
+            graph.metric || reports
+        );
+    }
 });
```

test/performance/charts/src/graphs.js (2)
`82-82`: **Consider adding error handling for parseFloat.** While the changes correctly handle the numeric conversion of latency metrics, consider adding error handling for potential invalid numeric values.

```diff
-            data: result[script].map(r => parseFloat(r.Metrics.Time[key].substring(0, r.Metrics.Time[key].length - 2))),
+            data: result[script].map(r => {
+                const value = parseFloat(r.Metrics.Time[key].substring(0, r.Metrics.Time[key].length - 2));
+                if (isNaN(value)) {
+                    console.warn(`Invalid numeric value for ${script} at key ${key}`);
+                    return 0;
+                }
+                return value;
+            }),
```

Also applies to: 90-90
`118-190`: **Consider refactoring for improved maintainability.** The implementation effectively visualizes database connection metrics with a helpful max connections annotation. Consider these improvements:

- Extract the scope string as a configurable parameter
- Create helper functions for the metric finding logic
- Add error handling for missing metrics

```diff
-const exportDatabaseStats = (output, result) => __awaiter(void 0, void 0, void 0, function* () {
-    const scope = 'github.com/uptrace/opentelemetry-go-extra/otelsql';
+const OTELSQL_SCOPE = 'github.com/uptrace/opentelemetry-go-extra/otelsql';
+
+const findMetric = (scopeMetrics, scope, metricName) => {
+    const scopeMetric = scopeMetrics.find(sm => sm.Scope.Name === scope);
+    if (!scopeMetric) {
+        throw new Error(`Scope ${scope} not found`);
+    }
+    const metric = scopeMetric.Metrics.find(m => m.Name === metricName);
+    if (!metric) {
+        throw new Error(`Metric ${metricName} not found in scope ${scope}`);
+    }
+    return metric.Data.DataPoints[0].Value;
+};
+
+const exportDatabaseStats = (output, result, scope = OTELSQL_SCOPE) => __awaiter(void 0, void 0, void 0, function* () {
```

test/performance/README.md (1)
Line range hint `13-40`: **Documentation updates look good but could be enhanced.** The command replacements from earthly to just are consistent and well-documented. Consider adding:

- A brief explanation of why Just is preferred over Earthly
- Links to Just documentation for new users
- Examples of comparing benchmark results across runs

Add the following sections to improve documentation:

````markdown
## Why Just?

Just provides a simpler and more maintainable way to run benchmarks compared to Earthly. It offers better integration with local development workflows and faster execution times.

## Comparing Results

To compare benchmark results across different runs:

```shell
just compare "run1.txt" "run2.txt"
```

For more information about Just, see the Just Command Runner documentation.
````
📜 Review details

**Configuration used: CodeRabbit UI**
**Review profile: CHILL**
**Plan: Pro**

📥 Commits

Reviewing files that changed from the base of the PR and between b3eb0cdf39d49260db9690d9f15055c246c3906e and db8fa2fb70216cd2cf8e0082d77a51b279999fb7.

⛔ Files ignored due to path filters (1)

* `.github/workflows/benchmark.yml` is excluded by `!**/*.yml`

📒 Files selected for processing (7)

* `flake.nix` (1 hunks)
* `test/performance/Earthfile` (0 hunks)
* `test/performance/README.md` (2 hunks)
* `test/performance/charts/index.js` (1 hunks)
* `test/performance/charts/index.ts` (1 hunks)
* `test/performance/charts/src/graphs.js` (4 hunks)
* `test/performance/justfile` (1 hunks)

💤 Files with no reviewable changes (1)

* test/performance/Earthfile

⏰ Context from checks skipped due to timeout of 90000ms (1)

* GitHub Check: Tests

🔇 Additional comments (3)

flake.nix (1)

`104-104`: **Verify goperf package availability and version.** The goperf package has been added to the development shell, but its version is not specified. Let's verify the package.

test/performance/charts/src/graphs.js (2)

`15-20`: **LGTM: Clean initialization of Chart.js with annotation plugin.** The changes properly set up the annotation plugin, which is required for the new database connection metrics visualization.

`33-33`: **Verify data structure compatibility after earthly removal.** The property access changes from lowercase to uppercase suggest a change in the benchmark output format. Ensure that the new Just-based workflow consistently provides the data in the expected format. Also applies to: 41-41

✅ Verification successful

**Property access changes are consistent with the codebase.** The uppercase property names (`TPS`, `Configuration.Name`) match the type definitions and usage patterns across both the benchmark output generation and visualization code.

🏁 Scripts executed

The following script was executed for the analysis:

```shell
#!/bin/bash
# Description: Check the structure of benchmark output to ensure it matches the new format
# Search for benchmark result parsing logic
rg -A 5 'TPS|Configuration.Name' test/performance/
```

Length of output: 9905
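As background on why the uppercase keys line up: Go's `encoding/json` marshals exported struct fields under their field names when no `json` tags are given, so a report struct with exported `TPS` and `Configuration.Name` fields naturally produces the keys the charts read. The sketch below is illustrative only — the field set and values are assumptions, not the actual ledger report types.

```go
// Illustrative only: a report struct without json tags serializes with
// uppercase keys, matching what the charts access (r.TPS, r.Configuration.Name).
package main

import (
	"encoding/json"
	"fmt"
)

type Configuration struct {
	Name string
}

type Report struct {
	TPS           float64
	Configuration Configuration
}

func main() {
	out, _ := json.Marshal(Report{
		TPS:           1234.5,
		Configuration: Configuration{Name: "example-script"},
	})
	fmt.Println(string(out)) // {"TPS":1234.5,"Configuration":{"Name":"example-script"}}
}
```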
test/performance/charts/index.ts

```diff
@@ -6,15 +6,15 @@ const main = async () => {
     let buffer = fs.readFileSync('../report/report.json', 'utf-8');
     let reports = JSON.parse(buffer);
     await exportTPSGraph({
-        output: 'tps.png',
+        output: '../report/tps.png',
```
💡 Codebase verification
**Directory creation check is required.**

The codebase lacks directory existence verification before file operations. The suggested change to create the `../report` directory is necessary to prevent runtime errors.

- Location: `test/performance/charts/index.ts`
- Add directory creation before file operations:

```typescript
if (!fs.existsSync('../report')) {
    fs.mkdirSync('../report', { recursive: true });
}
```
🔗 Analysis chain
Verify report directory existence before file operations.
While the path changes are consistent, ensure the '../report' directory exists before writing files to prevent runtime errors.
Consider adding directory creation:

```diff
 const main = async () => {
+    // Ensure report directory exists
+    if (!fs.existsSync('../report')) {
+        fs.mkdirSync('../report', { recursive: true });
+    }
     let buffer = fs.readFileSync('../report/report.json', 'utf-8');
```
Also applies to: 12-12, 17-17
🏁 Scripts executed
The following scripts were executed for the analysis:
Error: No shell scripts found
Script:

```shell
#!/bin/bash
# Check the content of the test file
cat test/performance/charts/index.ts
# Search for directory creation patterns in related files
rg -l "mkdirSync.*report" --type ts
# Look for fs operations in test setup
rg "fs\.(mkdir|existsSync)" --type ts
```
Length of output: 778
Script:

```shell
#!/bin/bash
# Check for test runner configuration
fd -e json package.json --exec cat {}
# Look for test setup files
fd -e ts -e js setup test/performance --exec cat {}
```
Length of output: 650
test/performance/justfile (Outdated)
```just
git clone --depth 1 -b main https://github.com/formancehq/ledger {{tmpdir}}
go test -run ^$ -tags it,local -report.file \
  ./report/report.json \
  -timeout 60m \
  -bench={{bench}} \
  -count={{count}} \
  -p {{p}} \
  -test.benchtime {{benchtime}} \
  {{tmpdir}}/test/performance | tee -a ./report/benchmark-output-main.txt
benchstat ./report/benchmark-output-main.txt ./report/benchmark-output-local.txt > ./report/benchmark-comparison.txt || true
```
🛠️ Refactor suggestion
**Improve error handling in the comparison workflow.**

The current implementation has two issues:

- Git clone failures are not handled explicitly
- Benchstat errors are silently suppressed with `|| true`
Consider this improved error handling:
```diff
 compare bench='.' p='1' benchtime='1s' count='1' output='./report/benchmark-output.txt':
   just run {{bench}} {{p}} {{benchtime}} {{count}} '' './report/benchmark-output-local.txt'
-  git clone --depth 1 -b main https://github.com/formancehq/ledger {{tmpdir}}
+  git clone --depth 1 -b main https://github.com/formancehq/ledger {{tmpdir}} || exit 1
   go test -run ^$ -tags it,local -report.file \
     ./report/report.json \
     -timeout 60m \
     -bench={{bench}} \
     -count={{count}} \
     -p {{p}} \
     -test.benchtime {{benchtime}} \
     {{tmpdir}}/test/performance | tee -a ./report/benchmark-output-main.txt
-  benchstat ./report/benchmark-output-main.txt ./report/benchmark-output-local.txt > ./report/benchmark-comparison.txt || true
+  if ! benchstat ./report/benchmark-output-main.txt ./report/benchmark-output-local.txt > ./report/benchmark-comparison.txt; then
+    echo "Warning: benchstat failed to generate comparison" >&2
+  fi
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
-git clone --depth 1 -b main https://github.com/formancehq/ledger {{tmpdir}}
-go test -run ^$ -tags it,local -report.file \
-  ./report/report.json \
-  -timeout 60m \
-  -bench={{bench}} \
-  -count={{count}} \
-  -p {{p}} \
-  -test.benchtime {{benchtime}} \
-  {{tmpdir}}/test/performance | tee -a ./report/benchmark-output-main.txt
-benchstat ./report/benchmark-output-main.txt ./report/benchmark-output-local.txt > ./report/benchmark-comparison.txt || true
+git clone --depth 1 -b main https://github.com/formancehq/ledger {{tmpdir}} || exit 1
+go test -run ^$ -tags it,local -report.file \
+  ./report/report.json \
+  -timeout 60m \
+  -bench={{bench}} \
+  -count={{count}} \
+  -p {{p}} \
+  -test.benchtime {{benchtime}} \
+  {{tmpdir}}/test/performance | tee -a ./report/benchmark-output-main.txt
+if ! benchstat ./report/benchmark-output-main.txt ./report/benchmark-output-local.txt > ./report/benchmark-comparison.txt; then
+  echo "Warning: benchstat failed to generate comparison" >&2
+fi
```
**Codecov Report**

All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main     #654      +/-   ##
==========================================
+ Coverage   81.62%   81.66%   +0.04%
==========================================
  Files         131      131
  Lines        7059     7059
==========================================
+ Hits         5762     5765       +3
+ Misses        994      992       -2
+ Partials      303      302       -1
```

☔ View full report in Codecov by Sentry.
Actionable comments posted: 0
🧹 Nitpick comments (1)
test/performance/env_testserver_test.go (1)
Line range hint `50-93`: **Overall change improves test infrastructure maintainability.** The transition from lazy to eager initialization and the removal of `sync.Once` align well with the PR's objective of simplifying the benchmark infrastructure. The new implementation maintains proper test isolation while reducing complexity. Consider documenting these testing patterns in the README to help other contributors understand the resource management approach.
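For readers less familiar with the two styles being compared, here is a minimal sketch of lazy (`sync.Once`-guarded) versus eager initialization; the names (`Pool`, `newDockerPool`) are hypothetical placeholders rather than the actual types in `env_testserver_test.go`.

```go
// Hypothetical sketch: lazy vs. eager initialization of a shared resource.
package performance

import (
	"sync"
	"testing"
)

type Pool struct{ /* docker pool handle */ }

func newDockerPool(tb testing.TB) *Pool {
	tb.Helper()
	return &Pool{}
}

// Before: lazy initialization guarded by sync.Once — a single shared pool,
// created on first use and reused by every benchmark.
var (
	once       sync.Once
	sharedPool *Pool
)

func lazyPool(tb testing.TB) *Pool {
	once.Do(func() { sharedPool = newDockerPool(tb) })
	return sharedPool
}

// After: eager initialization — each call builds its own pool, so every
// benchmark environment is isolated, at the cost of repeated setup work
// if Create is called frequently.
func eagerPool(b *testing.B) *Pool {
	return newDockerPool(b)
}
```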
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)

- `.github/workflows/benchmark.yml` is excluded by `!**/*.yml`

📒 Files selected for processing (2)

- `test/performance/env_testserver_test.go` (1 hunks)
- `test/performance/justfile` (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- test/performance/justfile
⏰ Context from checks skipped due to timeout of 90000ms (2)
- GitHub Check: BenchmarkCompare
- GitHub Check: Tests
🔇 Additional comments (2)
test/performance/env_testserver_test.go (2)
Line range hint `50-62`: **LGTM: Configuration is well-suited for performance testing.** The Docker pool and database configuration are appropriately set up with the following (a rough sketch follows this list):
- Testing logger for visibility
- High connection limits (100) suitable for benchmarking
- Standard PostgreSQL setup with PGCrypto extension
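As referenced above, here is a rough sketch of what such a configuration can look like with the standard `database/sql` API; the driver, DSN handling, and exact limits are assumptions for illustration, not the actual test-server code.

```go
// Hypothetical benchmark database setup: high connection limits and the
// pgcrypto extension, roughly matching the configuration described above.
package performance

import (
	"database/sql"
	"testing"

	_ "github.com/lib/pq" // assumed driver; the real code may use another one
)

func setupBenchDB(b *testing.B, dsn string) *sql.DB {
	db, err := sql.Open("postgres", dsn)
	if err != nil {
		b.Fatal(err)
	}
	// High limits suitable for benchmarking (100 connections).
	db.SetMaxOpenConns(100)
	db.SetMaxIdleConns(100)
	// Standard PostgreSQL setup with the PGCrypto extension.
	if _, err := db.Exec(`CREATE EXTENSION IF NOT EXISTS pgcrypto`); err != nil {
		b.Fatal(err)
	}
	return db
}
```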
`50-50`: **Consider the implications of eager Docker pool initialization.** The change from lazy to eager initialization of `dockerPool` has both benefits and potential drawbacks:
- ✅ Simpler code with direct initialization
- ✅ Each test gets a fresh pool instance, improving isolation
- ⚠️ May impact performance if Create is called frequently

Let's verify the usage pattern to ensure this change aligns with the testing needs.
✅ Verification successful
Eager Docker pool initialization is appropriate for the benchmark usage pattern ✅
The current implementation correctly initializes the Docker pool once before parallel benchmark execution, providing clean test isolation without performance overhead.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:

```shell
#!/bin/bash
# Description: Check how Create method is used in tests
# Look for patterns that might call Create multiple times

# Search for direct Create calls
rg -A 3 "Create\(ctx, b, " test/performance/

# Look for test functions that might create multiple environments
ast-grep --pattern 'func Test$_($_, $b *testing.B) {
  $$$
  Create($$$)
  $$$
}'
```

Length of output: 400

Script:

```shell
#!/bin/bash
# Check for benchmark loops and parallel execution patterns
rg -B 5 -A 10 "Create\(ctx, b, " test/performance/benchmark_test.go

# Look for any cleanup or teardown code
ast-grep --pattern 'func ($_, $b *testing.B) {
  $$$
  defer $$$
  $$$
}'
```

Length of output: 554
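To illustrate the usage pattern the verification describes — a single environment created up front, then parallel benchmark iterations — here is a hedged sketch; `Create`, `Env`, and `CreateTransaction` are placeholders and may not match the real benchmark helpers.

```go
// Hypothetical shape of the benchmark: the Docker-backed environment is
// created once per benchmark run, then iterations execute in parallel.
package performance

import (
	"context"
	"testing"
)

type Env struct{}

func Create(ctx context.Context, b *testing.B) *Env { return &Env{} }

func (e *Env) CreateTransaction(ctx context.Context) error { return nil }

func BenchmarkWrite(b *testing.B) {
	ctx := context.Background()
	env := Create(ctx, b) // single eager setup, shared by all parallel workers

	b.ResetTimer()
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			if err := env.CreateTransaction(ctx); err != nil {
				b.Error(err)
				return
			}
		}
	})
}
```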
```yaml
jobs:
  BenchmarkCompare:
    runs-on: "github-001"
```
Why only on github-001?
No specific reason, I just copy/pasted it from the original workflow.
I guess the comparison will be more accurate if benchmarks are run on the same machine?
```yaml
            --allow-privileged
            ${{ contains(github.event.pull_request.labels.*.name, 'no-cache') && '--no-cache' || '' }}
            ./test/performance+generate-graphs
      - run: kill -9 $(ps aux | grep "ledger serve" | grep -v "grep" | awk '{print $2}')
```
Without this part, if the benchmarks go wrong for some reason, the Ledger will keep running and won't be killed.
The ledger is run via the test, not externally
```just
graphs:
  cd charts && npm install
  cd charts && npm run build
  cd charts && node ./index.js
```
`./index.ts`? Because `index.js` is removed.
After the `npm run build` above, it exists.