Significantly high memory usage on 0.41.0? #2762
Can you share something about what your script actually does? Does it generate a lot of metrics with unique/high-cardinality tags? For example, do you have a ton of unique URLs or something like that? If so, the high memory usage might be because of this change in k6 v0.41.0, and you may be able to ameliorate it by using URL grouping: https://k6.io/docs/using-k6/http-requests/#url-grouping If not, then please share any other details about your script to help us diagnose what the issue might be.
Oh interesting, yes, the script actually has around 12 unique URLs, and a few with a path param that changes based on the previous response. They are mostly CRUD operations, and are called in sequence over and over again (as you can see, it maxes out at 15k VUs). So if I understand correctly, based on https://k6.io/docs/using-k6/http-requests/#url-grouping, it will be generating unique metrics per URL? (I'm assuming a URL like users/1 and users/2 would be treated differently?)
Yes. Or, rather, it will create 10k (or more, if you have other differences in tags) time series. This is probably the problem. Try to use the `name` tag to group such URLs.
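To illustrate why this matters, here is a small sketch (plain Node.js, not a runnable k6 script) of the underlying idea: every unique combination of metric name and tag values becomes one time series that k6 keeps in memory, and URL grouping via a `name` tag collapses thousands of URL-derived series into one. The metric and tag names below are illustrative, not k6 internals.

```javascript
// Each unique (metric, tags) combination is one time series.
function seriesKey(metric, tags) {
  return (
    metric +
    '|' +
    Object.entries(tags)
      .sort()
      .map(([k, v]) => `${k}=${v}`)
      .join(',')
  );
}

const withoutGrouping = new Set();
const withGrouping = new Set();

for (let id = 1; id <= 10000; id++) {
  const url = `https://api.example.com/users/${id}`;
  // Untagged: every unique URL produces a new series.
  withoutGrouping.add(seriesKey('http_req_duration', { url: url }));
  // With URL grouping the request would be tagged in the k6 script like:
  //   http.get(url, { tags: { name: 'UsersItemURL' } });
  // so all 10k requests share a single series.
  withGrouping.add(seriesKey('http_req_duration', { name: 'UsersItemURL' }));
}

console.log(withoutGrouping.size); // 10000
console.log(withGrouping.size); // 1
```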
Alright, thanks! I'll try and add the `name` tag. On an unrelated note, is it expected to have minor-ish "breaking" changes in normal releases? We just download the latest version, so we were unaware of the 0.41 release until today. (Haven't really read the full release policy myself, so feel free to ignore; plus it's not a breaking change anyway, just a performance dip IMO. Anyway, it's a great tool, have loved using it so far.)
Added the `name` tag and that fixed the memory usage.
Awesome 🎉 Can you provide a rough estimate of how many unique URLs your script was hitting? Even 10k-15k unique URLs (and so, time series) shouldn't have caused such a huge increase in memory usage, according to our tests.
10-15k was just an example I gave 😅 So for some actual numbers: endpoints with path params (unique URLs) would be called around 10k/sec, so ~10000 * 60 * 45 = ~27M for the 45-min full test that we run. But since on 0.41.0 we almost went OOM after around 10 mins, that would be ~10000 * 60 * 10 = ~6M. So I'm assuming ~6M unique time series, and they kept adding up.
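The arithmetic above can be checked back-of-the-envelope style: with each request to a unique URL creating a new time series, the series count grows linearly with test duration at a fixed unique-URL rate.

```javascript
// ~10k unique URLs per second, as reported above.
const uniqueUrlsPerSecond = 10000;

// Series accumulated over a full 45-minute run vs. the ~10 minutes
// it took to approach OOM on 0.41.0.
const fullTest = uniqueUrlsPerSecond * 60 * 45;
const untilOom = uniqueUrlsPerSecond * 60 * 10;

console.log(fullTest); // 27000000 (~27M)
console.log(untilOom); // 6000000 (~6M)
```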
Ah, yeah, that would certainly do it 😅 Now that we can actually track the number of unique time series, we will probably add some sort of warning if some threshold is exceeded, e.g. 100k? 🤔 We'll need to do some benchmarking.
Yeah, I think that would be great. Logs might be hard to follow sometimes, but it would be something. (It's also not easy to hit those numbers on a local setup with limited CPU/memory, from what I learnt.) Is there a possibility to disable these time series metrics themselves (or is something planned for the future)? I'm not really using them too extensively; we just rely on the Prometheus metrics on the server side to validate our results, not on the load test client.
Unfortunately you can't disable them, and we probably won't add such a feature in the future, sorry 😞 It's not ideal and it is a problem for some existing tests like yours, but on the other hand, a whole bunch of core things that now work on top of the time series functionality are (or can be) way more efficient than before, and we also need time series for certain other features to be possible to implement at all.
And yeah, unfortunately, if there are millions of unique URLs in your test, you'd need to adjust your script slightly and add the `name` tag. For non-URL unique tags, we intend to have a JS API to support high-cardinality metric metadata in the future.
Yeah, totally makes sense. Thanks for the clarification and the quick help on this issue as well :)
I'll close this issue, since I opened grafana/k6-docs#883, #2765 and #2766 for the various parts of the things we touched on here 😅
Brief summary
We've been using k6 for a couple of months and were able to develop a test suite that gives us around 200k RPS using 5 worker machines running on GCP, each running the given suite. We use a ramping-vus executor, and this is roughly what our config looks like.
The tests generally run for about 45 mins and reach a max of 15000 VUs.
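The actual config block was not captured in this report, so here is a hypothetical ramping-vus sketch consistent with the numbers described (15k peak VUs over a ~45-minute run); the scenario name, stage durations, and targets are all illustrative. In a real k6 script this object would be `export const options`.

```javascript
// Illustrative ramping-vus scenario; stage values are assumptions,
// not the reporter's real config.
const options = {
  scenarios: {
    crud_load: {
      executor: 'ramping-vus',
      startVUs: 0,
      stages: [
        { duration: '10m', target: 15000 }, // ramp up
        { duration: '30m', target: 15000 }, // hold at peak
        { duration: '5m', target: 0 }, // ramp down
      ],
    },
  },
};

// Sanity check: the configured stages add up to a 45-minute run.
const totalMin = options.scenarios.crud_load.stages
  .map((s) => parseInt(s.duration, 10))
  .reduce((a, b) => a + b, 0);
console.log(totalMin); // 45
```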
Until yesterday we were running on k6 version 0.40.0, and upon updating to 0.41.0, memory usage went up really high. We have a max memory limit of 100GB on the instance, and 0.41.0 reached 85% memory usage within 10 mins of the test executing. I reverted back to 0.40.0 and the memory usage was down to ~10% for the entirety of the 45 mins. Is this a known issue, or related to something introduced in the newer version, or maybe something got deprecated and I need to adjust the setup somehow?
Happy to provide more details if needed
k6 version
0.41.0
OS
Debian GNU/Linux 11 (bullseye)
Docker version and image (if applicable)
No response
Steps to reproduce the problem
Expected behaviour
Actual behaviour
Significantly high memory usage on 0.41.0