
【BUG】the metric starrocks_be_process_mem_bytes is not precise, which may cause spurious "Memory of process exceed limit" query errors #67222

@kwenzh

Description


Steps to reproduce the behavior (Required)

  1. Construct a large dataset, approximately 300GB.
  2. Run concurrent queries.
  3. Watch the system resource usage.
  4. Check the metric starrocks_be_process_mem_bytes.
  5. Get the pod memory usage from /sys/fs/cgroup/memory/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podcd547e58_60f9_4459_a8bb_f0f46043028b.slice/docker-876005925a5ba96dc7e235e0497b7ef832e2befa7f6d60da24aeca141b489df8.scope/memory.usage_in_bytes

Expected behavior (Required)

starrocks_be_process_mem_bytes should be close to the pod memory usage.

Real behavior (Required)

starrocks_be_process_mem_bytes reports 48GB, but the pod memory usage is only 35GB.

StarRocks version (Required)

  • 4.0.1


And we also see queries fail with: Error 1064 (HY000): Memory of process exceed limit. Pipeline Backend: starrocks-be-7.starrocks-be-search.olap.svc.cluster.local, fragment: 019b53bc-1e8b-7095-b66f-53c34122286f Used: 69578920144, Limit: 69578470195. Mem usage has exceed the limit of BE: BE:10343". Note that the reported Used value (69578920144 bytes ≈ 64.8 GiB) is far above the 35GB of actual pod memory usage observed above.

So I investigated where this error is returned and found that the memory-limit check and the metric read from the same source. Given the inconsistency between the metric and actual memory usage, I suspect the calculation is inaccurate, which would produce both wrong monitoring metrics and incorrectly aborted queries.
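To make the concern concrete, here is a minimal sketch (an assumption for illustration, not the actual StarRocks MemTracker) of a process-level tracker where one atomic counter feeds both the limit check and the exported metric. If consume/release accounting drifts from real RSS, the metric and the abort decision are wrong together:

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical process-level memory tracker: queries "consume" bytes
// against an atomic counter; the same counter backs both the limit
// check (the "Memory of process exceed limit" path) and the
// starrocks_be_process_mem_bytes metric.
class ProcessMemTracker {
public:
    explicit ProcessMemTracker(int64_t limit) : limit_(limit) {}

    // Returns false when the consumption would exceed the limit,
    // mirroring the error path seen in the query failure above.
    bool try_consume(int64_t bytes) {
        int64_t cur =
            current_value_.fetch_add(bytes, std::memory_order_relaxed) + bytes;
        if (cur > limit_) {
            // Roll back the reservation and fail the query.
            current_value_.fetch_sub(bytes, std::memory_order_relaxed);
            return false;
        }
        return true;
    }

    void release(int64_t bytes) {
        current_value_.fetch_sub(bytes, std::memory_order_relaxed);
    }

    // This is also what the metric endpoint would export.
    int64_t consumption() const {
        return current_value_.load(std::memory_order_relaxed);
    }

private:
    std::atomic<int64_t> current_value_{0};
    int64_t limit_;
};
```

With this structure, any missed release (or memory the allocator keeps cached but the tracker still counts) inflates both the metric and the Used value in the error message, even though the cgroup reports lower real usage.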

  std::atomic<int64_t> current_value_;
  static const int64_t MAX_INT64 = 9223372036854775807ll;

Labels: type/bug (Something isn't working)