
cloud: Binary-based ingestion #2954

Closed
codebien opened this issue Mar 6, 2023 · 1 comment
Comments

@codebien
Contributor

codebien commented Mar 6, 2023

What

The current Cloud ingestion service receives metrics on the CLOUD_URL/v1/metrics/<TEST_REF_ID> endpoint. Each HTTP request contains a JSON payload with an array of Sample objects at the root; each Sample contains a data field whose shape is one of the types defined below.

We want to replace it with a new HTTP body format based on a binary encoding.

Current JSON payload

[
  {
    "type": "<TYPE>",
    "metric": "<NAME>",
    "data": {
      ...
    }
  },
  {
    ...
  }
]

Single point

{
  "type": "Point",
  "metric": "vus",
  "data": {
    "time": "%d",
    "type": "gauge",
    "tags": {
      "aaa": "bbb",
      "ccc": "123"
    },
    "value": 999
  }
}

Multi points

{
  "type": "Points",
  "metric": "iter_li_all",
  "data": {
    "time": "%d",
    "type": "counter",
    "tags": {
      "test": "mest"
    },
    "values": {
      "data_received": 6789.1,
      "data_sent": 1234.5,
      "iteration_duration": 10000
    }
  }
}

Aggregated points

{
  "type": "AggregatedPoints",
  "metric": "http_req_li_all",
  "data": {
    "time": "%d",
    "type": "aggregated_trend",
    "count": 2,
    "tags": {
      "test": "mest"
    },
    "values": {
      "http_req_duration": {
        "min": 0.013,
        "max": 0.123,
        "avg": 0.068
      },
      "http_req_blocked": {
        "min": 0.001,
        "max": 0.003,
        "avg": 0.002
      },
      "http_req_connecting": {
        "min": 0.001,
        "max": 0.002,
        "avg": 0.0015
      },
      "http_req_tls_handshaking": {
        "min": 0.003,
        "max": 0.004,
        "avg": 0.0035
      },
      "http_req_sending": {
        "min": 0.004,
        "max": 0.005,
        "avg": 0.0045
      },
      "http_req_waiting": {
        "min": 0.005,
        "max": 0.008,
        "avg": 0.0065
      },
      "http_req_receiving": {
        "min": 0.006,
        "max": 0.008,
        "avg": 0.007
      }
    }
  }
}

Why

Better efficiency at scale is required. A binary encoding would reduce the payload size and the hardware cost of encoding/decoding operations, both in the cloud backend and in clients.

Non-Goals

  • An aggregation algorithm for reducing the volume of flushed data.

How / Proposals

Create a new Cloud output (v2) that flushes metrics via HTTP requests whose bodies are serialized with Protobuf.

In summary, an example of the HTTP request:

POST CLOUD_URL/v2/metrics/<TEST_REF_ID> HTTP/1.1
Host: www.example.com
User-Agent: k6
Content-Type: application/x-protobuf
Content-Encoding: snappy
K6-Metrics-Protocol-Version: 2.0

To stay closer to the Prometheus implementation, the output has to compress the body using the Snappy algorithm.

The code below contains a Protobuf proposal inspired by OpenMetrics to use for encoding the body request:


EDIT: the Protobuf definition after several iterations: https://github.com/grafana/k6/blob/0cddc417243fd152f0a2e532b1870fa6d8635d03/output/cloud/expv2/pbcloud/metric.proto

TODO: Use an HDR histogram implementation for mapping the Trend type.

It is a requirement to add the metric name and the test run ID as part of the tag set. The output has to add:

metrics.<metric>.tags["__name__"] = "<metric-name>"
metrics.<metric>.tags["test_run_id"] = "<test-ref-id>"

Additional implementation details

Gate the startup of the new Cloud output behind a config option, keeping it separate from the current Cloud output. This way, we can switch the used output at runtime and fall back to the previous logic when required.

Action Plan

  1. Quick and dirty implementation of a basic Cloud output v2
    • Config option for enabling v2 and the related fallback logic in v1
    • Ability to flush metric samples encoded as defined by the new protocol
    • No Trend implementation
  2. Trend implementation as an HDR histogram
  3. Iterate for polish and stability

Future

  • Better metrics aggregation (e.g. Cloud Aggregation for Counter, Gauge and Rate #1700)
  • Consider a full Prometheus Remote-write implementation
  • Track the origin (e.g. Builtin or Custom) of the metrics (at the moment the cloud backend has a fixed list of the Builtin metrics).

Open Questions

  • HDR format in protobuf (done)

Work log

This collects all the tasks required for the new cloud output. The new output includes a substantial refactor, a new binary format for the metrics requests' payload, and sample aggregation with HDR histogram generation on the client.

It depends on the following PRs as a prerequisite:

The following PRs are expected to be merged to have the final working output:

@codebien
Contributor Author
codebien commented Jun 8, 2023

Most of the work planned here has been merged; I will close this issue and continue the remaining performance optimizations in a new dedicated issue, #3117.
