Additional receiver or scrape in V2 version. #955

Closed
cxk314 opened this issue Nov 26, 2024 · 9 comments

Comments

@cxk314

cxk314 commented Nov 26, 2024

Hi,
I am trying to add a Faro receiver using the k8s-monitoring-helm V2 chart. What is the recommended way to do this, and where should it go? I tried adding it to the alloy-receiver block of values.yaml as follows, but I am not sure it is the right way:

# An Alloy instance for opening receivers to collect application data.
alloy-receiver:
  # -- Deploy the Alloy instance for opening receivers to collect application data.
  # @section -- Collectors - Alloy Receiver
  enabled: true
  controller:
    podAnnotations: {kubernetes.azure.com/set-kube-service-host-fqdn: "true"}
  extraConfig: |-
      faro.receiver "integrations_app_agent_receiver" {
        server {
          listen_address           = "0.0.0.0"
          listen_port              = 8027
          cors_allowed_origins     = ["*"]
          max_allowed_payload_size = "10MiB"

          rate_limiting {
            rate = 100
          }
        }

        output {
          logs   = [otelcol.receiver.loki.otlp.receiver]
          traces = [ otelcol.processor.transform.otlp.input]
        }
      }
  alloy:
    extraPorts:
      - name: otlp-grpc
        port: 4317
        targetPort: 4317
        protocol: TCP

And where should an additional scrape go? I need to get metrics from Azure Monitor for Azure EventHub. Should I add it to alloy-singleton like this?

alloy-singleton:
  enabled: true
  controller:
    podAnnotations: {kubernetes.azure.com/set-kube-service-host-fqdn: "true"}
  extraConfig: |-
    prometheus.scrape "eventhub_azure_exporter" {
      targets    = prometheus.exporter.azure.eventhub_azure_exporter.targets
      forward_to = [otelcol.receiver.prometheus.otlp.receiver]
      job_name   = "integrations/azure_exporter"
    }
    prometheus.exporter.azure "eventhub_azure_exporter" {
      subscriptions = ["my_subscription_id"]
      resource_type = "microsoft.eventhub/namespaces"
      metrics = ["ServerErrors", "UserErrors", "QuotaExceededErrors", "ThrottledRequests", "IncomingMessages"] 
    }

More examples of how to customize the V2 Helm chart, such as adding more receivers, scrapes, or discovery rules, would be really helpful.

@petewall
Collaborator

petewall commented Nov 26, 2024

So, you're pretty close. Adding the faro receiver works in the alloy-receiver's extraConfig section. You should add port 8027 to the list of extraPorts, too.

As for where to point the output of that component, the easiest would be to enable the applicationObservability feature and utilize those components:

applicationObservability:
  enabled: true
  receivers:
    otlp:
      grpc:
        enabled: true
...
alloy-receiver:
  enabled: true
  controller:
    podAnnotations: {kubernetes.azure.com/set-kube-service-host-fqdn: "true"}
  extraConfig: |-
    faro.receiver "integrations_app_agent_receiver" {
      server {
        listen_address           = "0.0.0.0"
        listen_port              = 8027
        cors_allowed_origins     = ["*"]
        max_allowed_payload_size = "10MiB"
        
        rate_limiting {
          rate = 100
        }
      }
      
      output {
        logs = [otelcol.processor.k8sattributes.default.input]
        traces = [otelcol.processor.k8sattributes.default.input]
      }
    }

  alloy:
    extraPorts:
      - name: otlp-grpc
        port: 4317
        targetPort: 4317
        protocol: TCP
      - name: faro
        port: 8027
        targetPort: 8027
        protocol: TCP

As for your Azure metrics, putting it in the singleton makes sense since the prometheus.exporter.azure component doesn't support clustering. However, I don't believe that chart v2 has a way to force the components for a destination into an alloy config unless there's a feature that uses it. I'll open a sub-issue to track that. Fortunately, the built-in self reporting metric will load a metrics destination on alloy-singleton.

@petewall
Collaborator

Here's a full values file. You'll need to merge in any other changes and destinations. Replace "my-destination" with the name of your destination that can handle metrics:

cluster:
  name: cxk314-cluster

destinations:
  - name: my-destination
    type: otlp
    host: otlp-gateway.example.com
    metrics: {enabled: true}
    logs: {enabled: true}
    traces: {enabled: true}

applicationObservability:
  enabled: true
  receivers:
    otlp:
      grpc:
        enabled: true

alloy-receiver:
  enabled: true
  controller:
    podAnnotations: {kubernetes.azure.com/set-kube-service-host-fqdn: "true"}
  extraConfig: |-
    faro.receiver "integrations_app_agent_receiver" {
      server {
        listen_address           = "0.0.0.0"
        listen_port              = 8027
        cors_allowed_origins     = ["*"]
        max_allowed_payload_size = "10MiB"
        
        rate_limiting {
          rate = 100
        }
      }
      
      output {
        logs = [otelcol.processor.k8sattributes.default.input]
        traces = [otelcol.processor.k8sattributes.default.input]
      }
    }

  alloy:
    extraPorts:
      - name: otlp-grpc
        port: 4317
        targetPort: 4317
        protocol: TCP
      - name: faro
        port: 8027
        targetPort: 8027
        protocol: TCP

alloy-singleton:
  enabled: true
  controller:
    podAnnotations: {kubernetes.azure.com/set-kube-service-host-fqdn: "true"}
  extraConfig: |-
    prometheus.exporter.azure "eventhub_azure_exporter" {
      subscriptions = ["my_subscription_id"]
      resource_type = "microsoft.eventhub/namespaces"
      metrics = ["ServerErrors", "UserErrors", "QuotaExceededErrors", "ThrottledRequests", "IncomingMessages"] 
    }
    prometheus.scrape "eventhub_azure_exporter" {
      targets    = prometheus.exporter.azure.eventhub_azure_exporter.targets
      job_name   = "integrations/azure_exporter"
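      // Alloy identifiers cannot contain hyphens, so the label uses underscores
      // (assumes the chart renders destination "my-destination" as "my_destination").
      forward_to = [otelcol.receiver.prometheus.my_destination.receiver]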
    }

@cxk314
Author

cxk314 commented Nov 27, 2024

Thanks @petewall. I'm not sure I understand this statement:
"However, I don't believe that chart v2 has a way to force the components for a destination into an alloy config unless there's a feature that uses it."
Do you mean that additional components in the extraConfig for alloy-singleton will not be able to send telemetry to the destination specified in the destinations: block?

Also, regarding "Fortunately, the built-in self reporting metric will load a metrics destination on alloy-singleton": I am actually disabling self-reporting because of a certificate error I am getting from stats.grafana. Does this mean that the prometheus.exporter.azure component will not be able to send telemetry to the destinations?

@petewall
Collaborator

So, what I meant by my comment is that the config generator tries to only add destination components when there are features enabled that will use those destinations. No sense in putting components for a pyroscope destination on any alloy that isn't handling profiles, for example.

Unfortunately, this intelligent placement doesn't work if the only thing going onto an alloy instance is the extraConfig. Self-reporting would go to alloy-singleton by default, but if that's disabled, then it's possible that no metric destination is set for that config.

@petewall
Collaborator

I'm curious about the certificate error you're seeing. The self-reporting feature merely creates a small set of static metrics and tries to deliver them to the same location as any metric destination. It does not go directly to Grafana.

@petewall
Collaborator

Oh, are you referring to:

alloy-logs:
  alloy:
    enableReporting: true|false

Yeah, you can disable that safely.

I'm referring to:

selfReporting:
  enabled: true|false

@cxk314
Author

cxk314 commented Nov 27, 2024

Quoting @petewall: "I'm curious about the certificate error you're seeing. The self-reporting feature merely creates a small set of static metrics and tries to deliver them to the same location as any metric destination. It does not go directly to Grafana."

We are getting the errors below on 2.0.0-rc.5 and 2.0.0-rc.6:

ts=2024-11-27T14:27:48.649305894Z level=info msg="failed to report usage" err="5 errors: Post \"https://stats.grafana.org/alloy-usage-report\": tls: failed to verify certificate: x509: certificate signed by unknown authority; Post \"https://stats.grafana.org/alloy-usage-report\": tls: failed to verify certificate: x509: certificate signed by unknown authority; Post \"https://stats.grafana.org/alloy-usage-report\": tls: failed to verify certificate: x509: certificate signed by unknown authority; Post \"https://stats.grafana.org/alloy-usage-report\": tls: failed to verify certificate: x509: certificate signed by unknown authority; Post \"https://stats.grafana.org/alloy-usage-report\": tls: failed to verify certificate: x509: certificate signed by unknown authority"
--
ts=2024-11-27T23:08:05.882012631Z level=info msg="reporting Alloy stats" date=2024-11-27T23:08:05.882Z
ts=2024-11-27T23:08:06.030783312Z level=info msg="failed to send usage report" retries=0 err="Post \"https://stats.grafana.org/alloy-usage-report\": tls: failed to verify certificate: x509: certificate signed by unknown authority"
ts=2024-11-27T23:08:07.766783265Z level=info msg="failed to send usage report" retries=1 err="Post \"https://stats.grafana.org/alloy-usage-report\": tls: failed to verify certificate: x509: certificate signed by unknown authority"
ts=2024-11-27T23:08:10.80118174Z level=info msg="failed to send usage report" retries=2 err="Post \"https://stats.grafana.org/alloy-usage-report\": tls: failed to verify certificate: x509: certificate signed by unknown authority"
ts=2024-11-27T23:08:17.65680035Z level=info msg="failed to send usage report" retries=3 err="Post \"https://stats.grafana.org/alloy-usage-report\": tls: failed to verify certificate: x509: certificate signed by unknown authority"
ts=2024-11-27T23:08:26.259442026Z level=info msg="failed to send usage report" retries=4 err="Post \"https://stats.grafana.org/alloy-usage-report\": tls: failed to verify certificate: x509: certificate signed by unknown authority"
ts=2024-11-27T23:08:26.259504918Z level=info msg="failed to report usage" err="5 errors: Post \"https://stats.grafana.org/alloy-usage-report\": tls: failed to verify certificate: x509: certificate signed by unknown authority; Post \"https://stats.grafana.org/alloy-usage-report\": tls: failed to verify certificate: x509: certificate signed by unknown authority; Post \"https://stats.grafana.org/alloy-usage-report\": tls: failed to verify certificate: x509: certificate signed by unknown authority; Post \"https://stats.grafana.org/alloy-usage-report\": tls: failed to verify certificate: x509: certificate signed by unknown authority; Post \"https://stats.grafana.org/alloy-usage-report\": tls: failed to verify certificate: x509: certificate signed by unknown authority"

When:

selfReporting:
  enabled: true

then all Alloy instances show that certificate error. We have to set it to false, but does that mean the Azure components will not send us telemetry? How can we force it? Should we put them in a different Alloy instance?

We are also getting the error shown further below. Is the alloy-logs/otel destination not converting logs to gRPC? We have one destination defined for all telemetry, and our endpoint expects gRPC for everything:

destinations:
  - name: otlp
    type: otlp
    url: https://np-grpc.np-shared.com
podLogs:
  enabled: true
alloy-logs:
  enabled: true
  controller:
    podAnnotations: {kubernetes.azure.com/set-kube-service-host-fqdn: "true"}

Metrics are working fine but logs have this error:

ts=2024-11-27T19:55:35.505605326Z level=info msg="Exporting failed. Will retry the request after interval." component_path=/ component_id=otelcol.exporter.otlp.otlp error="rpc error: code = Unavailable desc = unexpected HTTP status code received from server: 503 (Service Unavailable); transport: received unexpected content-type \"text/html\"" interval=4.409419831s

@petewall
Collaborator

Hmm... is there anything interesting about your cluster setup that would prevent the certificates from being verified? Perhaps an AKS thing?

In the meantime, just add this to all of your alloy instances:

alloy-logs:
  alloy:
    enableReporting: false

That should silence the certificate errors.

As for:

selfReporting:
  enabled: true

That should only be injecting a small handful of metrics. It does not deliver those metrics anywhere other than metric-capable destinations you define.
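
Putting the two settings side by side may make the difference clearer. This is a minimal sketch, assuming the value names used earlier in this thread (`selfReporting.enabled` for the chart feature and `alloy.enableReporting` for Alloy's own usage report). It leaves the chart's self-reporting on, so that a metrics destination is loaded onto alloy-singleton, while turning off the usage report that produces the certificate errors:

```yaml
# Chart-level feature: emits a few static metrics to your own
# metric-capable destinations. Keeping it on should cause a metrics
# destination to be loaded onto alloy-singleton.
selfReporting:
  enabled: true

alloy-singleton:
  enabled: true
  alloy:
    # Alloy's own usage report to stats.grafana.org; this is the call
    # that fails with the x509 error and can be disabled.
    enableReporting: false
```

Merge this into your existing alloy-singleton block rather than replacing it, so the podAnnotations and extraConfig shown earlier are kept.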

cxk314 closed this as completed on Jan 2, 2025
@cxk314
Author

cxk314 commented Jan 2, 2025

Setting the reporting flag to false globally, as well as on each Alloy instance, stopped the reporting:

selfReporting:
  enabled: false

This should be set for every Alloy instance:

alloy-metrics:
  enabled: true
  alloy:
    enableReporting: false
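
As a rough sketch of what "every Alloy" could look like, here is the same flag applied to each collector named in this thread. Which instances are actually enabled in a given deployment is an assumption; include only the ones you run:

```yaml
# Disable Alloy's usage reporting on each collector instance.
alloy-metrics:
  enabled: true
  alloy:
    enableReporting: false

alloy-singleton:
  enabled: true
  alloy:
    enableReporting: false

alloy-logs:
  enabled: true
  alloy:
    enableReporting: false

alloy-receiver:
  enabled: true
  alloy:
    enableReporting: false
```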
