
Using external metrics from the Kubernetes API server #1235

@jonnylangefeld

Description


Describe the feature

We are currently using flagger with Datadog metrics. Even though we don't have a very large number of Canary resources or overly frequent Datadog requests (we calculated 1860 requests per hour), flagger hits the Datadog rate limit:

Warning  Synced  18m (x11 over 8h)      flagger  Metric query failed for sli: error response: {"errors": ["Rate limit of 6000 requests in 3600 seconds reached. Please try again later."]}: %!w(<nil>)

One would assume that Datadog's officially supported Kubernetes external metrics feature would run into the same limit, but the secret sauce there is batched metrics requests, as described here.

Proposed solution

Kubernetes exposes external metrics via API endpoints, such as this:

```
$ kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/ingress/gcp.pubsub.subscription.num_undelivered_messages" | jq
{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {},
  "items": [
    {
      "metricName": "gcp.pubsub.subscription.num_undelivered_messages",
      "metricLabels": {
        "project_id": "123",
        "subscription_id": "abc"
      },
      "timestamp": "2022-07-14T20:24:26Z",
      "value": "0"
    }
  ]
}
```
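For illustration, here is a minimal sketch of decoding that `ExternalMetricValueList` payload in Go. The structs and the `parseValues` helper are hand-rolled stand-ins for this example only; a real integration would use the typed client in `k8s.io/metrics/pkg/client/external_metrics` instead:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Minimal stand-ins for the external.metrics.k8s.io/v1beta1 response types.
type ExternalMetricValueList struct {
	Kind       string                `json:"kind"`
	APIVersion string                `json:"apiVersion"`
	Items      []ExternalMetricValue `json:"items"`
}

type ExternalMetricValue struct {
	MetricName   string            `json:"metricName"`
	MetricLabels map[string]string `json:"metricLabels"`
	Timestamp    string            `json:"timestamp"`
	Value        string            `json:"value"`
}

// Sample response copied from the kubectl example above.
var sample = []byte(`{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {},
  "items": [
    {
      "metricName": "gcp.pubsub.subscription.num_undelivered_messages",
      "metricLabels": {"project_id": "123", "subscription_id": "abc"},
      "timestamp": "2022-07-14T20:24:26Z",
      "value": "0"
    }
  ]
}`)

// parseValues maps each metric name in the list to its raw value string.
func parseValues(data []byte) (map[string]string, error) {
	var list ExternalMetricValueList
	if err := json.Unmarshal(data, &list); err != nil {
		return nil, err
	}
	out := make(map[string]string, len(list.Items))
	for _, item := range list.Items {
		out[item.MetricName] = item.Value
	}
	return out, nil
}

func main() {
	vals, err := parseValues(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(vals["gcp.pubsub.subscription.num_undelivered_messages"]) // prints "0"
}
```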

Would it be possible to let flagger query these endpoints the same way a HorizontalPodAutoscaler does?
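As a sketch of what such a query might look like, the helper below builds the raw API server path for an external metric. The function name is hypothetical; the path format comes from the kubectl example above, and metric selectors (if any) would go in a `labelSelector` query parameter, assuming the same convention the metrics APIs use:

```go
package main

import (
	"fmt"
	"net/url"
)

// externalMetricPath builds the raw API server path for an external metric,
// matching the path shown in the kubectl example. An optional label selector
// is appended as a labelSelector query parameter.
func externalMetricPath(namespace, metric, selector string) string {
	p := fmt.Sprintf("/apis/external.metrics.k8s.io/v1beta1/namespaces/%s/%s",
		url.PathEscape(namespace), url.PathEscape(metric))
	if selector != "" {
		p += "?labelSelector=" + url.QueryEscape(selector)
	}
	return p
}

func main() {
	// prints /apis/external.metrics.k8s.io/v1beta1/namespaces/ingress/gcp.pubsub.subscription.num_undelivered_messages
	fmt.Println(externalMetricPath("ingress",
		"gcp.pubsub.subscription.num_undelivered_messages", ""))
}
```

Flagger could issue this request through its existing in-cluster REST client, letting the metrics adapter (e.g. the Datadog Cluster Agent) handle batching and caching upstream.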

Is there another way to solve this problem that isn't as good a solution?

  • Implement our own proxy that does batch queries just like the Datadog Cluster Agent.
