Compare commits

..

5 Commits

Author SHA1 Message Date
Mikkel Oscar Lyderik Larsen a72380125f Update SECURITY.md (#84)
Signed-off-by: Mikkel Oscar Lyderik Larsen <mikkel.larsen@zalando.de>
2019-10-22 16:11:29 +02:00
aermakov-zalando 70c7fb843d Merge pull request #83 from zalando-incubator/ingress-collector
Skipper: simplify metrics collection
2019-10-22 16:07:49 +02:00
Alexey Ermakov 79533a5a93 Skipper: simplify metrics collection
* Drop MaxWeightedCollector (we don't want max anyway, we want sum)
* Use Prometheus to add up all matching metrics and scale them; this
  has a nice side effect of ensuring that unused hostnames don't cause
  an error when collecting the metrics
* Update the tests a bit

Signed-off-by: Alexey Ermakov <alexey.ermakov@zalando.de>
2019-10-21 14:05:30 +02:00
Mikkel Oscar Lyderik Larsen 2765ff9811 Merge pull request #68 from zalando-incubator/skipper-collector-averagevalue
Add support for averageValue for request-per-second Skipper metric
2019-10-10 08:09:30 +02:00
Mikkel Oscar Lyderik Larsen 76d2f74743 Add support for averageValue for request-per-second Skipper metric
This adds support for `averageValue` for the `request-per-second` metric
based on Ingress Objects. This is only supported from Kubernetes
`>=v1.14` (https://github.com/kubernetes/kubernetes/pull/72872).

When defining the HPA with `autoscaling/v2beta1` you still need to
define `targetValue` even though it won't be used when `averageValue` is
set. Once we default to `autoscaling/v2beta2` this awkward API will be
gone.

Signed-off-by: Mikkel Oscar Lyderik Larsen <mikkel.larsen@zalando.de>
2019-10-08 17:10:28 +02:00
9 changed files with 234 additions and 283 deletions
+40 -18
View File
@@ -1,10 +1,13 @@
# kube-metrics-adapter
[![Build Status](https://travis-ci.org/zalando-incubator/kube-metrics-adapter.svg?branch=master)](https://travis-ci.org/zalando-incubator/kube-metrics-adapter)
[![Coverage Status](https://coveralls.io/repos/github/zalando-incubator/kube-metrics-adapter/badge.svg?branch=master)](https://coveralls.io/github/zalando-incubator/kube-metrics-adapter?branch=master)
Kube Metrics Adapter is a general purpose metrics adapter for Kubernetes that
can collect and serve custom and external metrics for Horizontal Pod
Autoscaling.
It supports scaling based on [Prometheus metrics](https://prometheus.io/), [SQS queues](https://aws.amazon.com/sqs/) and others out of the box.
It discovers Horizontal Pod Autoscaler resources, collects the requested
metrics, and stores them in memory. It's implemented using the
[custom-metrics-apiserver](https://github.com/kubernetes-incubator/custom-metrics-apiserver)
@@ -41,6 +44,18 @@ The `metric-config.*` annotations are used by the `kube-metrics-adapter` to
configure a collector for getting the metrics. In the above example it
configures a *json-path pod collector*.
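For orientation, a sketch of what such annotations can look like, following the `metric-config.<metricType>.<metricName>.<collectorName>/<configKey>` scheme (the metric name, endpoint path, and port here are illustrative, not taken from this diff):

```yaml
# Illustrative pod annotations for a json-path collector; metric name,
# path, and port are assumptions for the sake of the example.
metric-config.pods.requests-per-second.json-path/json-key: "$.http_server.rps"
metric-config.pods.requests-per-second.json-path/path: /metrics
metric-config.pods.requests-per-second.json-path/port: "9090"
```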
## Kubernetes compatibility
In line with the [support
policy](https://kubernetes.io/docs/setup/release/version-skew-policy/) offered
for Kubernetes, this project aims to support the latest three minor releases of
Kubernetes.
Currently the default supported API is `autoscaling/v2beta1`. However, we aim
to move to `autoscaling/v2beta2` (available since `v1.12`) in the near future,
as it brings many improvements over `v2beta1`. The move to `v2beta2` will most
likely happen as soon as [GKE adds support for it](https://issuetracker.google.com/issues/135624588).
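As a forward-looking sketch (not yet the default for this project), the same Object metric under `autoscaling/v2beta2` would make `averageValue` a first-class target type, roughly:

```yaml
# Hedged sketch of an autoscaling/v2beta2 metric entry; the ingress name
# is hypothetical, and this API version is not yet the default here.
metrics:
- type: Object
  object:
    describedObject:
      apiVersion: extensions/v1beta1
      kind: Ingress
      name: myapp
    metric:
      name: requests-per-second
    target:
      type: AverageValue
      averageValue: "10"
```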
## Building
This project uses [Go modules](https://github.com/golang/go/wiki/Modules) as
@@ -71,9 +86,9 @@ Currently only `json-path` collection is supported.
### Supported metrics
| Metric | Description | Type |
| ------------ | -------------- | ------- |
| *custom* | No predefined metrics. Metrics are generated from user defined queries. | Pods |
| Metric | Description | Type | K8s Versions |
| ------------ | -------------- | ------- | -- |
| *custom* | No predefined metrics. Metrics are generated from user defined queries. | Pods | `>=1.10` |
### Example
@@ -151,10 +166,10 @@ the trade-offs between the two approaches.
### Supported metrics
| Metric | Description | Type | Kind |
| ------------ | -------------- | ------- | -- |
| `prometheus-query` | Generic metric which requires a user defined query. | External | |
| *custom* | No predefined metrics. Metrics are generated from user defined queries. | Object | *any* |
| Metric | Description | Type | Kind | K8s Versions |
| ------------ | -------------- | ------- | -- | -- |
| `prometheus-query` | Generic metric which requires a user defined query. | External | | `>=1.10` |
| *custom* | No predefined metrics. Metrics are generated from user defined queries. | Object | *any* | `>=1.10` |
### Example: External Metric
@@ -259,9 +274,9 @@ box so users don't have to define those manually.
### Supported metrics
| Metric | Description | Type | Kind |
| ----------- | -------------- | ------ | ---- |
| `requests-per-second` | Scale based on requests per second for a certain ingress. | Object | `Ingress` |
| Metric | Description | Type | Kind | K8s Versions |
| ----------- | -------------- | ------ | ---- | ---- |
| `requests-per-second` | Scale based on requests per second for a certain ingress. | Object | `Ingress` | `>=1.14` (can work with `>=1.10`) |
### Example
@@ -288,7 +303,10 @@ spec:
apiVersion: extensions/v1beta1
kind: Ingress
name: myapp
targetValue: 10 # this will be treated as targetAverageValue
averageValue: 10 # Only works with Kubernetes >=1.14
# for Kubernetes <1.14 you can use `targetValue` instead:
targetValue: 10 # must be set, but is ignored when `averageValue` is defined;
                # otherwise it is treated as targetAverageValue
```
### Metric weighting based on backend
@@ -302,13 +320,17 @@ return the requests-per-second being sent to the `backend1`. The ingress annotat
the backend weights can be obtained can be specified through the flag `--skipper-backends-annotation`.
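A hedged sketch of how this fits together (the annotation name is an assumption; in practice it is whatever you pass to `--skipper-backends-annotation`):

```yaml
# On the Ingress: a traffic split between two backends. The annotation name
# zalando.org/backend-weights is an assumption for this example.
metadata:
  annotations:
    zalando.org/backend-weights: '{"backend1": 80, "backend2": 20}'
```

In the HPA, the backend is then selected by appending it to the metric name with the `,` separator, e.g. `requests-per-second,backend1`.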
**Note:** As of Kubernetes v1.10 the HPA does not support `targetAverageValue` for
**Note:** For Kubernetes `<v1.14` the HPA does not support `averageValue` for
metrics of type `Object`. In the case of requests per second it does not make
sense to scale on a summed value, because adding more pods cannot make the
total requests per second go down (for example, 100 rps across 5 pods is 20
rps per pod; doubling the pods halves the per-pod rate but leaves the sum at
100). For this reason the skipper collector automatically treats the value you
define in `targetValue` as an average per pod instead of a total sum.
**Only use `targetValue` if you are on Kubernetes `<1.14`; it is not as
precise as `averageValue` and will no longer be supported once Kubernetes
`v1.16` is released, according to the [support policy](https://kubernetes.io/docs/setup/release/version-skew-policy/).**
## AWS collector
The AWS collector allows scaling based on external metrics exposed by AWS
@@ -340,9 +362,9 @@ PolicyDocument:
### Supported metrics
| Metric | Description | Type |
| ------------ | ------- | -- |
| `sqs-queue-length` | Scale based on SQS queue length | External |
| Metric | Description | Type | K8s Versions |
| ------------ | ------- | -- | -- |
| `sqs-queue-length` | Scale based on SQS queue length | External | `>=1.10` |
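A hedged sketch of an HPA entry for this metric (the selector label keys are assumptions for this sketch):

```yaml
# Illustrative external metric entry; the queue-name and region label keys
# are assumptions, as is the target value.
- type: External
  external:
    metricName: sqs-queue-length
    metricSelector:
      matchLabels:
        queue-name: my-queue
        region: eu-central-1
    targetAverageValue: 30
```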
### Example
@@ -388,9 +410,9 @@ The ZMON collector allows scaling based on external metrics exposed by
### Supported metrics
| Metric | Description | Type |
| ------------ | ------- | -- |
| `zmon-check` | Scale based on any ZMON check results | External |
| Metric | Description | Type | K8s Versions |
| ------------ | ------- | -- | -- |
| `zmon-check` | Scale based on any ZMON check results | External | `>=1.10` |
### Example
+5 -4
View File
@@ -1,7 +1,8 @@
We acknowledge that every line of code that we write may potentially contain security issues.
We are trying to deal with it responsibly and provide patches as quickly as possible.
We are trying to deal with it responsibly and provide patches as quickly as possible. If you have anything to report to us, please use the following channels:
We host our bug bounty program on HackerOne. It is currently private, so if you would like to report a vulnerability and get rewarded for it, please ask to join our program by filling out this form:
Email: Tech-Security@zalando.de
OR
Submit your vulnerability report through our bug bounty program at: https://hackerone.com/zalando
https://corporate.zalando.com/en/services-and-contact#security-form
You can also send your report via this form if you do not want to join our bug bounty program and just want to report a vulnerability or security issue.
+2 -1
View File
@@ -37,7 +37,8 @@ spec:
apiVersion: extensions/v1beta1
kind: Ingress
name: custom-metrics-consumer
targetValue: 10 # this will be treated as targetAverageValue
averageValue: 10
targetValue: 10 # this must be set, but has no effect if `averageValue` is defined.
- type: External
external:
metricName: sqs-queue-length
+2
View File
@@ -165,6 +165,7 @@ type MetricConfig struct {
ObjectReference custom_metrics.ObjectReference
PerReplica bool
Interval time.Duration
MetricSpec autoscalingv2.MetricSpec
}
// ParseHPAMetrics parses the HPA object into a list of metric configurations.
@@ -206,6 +207,7 @@ func ParseHPAMetrics(hpa *autoscalingv2.HorizontalPodAutoscaler) ([]*MetricConfi
MetricTypeName: typeName,
ObjectReference: ref,
Config: map[string]string{},
MetricSpec: metric,
}
if metric.Type == autoscalingv2.ExternalMetricSourceType &&
-67
View File
@@ -1,67 +0,0 @@
package collector
import (
"fmt"
"strings"
"time"
"k8s.io/apimachinery/pkg/api/resource"
)
// MaxWeightedCollector is a simple aggregator collector that returns the maximum value
// of metrics from all collectors.
type MaxWeightedCollector struct {
collectors []Collector
interval time.Duration
weight float64
}
// NewMaxWeightedCollector initializes a new MaxWeightedCollector.
func NewMaxWeightedCollector(interval time.Duration, weight float64, collectors ...Collector) *MaxWeightedCollector {
return &MaxWeightedCollector{
collectors: collectors,
interval: interval,
weight: weight,
}
}
// GetMetrics gets metrics from all collectors and returns the highest value.
func (c *MaxWeightedCollector) GetMetrics() ([]CollectedMetric, error) {
errors := make([]error, 0)
collectedMetrics := make([]CollectedMetric, 0)
for _, collector := range c.collectors {
values, err := collector.GetMetrics()
if err != nil {
if _, ok := err.(NoResultError); ok {
errors = append(errors, err)
continue
}
return nil, err
}
collectedMetrics = append(collectedMetrics, values...)
}
if len(collectedMetrics) == 0 {
if len(errors) == 0 {
return nil, fmt.Errorf("no metrics collected, cannot determine max")
}
errorStrings := make([]string, len(errors))
for i, e := range errors {
errorStrings[i] = e.Error()
}
allErrors := strings.Join(errorStrings, ",")
return nil, fmt.Errorf("could not determine maximum due to errors: %s", allErrors)
}
max := collectedMetrics[0]
for _, value := range collectedMetrics {
if value.Custom.Value.MilliValue() > max.Custom.Value.MilliValue() {
max = value
}
}
max.Custom.Value = *resource.NewMilliQuantity(int64(c.weight*float64(max.Custom.Value.MilliValue())), resource.DecimalSI)
return []CollectedMetric{max}, nil
}
// Interval returns the interval at which the collector should run.
func (c *MaxWeightedCollector) Interval() time.Duration {
return c.interval
}
-99
View File
@@ -1,99 +0,0 @@
package collector
import (
"fmt"
"testing"
"time"
"github.com/stretchr/testify/require"
"k8s.io/apimachinery/pkg/api/resource"
"k8s.io/metrics/pkg/apis/custom_metrics"
)
type dummyCollector struct {
value int64
}
func (c dummyCollector) Interval() time.Duration {
return time.Second
}
func (c dummyCollector) GetMetrics() ([]CollectedMetric, error) {
switch c.value {
case 0:
return nil, NoResultError{query: "invalid query"}
case -1:
return nil, fmt.Errorf("test error")
default:
quantity := resource.NewQuantity(c.value, resource.DecimalSI)
return []CollectedMetric{
{
Custom: custom_metrics.MetricValue{
Value: *quantity,
},
},
}, nil
}
}
func TestMaxCollector(t *testing.T) {
for _, tc := range []struct {
name string
values []int64
expected int
weight float64
errored bool
}{
{
name: "basic",
values: []int64{100, 10, 9},
expected: 100,
weight: 1,
errored: false,
},
{
name: "weighted",
values: []int64{100, 10, 9},
expected: 20,
weight: 0.2,
errored: false,
},
{
name: "with error",
values: []int64{10, 9, -1},
weight: 0.5,
errored: true,
},
{
name: "some invalid results",
values: []int64{0, 1, 0, 10, 9},
expected: 5,
weight: 0.5,
errored: false,
},
{
name: "both invalid results and errors",
values: []int64{0, 1, 0, -1, 10, 9},
weight: 0.5,
errored: true,
},
} {
t.Run(tc.name, func(t *testing.T) {
collectors := make([]Collector, len(tc.values))
for i, v := range tc.values {
collectors[i] = dummyCollector{value: v}
}
wc := NewMaxWeightedCollector(time.Second, tc.weight, collectors...)
metrics, err := wc.GetMetrics()
if tc.errored {
require.Error(t, err)
} else {
require.NoError(t, err)
require.Len(t, metrics, 1)
require.EqualValues(t, tc.expected, metrics[0].Custom.Value.Value())
}
})
}
}
+2 -1
View File
@@ -3,6 +3,7 @@ package collector
import (
"context"
"fmt"
"math"
"net/http"
"time"
@@ -146,7 +147,7 @@ func (c *PrometheusCollector) GetMetrics() ([]CollectedMetric, error) {
sampleValue = scalar.Value
}
if sampleValue.String() == "NaN" {
if math.IsNaN(float64(sampleValue)) {
return nil, &NoResultError{query: c.query}
}
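For context, a minimal runnable sketch of why the numeric check is more robust than comparing the formatted string (it relies only on `github.com/prometheus/common/model`, which the collector already uses and whose `SampleValue` is a plain `float64`):

```go
package main

import (
	"fmt"
	"math"

	"github.com/prometheus/common/model"
)

func main() {
	v := model.SampleValue(math.NaN())
	fmt.Println(v.String() == "NaN")    // true, but depends on string formatting
	fmt.Println(math.IsNaN(float64(v))) // true: direct numeric check, no formatting involved
}
```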
+40 -59
View File
@@ -5,6 +5,7 @@ import (
"errors"
"fmt"
"math"
"regexp"
"strings"
"time"
@@ -16,7 +17,7 @@ import (
)
const (
rpsQuery = `scalar(sum(rate(skipper_serve_host_duration_seconds_count{host="%s"}[1m])))`
rpsQuery = `scalar(sum(rate(skipper_serve_host_duration_seconds_count{host=~"%s"}[1m])) * %.4f)`
rpsMetricName = "requests-per-second"
rpsMetricBackendSeparator = ","
)
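For illustration, with two ingress hosts `myapp.example.org` and `alt.example.org` (dots rewritten to underscores, as the collector does) and a backend weight of 0.5, the new format string expands to a single query like:

```
scalar(sum(rate(skipper_serve_host_duration_seconds_count{host=~"myapp_example_org|alt_example_org"}[1m])) * 0.5000)
```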
@@ -72,8 +73,7 @@ type SkipperCollector struct {
}
// NewSkipperCollector initializes a new SkipperCollector.
func NewSkipperCollector(client kubernetes.Interface, plugin CollectorPlugin, hpa *autoscalingv2.HorizontalPodAutoscaler,
config *MetricConfig, interval time.Duration, backendAnnotations []string, backend string) (*SkipperCollector, error) {
func NewSkipperCollector(client kubernetes.Interface, plugin CollectorPlugin, hpa *autoscalingv2.HorizontalPodAutoscaler, config *MetricConfig, interval time.Duration, backendAnnotations []string, backend string) (*SkipperCollector, error) {
return &SkipperCollector{
client: client,
objectReference: config.ObjectReference,
@@ -125,59 +125,36 @@ func getWeights(ingressAnnotations map[string]string, backendAnnotations []strin
// getCollector returns a collector for getting the metrics.
func (c *SkipperCollector) getCollector() (Collector, error) {
var annotations map[string]string
var hosts []string
switch c.objectReference.APIVersion {
case "extensions/v1beta1":
ingress, err := c.client.ExtensionsV1beta1().Ingresses(c.objectReference.Namespace).Get(c.objectReference.Name, metav1.GetOptions{})
if err != nil {
return nil, err
}
annotations = ingress.Annotations
for _, rule := range ingress.Spec.Rules {
hosts = append(hosts, rule.Host)
}
case "networking.k8s.io/v1beta1":
ingress, err := c.client.NetworkingV1beta1().Ingresses(c.objectReference.Namespace).Get(c.objectReference.Name, metav1.GetOptions{})
if err != nil {
return nil, err
}
annotations = ingress.Annotations
for _, rule := range ingress.Spec.Rules {
hosts = append(hosts, rule.Host)
}
ingress, err := c.client.ExtensionsV1beta1().Ingresses(c.objectReference.Namespace).Get(c.objectReference.Name, metav1.GetOptions{})
if err != nil {
return nil, err
}
backendWeight, err := getWeights(annotations, c.backendAnnotations, c.backend)
backendWeight, err := getWeights(ingress.Annotations, c.backendAnnotations, c.backend)
if err != nil {
return nil, err
}
config := c.config
var collector Collector
collectors := make([]Collector, 0, len(hosts))
for _, host := range hosts {
host := strings.Replace(host, ".", "_", -1)
config.Config = map[string]string{
"query": fmt.Sprintf(rpsQuery, host),
}
config.PerReplica = false // per replica is handled outside of the prometheus collector
collector, err := c.plugin.NewCollector(c.hpa, &config, c.interval)
if err != nil {
return nil, err
}
collectors = append(collectors, collector)
var escapedHostnames []string
for _, rule := range ingress.Spec.Rules {
escapedHostnames = append(escapedHostnames, regexp.QuoteMeta(strings.Replace(rule.Host, ".", "_", -1)))
}
if len(collectors) > 0 {
collector = NewMaxWeightedCollector(c.interval, backendWeight, collectors...)
} else {
if len(escapedHostnames) == 0 {
return nil, fmt.Errorf("no hosts defined on ingress %s/%s, unable to create collector", c.objectReference.Namespace, c.objectReference.Name)
}
config.Config = map[string]string{
"query": fmt.Sprintf(rpsQuery, strings.Join(escapedHostnames, "|"), backendWeight),
}
config.PerReplica = false // per replica is handled outside of the prometheus collector
collector, err := c.plugin.NewCollector(c.hpa, &config, c.interval)
if err != nil {
return nil, err
}
return collector, nil
}
@@ -197,22 +174,26 @@ func (c *SkipperCollector) GetMetrics() ([]CollectedMetric, error) {
return nil, fmt.Errorf("expected to only get one metric value, got %d", len(values))
}
// get current replicas for the targeted scale object. This is used to
// calculate an average metric instead of total.
// targetAverageValue will be available in Kubernetes v1.12
// https://github.com/kubernetes/kubernetes/pull/64097
replicas, err := targetRefReplicas(c.client, c.hpa)
if err != nil {
return nil, err
}
if replicas < 1 {
return nil, fmt.Errorf("unable to get average value for %d replicas", replicas)
}
value := values[0]
avgValue := float64(value.Custom.Value.MilliValue()) / float64(replicas)
value.Custom.Value = *resource.NewMilliQuantity(int64(avgValue), resource.DecimalSI)
// For Kubernetes <v1.14 we have to fall back to manual average
if c.config.MetricSpec.Object.Target.AverageValue == nil {
// get current replicas for the targeted scale object. This is used to
// calculate an average metric instead of total.
// targetAverageValue will be available in Kubernetes v1.12
// https://github.com/kubernetes/kubernetes/pull/64097
replicas, err := targetRefReplicas(c.client, c.hpa)
if err != nil {
return nil, err
}
if replicas < 1 {
return nil, fmt.Errorf("unable to get average value for %d replicas", replicas)
}
avgValue := float64(value.Custom.Value.MilliValue()) / float64(replicas)
value.Custom.Value = *resource.NewMilliQuantity(int64(avgValue), resource.DecimalSI)
}
return []CollectedMetric{value}, nil
}
File diff suppressed because it is too large