Labels¶
Prometheus labels are key-value pairs attached to metrics. They add context, information, and metadata to the metrics and make it possible to identify, filter, and categorize them.
Example
metric_name{label1="value1", label2="value2"} value
Metric labels¶
These labels are generated by the application that exposes the metrics:
- they are embedded in the metric data itself
- they describe the metric
They carry a high cardinality risk.
Target labels¶
These labels are added during the scraping process. They describe where the metrics came from, not what they represent.
They carry a low cardinality risk.
Core Target Label: job¶
The job label identifies the scrape configuration.
With the Prometheus Operator, the job label can be given the value of a Kubernetes label on the associated pod (or service). This is done with spec.jobLabel in a PodMonitor or ServiceMonitor resource.
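A minimal sketch of how this could look in a PodMonitor (the resource name, label keys, and port name are hypothetical):
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: example-app                    # hypothetical name
spec:
  jobLabel: app.kubernetes.io/name     # pod label whose value becomes the job label
  selector:
    matchLabels:
      app.kubernetes.io/name: example-app
  podMetricsEndpoints:
  - port: metrics                      # assumes the pods expose a port named "metrics"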
Core Target Label: instance¶
The instance label is the host and port of the target.
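For example, a target scraped at api1:8080 under the api-servers job would produce series such as (hypothetical values):
up{job="api-servers", instance="api1:8080"} 1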
More information about jobs and instances: https://prometheus.io/docs/concepts/jobs_instances/
Internal labels (__)¶
Internal labels (prefixed with __) are temporary labels used by Prometheus during the scraping process. They are not stored with the metrics but control how scraping happens.
Examples:
__address__           # the target's original address (host:port)
__metrics_path__      # the endpoint path to scrape
__scheme__            # the protocol to use (e.g. "http")
__param_*             # query parameters for the scrape request
__meta_*              # service discovery metadata
__meta_kubernetes_*   # Kubernetes discovery metadata
__meta_consul_*       # Consul discovery metadata
__meta_ec2_*          # EC2 discovery metadata
With relabeling (relabel_configs) it is possible to use internal labels to create permanent labels, as shown in the sketch below.
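A minimal sketch, assuming Kubernetes service discovery, of a relabel_configs block that copies __meta_* discovery labels into permanent target labels:
relabel_configs:
- source_labels: [__meta_kubernetes_namespace]
  target_label: namespace        # becomes a permanent label on all series from this target
- source_labels: [__meta_kubernetes_pod_name]
  target_label: pod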
There are also other informational labels that are not true internal labels, so they cannot be used in relabeling:
__scrape_interval__ # Shows configured interval
__scrape_timeout__ # Shows configured timeout
Sources of Target Labels¶
Target labels can come from several sources:
- using a static configuration:
    - job_name: 'api-servers'
      static_configs:
      - targets: ['api1:8080', 'api2:8080']
        labels:
          env: 'production'
          team: 'backend'
- using Service Discovery (Kubernetes, Consul, etc.):
    - job_name: 'kubernetes-pods'
      kubernetes_sd_configs:
      - role: pod
  Auto-discovers targets and adds labels like:
    {
      job="kubernetes-pods",
      instance="10.244.0.15:8080",
      __meta_kubernetes_pod_name="webapp-123",
      __meta_kubernetes_namespace="production"
    }
- using ServiceMonitor/PodMonitor resources from the Prometheus Operator (see the sketch below)
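A minimal ServiceMonitor sketch (names are hypothetical); the Prometheus Operator translates it into a scrape configuration with the corresponding target labels:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app              # hypothetical
spec:
  selector:
    matchLabels:
      app: example-app           # selects Services carrying this label
  endpoints:
  - port: http-metrics           # named port on the Service
    interval: 30s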
honorLabels¶
honorLabels controls how Prometheus handles label conflicts when scraping metrics.
Default behavior (honorLabels: false):
- Prometheus prefers its own labels; conflicting labels from the target are renamed
- Target's job label gets renamed to exported_job
- Target's instance label gets renamed to exported_instance
- Prometheus uses its own job and instance values
With honorLabels: true:
- Prometheus keeps the target's original labels
- Target's labels take precedence over Prometheus labels
- No renaming happens
Example: a target exposes my_metric{job="custom-job", instance="app-1"}
With honorLabels: false (the default), Prometheus stores:
my_metric{job="serviceMonitor/namespace/name", instance="10.0.0.1:8080", exported_job="custom-job", exported_instance="app-1"}
With honorLabels: true, Prometheus stores:
my_metric{job="custom-job", instance="app-1"}
Use honorLabels: true:
- When targets provide meaningful job/instance labels
- For federation setups
- When you want to preserve application-defined labels
Common use case: Kubernetes service discovery often uses honorLabels: true to preserve pod labels as metric labels.
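A hedged sketch of how this can be set on a ServiceMonitor endpoint (the equivalent field in a plain Prometheus scrape config is honor_labels; the port name is hypothetical):
endpoints:
- port: http-metrics
  honorLabels: true        # keep the job/instance labels exposed by the target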
Cardinality risk¶
Cardinality is the number of unique label combinations for a metric.
The problem is that each unique label combination creates a separate time series, and Prometheus stores each series individually.
Cardinality risk arises when you create too many unique time series, causing Prometheus performance and storage problems; it can crash or severely degrade your monitoring system.
High Cardinality Risk is when too many label combinations create too many time series. This can be caused by:
- Unique Identifiers:
requests{user_id="abc123"} # millions of users
requests{request_id="xyz789"} # every request unique
requests{timestamp="1634567890"} # every second unique
- Unbounded Values:
response_time{url="/user/12345/profile"} # infinite URLs
errors{error_msg="Connection timeout"} # many error messages
Possible solutions are:
- Use Bounded Labels (limited values)
requests{user_type="premium"} # premium, basic, free
requests{status_class="4xx"} # 2xx, 3xx, 4xx, 5xx
- Group/Aggregate:
requests{region="us-east", user_tier="paid"}
- Use Histograms for Ranges (instead of exact response times)
http_request_duration_bucket{le="0.1"} # predefined buckets
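High-cardinality labels can also be dropped at scrape time with metric relabeling (metricRelabelings in a ServiceMonitor/PodMonitor, metric_relabel_configs in a plain scrape config). A minimal sketch, assuming a hypothetical user_id label:
metric_relabel_configs:
- action: labeldrop
  regex: user_id           # drop this label from every scraped series before storage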
Links¶
https://github.com/prometheus-operator/prometheus-operator/issues/3246 (relabelings vs metricRelabelings)