Prometheus Intermediate Cheat Sheet 2026 | Advanced PromQL + Recording Rules Guide
A complete intermediate Prometheus reference: production-ready advanced PromQL queries, a recording rules tutorial, cardinality troubleshooting, and federation and optimization best practices. An encyclopedic reference for operators and SREs.
Last Update: 2025-12-03 - Created: 2025-12-03
Quick Start with Prometheus intermediate
Production-ready compilation flags and build commands
Advanced PromQL: QUICK START (5s)
Returns instant results (from a pre-computed series) or a calculated vector with the job dimension. Learn more in the Advanced Vector Matching section.
When to Use Prometheus intermediate
Decision matrix for choosing the right technology
IDEAL USE CASES
Large-scale multi-cluster monitoring with Prometheus federation across multiple data centers
High-cardinality metric optimization for >10M time series with advanced label strategies and relabeling
Complex alerting workflows with multi-condition rules, dependency tracking, and dynamic routing to multiple receivers
AVOID FOR
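Real-time push-based metrics (Prometheus uses a pull model; the Pushgateway covers short-lived batch jobs, and remote-write forwards already-scraped samples to remote storage rather than accepting pushes from applications)
Unbounded label dimensions (e.g., user IDs) left unfiltered by metric_relabel_configs, which can cause memory exhaustion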
Single-cluster deployments without high availability; federation overhead not justified
Core Concepts of Prometheus intermediate
Advanced Vector Matching: on() & group_left() Operators
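Vector matching pairs series from two vectors whose label sets differ. 'on()' restricts matching to the listed labels ('ignoring()' excludes labels instead). 'group_left' marks the left-hand vector as the "many" side of a many-to-one match; any labels listed inside 'group_left()' are copied from the right-hand ("one") side into the result. Critical for multi-dimensional calculations like error_rate = errors/total.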
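The matching modifiers can be sketched as expressions in a rule file. This is a minimal illustration, not a definitive setup: the metric names (http_requests_total, app_build_info), the status and version labels, and the rule names are all assumptions chosen for the example.

```yaml
# Sketch only: metric, label, and rule names below are hypothetical.
groups:
  - name: vector_matching_examples
    rules:
      # One-to-one: both sides are aggregated to (job, instance), so each
      # error series pairs with exactly one request series.
      - record: job_instance:http_errors:ratio_rate5m
        expr: |
          sum by (job, instance) (rate(http_requests_total{status=~"5.."}[5m]))
          / on (job, instance)
          sum by (job, instance) (rate(http_requests_total[5m]))

      # Many-to-one: many per-instance series on the left are divided by a
      # single per-job total on the right; group_left marks the left side
      # as the "many" side.
      - record: job_instance:http_requests:share_of_job
        expr: |
          sum by (job, instance) (rate(http_requests_total[5m]))
          / on (job) group_left
          sum by (job) (rate(http_requests_total[5m]))

      # Label copy: group_left(version) copies the version label from the
      # right-hand info metric (assumed to have value 1) onto each result.
      - record: job_instance:up:with_version
        expr: |
          up * on (job, instance) group_left (version) app_build_info
```

In the second rule the result keeps the left-hand labels (job, instance), which is why the higher-cardinality vector belongs on the left when group_left is used.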
Dimension mismatch: sum by (job, instance) (errors) / sum by (job) (requests) returns an empty result because the label sets on the two sides don't align
Use 'on' clause: sum by (job, instance) (errors) / on(job) group_left sum by (job) (requests)
Recording Rules: Pre-compute Expensive Queries for Dashboard Speed
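Recording rules are evaluated at a fixed interval (the global evaluation_interval or a per-group interval, e.g., 15s), pre-computing expensive queries and storing the results as new time series. Dashboards query the pre-computed series instead of re-running the expensive calculation on every refresh. This can reduce query latency from seconds (2-5s) to tens of milliseconds (10-50ms), depending on the query and data volume.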
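A minimal rule-file sketch of this pattern follows. The recorded name job_cpu:usage:rate5m is the one used in the federation example on this page; the source metric process_cpu_seconds_total, the group name, and the 15s interval are assumptions for illustration.

```yaml
# Sketch only: group name and source metric are assumptions.
groups:
  - name: cpu_recording_rules
    # Per-group override; when omitted, global.evaluation_interval applies.
    interval: 15s
    rules:
      # Naming follows the level:metric:operations convention.
      - record: job_cpu:usage:rate5m
        expr: sum by (job) (rate(process_cpu_seconds_total[5m]))
```

A dashboard panel would then query job_cpu:usage:rate5m directly instead of the full expression. The file can be validated before loading with promtool check rules followed by the file path.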
Prometheus Federation: Multi-Cluster Aggregation & Hierarchical Scraping
Federation allows a parent Prometheus to scrape the /federate endpoint of child Prometheus instances. It enables hierarchical monitoring, cross-cluster alerting, and isolation of scrape load. The parent aggregates metrics from multiple clusters for org-wide dashboards.
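On the parent, a federation job might look like the sketch below. The target hostnames, job name, and scrape interval are hypothetical; the match[] selectors restrict the scrape to recorded, aggregated series rather than every raw metric on the children.

```yaml
# Sketch only: hostnames, job name, and interval are hypothetical.
scrape_configs:
  - job_name: federate
    scrape_interval: 30s
    # Keep the job/instance labels exposed by the child instead of
    # overwriting them with the parent's own target labels.
    honor_labels: true
    metrics_path: /federate
    params:
      'match[]':
        - 'job_cpu:usage:rate5m'          # one specific recorded series
        - '{__name__=~"job:.*"}'          # any series recorded at job level
    static_configs:
      - targets:
          - prometheus-cluster-a.example.internal:9090
          - prometheus-cluster-b.example.internal:9090
```

Each match[] value is an instant vector selector, and a series is returned if it matches any of them.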
Scraping all child metrics into the parent, causing a cardinality explosion in the parent
Use federation with the match[] parameter: /federate?match[]=job_cpu:usage:rate5m (scrape only aggregated metrics)
Label Relabeling: Transform Labels with Regex-Based replace & Metric Renaming
metric_relabel_configs rewrites labels after the scrape but before ingestion, using regex patterns. It can drop high-cardinality labels or whole series, rename labels, and copy values between labels. This prevents cardinality explosion and enables label normalization across different exporters.
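The operations described above can be sketched in one scrape job. The job name, target, metric names, and label names (request_id, pod_name, pod) are hypothetical; the actions (drop, labeldrop, replace) are the standard relabel actions.

```yaml
# Sketch only: job, target, metric, and label names are hypothetical.
scrape_configs:
  - job_name: api
    static_configs:
      - targets: ['api.example.internal:8080']
    metric_relabel_configs:
      # Drop an entire high-cardinality metric before it is stored.
      - source_labels: [__name__]
        regex: 'http_request_duration_seconds_bucket'
        action: drop

      # Copy pod_name into pod. The (.+) regex only matches non-empty
      # values, so series without pod_name keep any existing pod label.
      - source_labels: [pod_name]
        regex: '(.+)'
        target_label: pod
        replacement: '$1'
        action: replace

      # Remove labels by name. The remaining labels must still identify
      # each series uniquely, or samples will collide.
      - regex: 'request_id|pod_name'
        action: labeldrop

      # Rename a metric by rewriting the reserved __name__ label.
      - source_labels: [__name__]
        regex: 'legacy_requests_total'
        target_label: __name__
        replacement: 'http_requests_total'
        action: replace
```

Rules run in order, so the copy into pod is placed before the labeldrop that removes pod_name.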
Applying relabeling to a low-cardinality metric instead of the high-cardinality source
Apply metric_relabel_configs at the source: put the rules in the scrape job that emits the high-cardinality metrics and match those metrics specifically
Query Optimization: Binary Search with Subqueries & Caching
Subqueries run the inner query at regular steps across a time range, enabling sliding-window aggregations and time-lag comparisons. Combined with caching via recording rules, this reduces repeated query computation.
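As a rough sketch, the two approaches can be compared side by side in a rule file. The first rule uses a subquery over the raw expression; the second assumes job_cpu:usage:rate5m is already recorded elsewhere and reads it with a plain range selector. The source metric and all rule names are assumptions for illustration.

```yaml
# Sketch only: source metric and rule names are hypothetical.
groups:
  - name: subquery_examples
    rules:
      # Subquery form: [1h:1m] re-evaluates the inner expression at
      # 1-minute steps over the last hour, then takes the peak.
      - record: job_cpu:usage:rate5m_max1h_subquery
        expr: |
          max_over_time(
            sum by (job) (rate(process_cpu_seconds_total[5m]))[1h:1m]
          )

      # Cached form: the inner expression is already stored as
      # job_cpu:usage:rate5m, so a plain range selector is enough.
      - record: job_cpu:usage:rate5m_max1h
        expr: max_over_time(job_cpu:usage:rate5m[1h])

      # Time-lag comparison against the same recorded series one week ago.
      - record: job_cpu:usage:ratio_vs_1w
        expr: job_cpu:usage:rate5m / job_cpu:usage:rate5m offset 1w
```

The cached form avoids recomputing the inner rate for every subquery step, at the cost of being limited to the resolution at which the recording rule was evaluated.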