Prometheus Stack

Prometheus is an open-source systems monitoring and alerting toolkit that collects and stores metrics as time series data. This means each metric value is stored with the timestamp at which it was recorded, along with optional key-value pairs called labels.
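As a rough illustration of that data model, the sketch below represents a single sample and renders it in the Prometheus text format. The class name `Sample`, its field names, and the example metric are hypothetical and chosen only for this illustration; escaping of special characters in label values is left out.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class Sample:
    """One point of a time series: a value recorded at a timestamp."""

    metric: str        # metric name, e.g. "http_requests_total"
    value: float       # the measured value
    timestamp_ms: int  # when the value was recorded (Unix time, milliseconds)
    labels: Dict[str, str] = field(default_factory=dict)  # optional key-value pairs

    def series_id(self) -> str:
        """The metric name plus its labels identifies the time series."""
        if not self.labels:
            return self.metric
        # Sort labels so the same label set always yields the same identifier.
        pairs = ",".join(f'{k}="{v}"' for k, v in sorted(self.labels.items()))
        return f"{self.metric}{{{pairs}}}"

    def to_text(self) -> str:
        """Render as `name{labels} value timestamp` (label escaping omitted)."""
        return f"{self.series_id()} {self.value} {self.timestamp_ms}"
```

Two samples with the same metric name and the same labels belong to the same time series; changing any label value produces a different series.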

The stack consists of four main services:

  • Prometheus – Collects metrics and triggers alerts.
  • AlertManager – Sends alerts to users.
  • Prometheus Push Gateway – Accepts metrics pushed over its HTTP API.
  • Prometheus Black Box – Allows black-box probing of endpoints over HTTP, HTTPS, and other protocols.

Deployment

The Prometheus Stack runs on a Kubernetes cluster (currently a single node) in the ORC cloud. It is deployed via a CI/CD job using a Helm chart.

To run this job, go to GitLab Pipelines and execute the prometheus or prometheus push gateway job stage.

Note
  • The SSH key for the VM can be found in GitLab CI/CD Variables. You may need to request access.
  • The Kubernetes config file is also available in GitLab CI/CD Variables.

Host

The Kubernetes cluster is provisioned through our infrastructure repository. The VM's IP address can be found in the Ansible variable prometheus_push_gateway.

The current infrastructure repository lacks proper support and documentation. In the future, it should be replaced by a monorepo or transitioned to a production Kubernetes cluster.

Deployment Details

Deployment is managed through our deployment monorepo in the monitoring folder.

There are three key files to update:

Take a look at how the existing alerts are defined, then modify them or add new ones in the same manner.

Prometheus Push Gateway

The Pushgateway is an intermediary service for pushing metrics from jobs that cannot be scraped directly.

The Pushgateway runs as a web service (see address here) and provides a dashboard for viewing the pushed metrics. Instead of exposing their own /metrics endpoint, jobs push their metrics to the Pushgateway, and Prometheus then scrapes them from there.

We use it to send results from system and tool tests using the prometheus_client Python package.
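To make the push step more concrete, here is a minimal standard-library sketch of the HTTP request that reaches the Pushgateway. The prometheus_client function `push_to_gateway` sends a comparable PUT request for us, so this is an illustration of the wire format rather than our actual test code. The gateway address, job name, and metric names are placeholders, and the sketch assumes job and label values without a `/` in them.

```python
import urllib.request
from typing import Dict, Optional
from urllib.parse import quote


def build_push_request(
    gateway_url: str,
    job: str,
    metrics: Dict[str, float],
    grouping: Optional[Dict[str, str]] = None,
) -> urllib.request.Request:
    """Build a PUT request that replaces all metrics of one job group."""
    # Pushgateway groups metrics by the path: /metrics/job/<job>/<label>/<value>...
    path = "/metrics/job/" + quote(job, safe="")
    for name, value in (grouping or {}).items():
        path += f"/{quote(name, safe='')}/{quote(value, safe='')}"
    # The body uses the text format: one "name value" per line, newline-terminated.
    body = "".join(f"{name} {value}\n" for name, value in metrics.items())
    return urllib.request.Request(
        gateway_url.rstrip("/") + path,
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )


def push(request: urllib.request.Request, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A test run could then call `push(build_push_request("http://pushgateway.example:9091", "system_tests", {"tests_failed": 0}))`, with the address replaced by the real one. PUT replaces every metric in the group; a POST to the same path would only replace metrics with matching names.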

Prometheus Black Box

The Blackbox Exporter allows black-box probing of endpoints over HTTP, HTTPS, DNS, TCP, ICMP, and gRPC. It is installed alongside the rest of the Prometheus stack and is used to monitor web services that do not expose metrics (or as a supplement to existing metrics). For example, it can check whether a service returns an HTTP status code 200.
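The sketch below shows how such a probe could be triggered by hand and how the result can be read. It assumes the exporter is reachable at a placeholder address and that a module named `http_2xx` exists in its configuration (this is the name used in the exporter's example config; our deployment may define different modules). The helper names are hypothetical.

```python
import urllib.request
from typing import Optional
from urllib.parse import urlencode


def probe_url(exporter_url: str, target: str, module: str = "http_2xx") -> str:
    """Build the /probe URL that asks the exporter to check `target`."""
    query = urlencode({"target": target, "module": module})
    return f"{exporter_url.rstrip('/')}/probe?{query}"


def parse_metric(exposition_text: str, name: str) -> Optional[float]:
    """Return the value of an unlabeled metric from text-format output."""
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        parts = line.split()
        if len(parts) >= 2 and parts[0] == name:
            return float(parts[1])
    return None


def is_up(exporter_url: str, target: str, timeout: float = 10.0) -> bool:
    """Run one probe and report whether the exporter marked it successful."""
    with urllib.request.urlopen(probe_url(exporter_url, target), timeout=timeout) as resp:
        text = resp.read().decode("utf-8")
    return parse_metric(text, "probe_success") == 1.0
```

In normal operation Prometheus performs these requests itself on every scrape; calling the endpoint manually is mainly useful for debugging a probe that fails unexpectedly.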