Inspect Prometheus Metrics

To inspect Prometheus metrics, use the @prometheus_metrics decorator.

To use the default metrics file benchmark_runner.common.prometheus.metrics-default.yaml, apply the decorator without arguments: [This collects the default metrics for the run() method]

import time

from benchmark_runner.common.prometheus.prometheus_metrics import prometheus_metrics

# Collect the default metrics for the duration of run()
@prometheus_metrics
def run():
    print('sleep 30 sec')
    time.sleep(30)

run()

To use a custom metrics file, pass its path as an argument to the decorator: [This collects the custom metrics for the run() method]

import os
import time

from benchmark_runner.common.prometheus.prometheus_metrics import prometheus_metrics

current_dir = os.path.dirname(os.path.abspath(__file__))
yaml_full_path = os.path.join(current_dir, 'metrics.yaml')

# Collect the custom metrics defined in metrics.yaml for the duration of run()
@prometheus_metrics(yaml_full_path=yaml_full_path)
def run():
    print('sleep 30 sec')
    time.sleep(30)

run()
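
Conceptually, the decorator wraps the function and records the time window of its execution so the configured Prometheus queries can be evaluated over exactly that range. A minimal, hypothetical sketch of that idea (the collect_window name and body are illustrative only, not the actual benchmark_runner implementation):

import functools
import time
from datetime import datetime, timezone

def collect_window(func):
    """Illustrative only: record the UTC time window of the wrapped call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = datetime.now(timezone.utc)
        try:
            return func(*args, **kwargs)
        finally:
            end = datetime.now(timezone.utc)
            # A real decorator would run each configured Prometheus query
            # against the Prometheus HTTP API for the [start, end] window here.
            print(f'metrics window: {start.isoformat()} .. {end.isoformat()}')
    return wrapper

@collect_window
def run():
    print('sleep 30 sec')
    time.sleep(30)

run()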

Inspect Prometheus Snapshots

The CI jobs store snapshots of the Prometheus database for each run as part of the artifacts.
Download run artifacts from S3:

$ wget https://myuser:mypassword@run-artifacts-perfci.com
$ tar -xvf $(pwd)/hammerdb-vm-mariadb-2022-01-04-08-21-23.tar.gz

Within the artifact directory is a Prometheus snapshot directory named:

promdb-YYYY_MM_DDTHH_mm_ss+0000_YYYY_MM_DDTHH_mm_ss+0000.tar

The timestamps are the start and end of the metrics capture and are stored in UTC (+0000). You can run a containerized Prometheus against the snapshot to inspect the metrics; note that Prometheus requires write access to its database, so it will actually write to the snapshot. For example, if you have downloaded artifacts for a run named hammerdb-vm-mariadb-2022-01-04-08-21-23 and the Prometheus snapshot within is named promdb_2022_01_04T08_21_52+0000_2022_01_04T08_45_47+0000, you could run as follows:

$ tar -xvf $(pwd)/hammerdb-vm-mariadb-2022-01-04-08-21-23/promdb_2022_01_04T08_21_52+0000_2022_01_04T08_45_47+0000.tar
$ local_prometheus_snapshot=$(pwd)/hammerdb-vm-mariadb-2022-01-04-08-21-23/promdb_2022_01_04T08_21_52+0000_2022_01_04T08_45_47+0000
$ chmod -R g-s,a+rw "$local_prometheus_snapshot"
$ sudo podman pod create --name prometheus_grafana_pod -p 9090:9090 -p 3000:3000
$ sudo podman run --name grafana --pod prometheus_grafana_pod -d grafana/grafana
$ sudo podman run --name prometheus --pod prometheus_grafana_pod -uroot -v "$local_prometheus_snapshot":/prometheus --privileged prom/prometheus --config.file=/etc/prometheus/prometheus.yml --storage.tsdb.path=/prometheus --storage.tsdb.retention.time=100000d --storage.tsdb.retention.size=1000PB

To remove the pod:
$ sudo podman pod rm -f prometheus_grafana_pod

1. Open http://localhost:9090/ and run queries against it directly
2. Open http://localhost:3000/, add a Prometheus data source, and run queries through Grafana
3. Grafana: log in with admin/admin, create a Prometheus data source pointing at http://localhost:9090, create a panel, select the time range and run the query shown below
** Important: select the exact time range of the snapshot, e.g. 2022_01_04T08_21_52+0000_2022_01_04T08_45_47+0000
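
Step 3 can also be scripted: the Prometheus data source can be created through Grafana's HTTP API instead of the UI. A sketch, assuming the default admin/admin credentials and the port mappings used above:

import base64
import json
import urllib.request

# Create a Prometheus data source in Grafana via its HTTP API
# (default admin/admin login and the ports mapped above are assumed).
payload = json.dumps({
    'name': 'Prometheus',
    'type': 'prometheus',
    'url': 'http://localhost:9090',
    'access': 'proxy',
}).encode()

request = urllib.request.Request(
    'http://localhost:3000/api/datasources',
    data=payload,
    headers={
        'Content-Type': 'application/json',
        'Authorization': 'Basic ' + base64.b64encode(b'admin:admin').decode(),
    },
)
with urllib.request.urlopen(request) as response:
    print(response.read().decode())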

Example query:

sum(irate(node_cpu_seconds_total[2m])) by (mode,instance) > 0

OpenShift example queries


It is important to use the `--storage.tsdb.retention.time` option to Prometheus, as otherwise Prometheus may discard the data in the snapshot. Note also that you must set the time bounds of the Prometheus query to fit the start and end times recorded in the name of the promdb snapshot.
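
For example, the query time bounds can be taken straight from the snapshot name and used in a range query against the Prometheus HTTP API. A sketch, assuming the containerized Prometheus above is listening on localhost:9090 (the 30s step is an arbitrary choice):

import json
import urllib.parse
import urllib.request
from datetime import datetime

# Snapshot name from the example above; the embedded timestamps are UTC (+0000).
snapshot = 'promdb_2022_01_04T08_21_52+0000_2022_01_04T08_45_47+0000'
start_str, end_str = snapshot[len('promdb_'):].split('+0000_')
fmt = '%Y_%m_%dT%H_%M_%S%z'
start = datetime.strptime(start_str + '+0000', fmt)
end = datetime.strptime(end_str, fmt)

# Range query bounded to exactly the snapshot window.
params = urllib.parse.urlencode({
    'query': 'sum(irate(node_cpu_seconds_total[2m])) by (mode,instance) > 0',
    'start': start.timestamp(),
    'end': end.timestamp(),
    'step': '30s',
})
with urllib.request.urlopen(f'http://localhost:9090/api/v1/query_range?{params}') as response:
    print(json.dumps(json.load(response), indent=2))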