15. Monitoring and Alerting

15.1. Internal Monitoring
15.1.1. System monitoring via Telegraf
15.1.2. NGCP-specific monitoring via ngcp-witnessd
15.1.3. Monitoring data in InfluxDB
15.2. Monitoring data in Redis
15.3. Statistics Dashboard

15.1. Internal Monitoring

15.1.1. System monitoring via Telegraf

The platform uses the internal telegraf service to monitor many aspects of the system, including CPU, memory, swap, disk, filesystem, network, processes, NTP, Nginx, Redis and MySQL.

The gathered information is stored in InfluxDB, in the telegraf database.

15.1.2. NGCP-specific monitoring via ngcp-witnessd

The platform uses the internal ngcp-witnessd service to monitor NGCP-specific metrics or system metrics currently not tracked by telegraf, including memory, process count, Heartbeat, MTA, Kamailio, SIP and MySQL.

The gathered information is stored in InfluxDB, in the ngcp database.

15.1.3. Monitoring data in InfluxDB

The platform uses InfluxDB as a time series database, to store most of the metrics collected in the system.

The monitoring data is used by various components of the platform, including ngcp-collective-check, ngcp-snmp-agent and by the statistics dashboard powered by Grafana.

The monitoring data can also be accessed directly in several ways: with the influx command-line tool in CLI or TUI mode; with the ngcp-influxdb-extract wrapper, which provides two convenience commands, one to run arbitrary queries and one to fetch the last value of a measurement’s field; or through the HTTP API, using curl (or another HTTP client) or the Sipwise::InfluxDB::HTTP Perl module.
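As a rough illustration of the HTTP API route, the following Python sketch builds a request for the InfluxDB 1.x /query endpoint and flattens the JSON reply into rows. The base URL (localhost, port 8086), the absence of authentication, and the helper names build_query_url, rows_from_response and run_query are assumptions made for this example; they are not taken from the platform and may need adjusting to the local setup.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: InfluxDB 1.x listening on its default port on the local host,
# without authentication. Adjust to the actual deployment.
INFLUX_URL = "http://localhost:8086"


def build_query_url(database, query, base_url=INFLUX_URL):
    """Return the /query URL for an InfluxQL statement against a database."""
    params = urlencode({"db": database, "q": query})
    return f"{base_url}/query?{params}"


def rows_from_response(payload):
    """Flatten a decoded InfluxDB 1.x JSON response into a list of dicts."""
    rows = []
    for result in payload.get("results", []):
        # A statement that matches nothing has no "series" key at all.
        for series in result.get("series", []):
            columns = series["columns"]
            for values in series.get("values", []):
                rows.append(dict(zip(columns, values)))
    return rows


def run_query(database, query, base_url=INFLUX_URL):
    """Send the query and return the rows (requires a reachable server)."""
    with urlopen(build_query_url(database, query, base_url)) as response:
        return rows_from_response(json.load(response))
```

On a host where the endpoint is reachable, run_query("telegraf", "SHOW MEASUREMENTS") would return one row per measurement in the telegraf database. The URL building and response parsing are kept in separate functions so that they can be checked without a running server.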

tip

See https://docs.influxdata.com/influxdb/v1.1/query_language/spec/ for information about InfluxQL, the query language used by InfluxDB.

tip

To get the list of all measurements for a specific database, use the following query: SHOW MEASUREMENTS.

tip

To get the list of fields for a specific measurement, use the following query: SELECT LAST(*) FROM "measurement".
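The field names can be read back from the column names of such a reply. The sketch below assumes the InfluxDB 1.x convention of naming wildcard-function columns last_<field>, and takes one decoded series object from the JSON response as input. The helper name field_names and the sample column names are illustrative only. InfluxQL also provides SHOW FIELD KEYS FROM "measurement", which lists the fields without fetching values.

```python
def field_names(series):
    """Return the field names found in one series of a SELECT LAST(*) reply.

    Assumption: columns produced by LAST(*) are named "last_<field>";
    other columns, such as "time", are skipped.
    """
    prefix = "last_"
    return [
        column[len(prefix):]
        for column in series["columns"]
        if column.startswith(prefix)
    ]


# Hypothetical series, shaped like an entry of results[0]["series"].
example_series = {
    "name": "cpu",
    "columns": ["time", "last_usage_idle", "last_usage_user"],
    "values": [["1970-01-01T00:00:00Z", 97.5, 1.2]],
}
fields = field_names(example_series)
```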

tip

To get the list of tags for a specific measurement, use the following query: SHOW TAG KEYS FROM "measurement". To get all the current values of a given tag, use: SHOW TAG VALUES FROM "measurement" WITH KEY = "tag".
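When these exploration queries are generated from a script, the measurement and tag names need to be quoted as InfluxQL identifiers. The following Python sketch shows one possible set of helpers. The function names are hypothetical, and only the documented \" escape for double quotes inside identifiers is handled.

```python
def quote_identifier(name):
    """Double-quote an InfluxQL identifier, escaping embedded double quotes.

    Assumption: names contain no backslashes or newlines.
    """
    return '"' + name.replace('"', '\\"') + '"'


def show_measurements():
    """Query listing all measurements of the selected database."""
    return "SHOW MEASUREMENTS"


def last_values(measurement):
    """Query returning the last value of every field of a measurement."""
    return f"SELECT LAST(*) FROM {quote_identifier(measurement)}"


def tag_keys(measurement):
    """Query listing the tag keys of a measurement."""
    return f"SHOW TAG KEYS FROM {quote_identifier(measurement)}"


def tag_values(measurement, tag):
    """Query listing the current values of one tag of a measurement."""
    return (
        f"SHOW TAG VALUES FROM {quote_identifier(measurement)} "
        f"WITH KEY = {quote_identifier(tag)}"
    )
```

For example, tag_values("cpu", "host") yields SHOW TAG VALUES FROM "cpu" WITH KEY = "host", where cpu and host are placeholder names chosen for illustration.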

15.2. Monitoring data in Redis

The platform stores some of the monitoring data in Redis, mostly for historical reasons: the previously used RRD files could not store anything other than numbers.

See Section 2.1, “Redis monitoring keys” for detailed information about the list of data currently stored in the Redis monitoring database.

info

These keys are being phased out, and will be moved to InfluxDB.

15.3. Statistics Dashboard

The platform’s administration interface (described in Section 5, “VoIP Service Configuration Scenario”) provides a Grafana-based graphical overview of the most important system health indicators, such as memory usage, load averages and disk usage. VoIP statistics, such as the number of concurrent active calls and the number of provisioned and registered subscribers, are also shown.