17. Monitoring and Alerting

17.1. Internal Monitoring

17.1.1. System monitoring via Telegraf

The platform uses the internal telegraf service to monitor many aspects of the system, including CPU, memory, swap, disk, filesystem, network, processes, NTP, Nginx, Redis and MySQL.

The gathered information is stored in InfluxDB, in the telegraf database.

17.1.2. Sipwise C5 specific monitoring via ngcp-witnessd

The platform uses the internal ngcp-witnessd service to monitor Sipwise C5-specific metrics and system metrics not currently tracked by telegraf, including memory, process count, Heartbeat, MTA, Kamailio, SIP and MySQL.

The gathered information is stored in InfluxDB, in the ngcp database.

17.1.3. Monitoring data in InfluxDB

The platform uses InfluxDB as a time series database, to store most of the metrics collected in the system.

The monitoring data is used by various components of the platform, including ngcp-collective-check, ngcp-snmp-agent and by the statistics dashboard powered by Grafana.

The monitoring data can also be accessed directly in several ways: with the influx command-line tool, in CLI or TUI mode; with the ngcp-influxdb-extract wrapper, which provides two convenience commands, one to run arbitrary queries and one to fetch the last value of a measurement’s field; or through the HTTP API, using curl (or another HTTP client) or the Sipwise::InfluxDB::HTTP Perl module.

See https://docs.influxdata.com/influxdb/v1.1/query_language/spec/ for information about InfluxQL, the query language used by InfluxDB.
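As an illustration of direct access over the HTTP API, the following is a minimal Python sketch that uses only the standard library. It assumes InfluxDB 1.x is reachable on its default HTTP port at localhost:8086 without authentication; the helper names (build_query_url, rows_from_response, run_query) are hypothetical and not part of the platform, and the "cpu" measurement in the usage comment is likewise an assumption.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: InfluxDB listens on its default HTTP port on the local host.
INFLUXDB_URL = "http://localhost:8086"


def build_query_url(database, statement, base_url=INFLUXDB_URL):
    """Return the URL of an InfluxDB 1.x /query request.

    "epoch=s" asks the server for timestamps as Unix seconds
    instead of RFC 3339 strings.
    """
    params = urlencode({"db": database, "q": statement, "epoch": "s"})
    return f"{base_url}/query?{params}"


def rows_from_response(payload):
    """Flatten a decoded InfluxDB 1.x JSON response into a list of dicts."""
    rows = []
    for result in payload.get("results", []):
        for series in result.get("series", []):
            columns = series["columns"]
            for values in series.get("values", []):
                row = dict(zip(columns, values))
                row["measurement"] = series.get("name")
                rows.append(row)
    return rows


def run_query(database, statement, base_url=INFLUXDB_URL, timeout=10):
    """Send the query over HTTP and return the flattened rows."""
    url = build_query_url(database, statement, base_url)
    with urlopen(url, timeout=timeout) as response:
        return rows_from_response(json.load(response))


# Usage (needs a reachable InfluxDB; "cpu" is an assumed measurement name):
#   for row in run_query("telegraf", 'SELECT LAST(*) FROM "cpu"'):
#       print(row)
```

Keeping URL construction and response flattening separate from the network call makes both parts easy to check without a running database.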

Tip: To get the list of all measurements for a specific database, use the query SHOW MEASUREMENTS.

Tip: To get the list of fields for a specific measurement, use the query SELECT LAST(*) FROM "measurement".

Tip: To get the list of tags for a specific measurement, use the query SHOW TAG KEYS FROM "measurement". To get all the current values of a tag, use SHOW TAG VALUES FROM "measurement" WITH KEY = "tag".
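The discovery queries above can also be assembled programmatically. The sketch below builds the same four InfluxQL statements in Python and double-quotes the measurement and tag names. The function names are hypothetical, and the escaping rule (a backslash before embedded backslashes and double quotes) is an assumption based on the InfluxQL conventions for quoted identifiers.

```python
def quote_identifier(name):
    """Double-quote an InfluxQL identifier, escaping backslashes and quotes."""
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def show_measurements():
    """Statement listing all measurements of the selected database."""
    return "SHOW MEASUREMENTS"


def list_fields(measurement):
    """Statement returning the last value of every field of a measurement."""
    return f"SELECT LAST(*) FROM {quote_identifier(measurement)}"


def show_tag_keys(measurement):
    """Statement listing the tag keys of a measurement."""
    return f"SHOW TAG KEYS FROM {quote_identifier(measurement)}"


def show_tag_values(measurement, tag):
    """Statement listing the current values of one tag of a measurement."""
    return (
        f"SHOW TAG VALUES FROM {quote_identifier(measurement)} "
        f"WITH KEY = {quote_identifier(tag)}"
    )


# Example with assumed measurement and tag names:
#   show_tag_values("cpu", "host")
```

The resulting strings can be passed to the influx command-line tool or sent through the HTTP API.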

See Section 2.1, “InfluxDB monitoring keys” for detailed information about the list of data currently stored in the InfluxDB ngcp monitoring database.

17.2. Statistics Dashboard

The platform’s administration interface (described in Section 6, “VoIP Service Configuration Scenario”) provides a Grafana-based graphical overview of the most important system health indicators, such as memory usage, load averages and disk usage. VoIP statistics, such as the number of concurrent active calls and the number of provisioned and registered subscribers, are also shown.