Overview Cluster Monitoring
Monitoring services are managed using Services > Monitoring in EXAoperation. You can create and configure monitoring services, and you can set the warning and error threshold values for disk usage, swap usage and load.
EXACluster Monitoring Services
You can create log services that include cluster events such as:
- EXAoperation events
- Database events
- EXAStorage events
- Load on cluster nodes
- Cluster process events
- Authentication events
You can also forward log service messages to other monitoring systems.
The following table describes the properties of a log service.
Property | Description |
---|---|
Minimum Log Priority | The minimum syslog severity that is included in the log service. The log service reports only events at or above the selected severity. For example, if you select 'Warning', only warning and error events are shown, and information and notice events are not. |
EXAClusterOS Services | The EXAClusterOS services (for example, LOAD) that should be reported on by the log service. |
Database Systems | The databases that should be reported on by the log service. |
Remote Syslog Server | The IP address or DNS name of the remote server if you forward the log service to an external server. |
Remote Syslog Protocol | The network protocol to use if you forward the log service to an external server. |
Default Time Interval | The default period of time for which the log service displays events. Enter a value followed by a unit of time, for example '1d', '2h', '15m', or '100s'. |
Description | A description of the log service. |
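The effect of the Minimum Log Priority property can be sketched in a few lines of Python. This is only an illustration, not EXAoperation code: the function names are hypothetical, the numeric codes are the standard syslog severity values (lower means more severe), and only the four severities named in the table above are included, although EXAoperation may offer others.

```python
# Standard syslog severity codes; a lower number means a more severe event.
# Only the severities mentioned in this section are listed (assumption).
SEVERITY = {"error": 3, "warning": 4, "notice": 5, "information": 6}


def passes_minimum_priority(event_severity: str, minimum: str) -> bool:
    """Return True if the event is at least as severe as the minimum."""
    return SEVERITY[event_severity.lower()] <= SEVERITY[minimum.lower()]


def filter_events(events, minimum):
    """Keep only the (severity, message) pairs that meet the minimum priority."""
    return [event for event in events if passes_minimum_priority(event[0], minimum)]


# Placeholder events, invented for this example.
sample_events = [
    ("Information", "placeholder information message"),
    ("Notice", "placeholder notice message"),
    ("Warning", "placeholder warning message"),
    ("Error", "placeholder error message"),
]

for severity, message in filter_events(sample_events, "Warning"):
    print(severity, message)
```

With a minimum of 'Warning', only the warning and error placeholders are printed, which matches the behavior described for the property.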
The best practice is to have at least two log services with a minimum log priority of Information: one that includes LOAD events and one that excludes them. The volume of LOAD events can otherwise obscure reporting from other events. A possible setup for the two log services is:
INFORMATION (WITH LOAD)
- Minimum Log Priority: Information
- EXAClusterOS Services: ALL
- Database Systems: ALL
- Default Time Interval: 10m
- Description: Information ALL
INFORMATION (WITHOUT LOAD)
- Minimum Log Priority: Information
- EXAClusterOS Services: ALL except LOAD
- Database Systems: ALL
- Default Time Interval: 10m
- Description: Information ALL
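As a rough illustration of the two setups above, the following Python sketch models them as plain dictionaries and converts the Default Time Interval string into seconds. The dictionary keys, the `LOG_SERVICES` name, and both helper functions are hypothetical and are not part of EXAoperation. The accepted units (`s`, `m`, `h`, `d`) are taken from the examples given for the Default Time Interval property.

```python
import re

# Seconds per unit, based on the units shown in the property examples.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def interval_to_seconds(text: str) -> int:
    """Convert an interval such as '10m' or '1d' into seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", text.strip())
    if match is None:
        raise ValueError(f"unsupported interval: {text!r}")
    value, unit = match.groups()
    return int(value) * _UNIT_SECONDS[unit]


# Hypothetical in-memory description of the two recommended log services.
LOG_SERVICES = {
    "with_load": {
        "min_priority": "Information",
        "excluded_services": set(),
        "default_interval": "10m",
    },
    "without_load": {
        "min_priority": "Information",
        "excluded_services": {"LOAD"},
        "default_interval": "10m",
    },
}


def includes_service(log_service: dict, service: str) -> bool:
    """Return True if events from the given service appear in the log service."""
    return service.upper() not in log_service["excluded_services"]


for name, config in LOG_SERVICES.items():
    seconds = interval_to_seconds(config["default_interval"])
    print(name, seconds, includes_service(config, "LOAD"))
```

The only difference between the two entries is the excluded LOAD service, which mirrors the recommendation to keep one view with LOAD events and one without.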
Threshold Values
You can set error and warning threshold values for the following:
Threshold | Description |
---|---|
Disk usage | The warning and error threshold values for disk usage, expressed as a percentage. Disk usage thresholds apply only to filesystems such as OS and DATA, and not to EXAStorage. |
Swap usage | The warning and error threshold values for swap usage. |
Load | The warning and error threshold values for CPU load in the cluster. A good starting point for the load warning threshold is the number of threads per data node multiplied by 1.5. For example, if each data node has 2 sockets with 6 cores each and hyperthreading (2 threads per core), the node has 2 x 6 x 2 = 24 threads, and the warning threshold is 24 x 1.5 = 36. |
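The starting-point calculation for the load warning threshold can be written out as a small Python sketch. The function names are hypothetical, and the example assumes that hyperthreading provides two hardware threads per core. The source gives a rule of thumb only for the warning threshold, so no error threshold is computed here.

```python
def threads_per_node(sockets: int, cores_per_socket: int, threads_per_core: int = 1) -> int:
    """Number of hardware threads on one data node."""
    return sockets * cores_per_socket * threads_per_core


def load_warning_threshold(threads: int, factor: float = 1.5) -> float:
    """Rule of thumb: threads per data node multiplied by 1.5."""
    return threads * factor


# 2 sockets, 6 cores per socket, hyperthreading assumed to give 2 threads per core.
threads = threads_per_node(sockets=2, cores_per_socket=6, threads_per_core=2)
print(threads, load_warning_threshold(threads))  # prints: 24 36.0
```

Treat the result as an initial value and adjust it after observing the actual load on the cluster.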
Service States
Service states provide an overview of the cluster services.
During the installation process, all services will be shown as 'OK' except for Storage. This is because EXAStorage does not start automatically, and is not yet configured.
Although the time is automatically synchronized across the cluster at regular intervals, you can force a synchronization manually using the Synchronize Time button. For details, see Synchronize Time with NTP Server (optional).