45.1. Pre-Configuring Alerts
To work with alerts, you must first pre-configure them in the ppem-manager.yml manager configuration file.
You can specify the following parameters:
alerts:
  metrics:
    request_chunk_size: number_of_instance_IDs
  cleanup_grace_period: alert_cleanup_interval_if_no_data_is_received
  scheduler:
    interval: interval_for_checking_new_alerts
    initial_delay: delay_for_starting_alert_scheduler
    timeout: timeout_for_updating_alert_trigger_rules
  delayed_data:
    is_enabled: true or false
    data_delay: default_data_arrival_delay_for_all_sources
    datasource_delays:
      metrics: delay_for_metrics_arrival
      logs: delay_for_log_arrival
    max_delay: maximum_allowed_data_arrival_delay
    is_adaptive_delay: true or false
  notifier:
    num_workers: number_of_concurrent_workers
    worker_batch_size: number_of_alerts_in_one_batch
    worker_interval: interval_for_checking_new_alerts
    backoff_base: exponential_backoff_calculation_duration
    max_retries: maximum_number_of_alert_attempts
    notification_timeout: alert_timeout
    janitor_interval: janitor_worker_polling_interval
    stale_processing_timeout: stale_alert_processing_timeout
  email:
    is_enabled: true or false
    smtp:
      host: SMTP_server_hostname_or_IP
      port: SMTP_server_port
      username: username_for_SMTP_server_authentication
      password: password_for_SMTP_server_authentication
      from: alert_sender_email
      timeout: SMTP_server_connection_timeout
      use_starttls: true or false
      use_ssl: true or false
      tls:
        insecure_skip_verify: true or false
        root_ca_path: path_to_root_CA
Where:
metrics: The parameters for sending requests to the metrics plugin.
  request_chunk_size: The maximum number of instance IDs within one request. Default value: 100.
cleanup_grace_period: The interval after which alerts are cleaned up if no data is received. Default value: 6h.
scheduler: The parameters of the scheduler that updates alerts in the manager memory.
  interval: The interval at which the scheduler checks for new alerts to process. Default value: 50s.
  initial_delay: The delay before the scheduler starts for the first time after PPEM starts. Default value: 10s.
  timeout: The scheduler timeout for updating alert trigger rules. Default value: 10m.
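As an illustration of the parameters above, the following fragment shortens the scheduler check interval and extends the cleanup period. It is a sketch, not a recommended configuration: the values are arbitrary, and the nesting (in particular, placing cleanup_grace_period directly under alerts rather than under metrics) is inferred from the parameter descriptions and should be checked against your ppem-manager.yml.

```yaml
# Illustrative values only; nesting is an assumption inferred from the descriptions.
alerts:
  metrics:
    request_chunk_size: 50       # fewer instance IDs per request (default: 100)
  cleanup_grace_period: 12h      # keep alerts longer when no data arrives (default: 6h)
  scheduler:
    interval: 30s                # check for new alerts more often (default: 50s)
    initial_delay: 10s           # default
    timeout: 10m                 # default
```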
delayed_data: The parameters for managing delayed metrics and logs whose delay time is unknown.
  is_enabled: Specifies whether delayed data management is enabled. Possible values: true, false. If true is specified, PPEM checks for delayed metrics and logs. Default value: false.
  data_delay: The default data delay for all data sources for which no specific delay is configured. Default value: 180s.
  datasource_delays: The data delays for specific data sources. This parameter allows specifying different delays for metrics and logs, as they may arrive at different rates. Possible values:
    metrics: The delay for metrics arrival, in seconds. Metrics typically have more consistent collection intervals but may be delayed by network or processing issues.
    logs: The delay for log arrival, in seconds. Logs may arrive more frequently but with more variable timing because of log rotation and processing.
  max_delay: The maximum allowed delay. Data older than this delay is ignored to prevent false alerts caused by stale data. Default value: 600s.
  is_adaptive_delay: Enables or disables adaptive delay learning based on observed data arrival patterns. Possible values: true, false. When enabled, PPEM learns the actual delays from data timestamps and adjusts the lookback window dynamically. Default value: true.
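The delayed data parameters could be combined as in the fragment below, which enables delayed data handling with separate delays for metrics and logs and turns adaptive learning off so that only the fixed delays apply. The values are illustrative. Writing the datasource_delays values in the same duration notation as data_delay (for example, 120s) is an assumption; the descriptions only state that these delays are in seconds.

```yaml
# Illustrative values only; the duration notation under datasource_delays is an assumption.
alerts:
  delayed_data:
    is_enabled: true             # default: false
    data_delay: 180s             # fallback for sources without a specific delay
    datasource_delays:
      metrics: 120s              # hypothetical value
      logs: 300s                 # hypothetical value
    max_delay: 600s              # data older than this is ignored
    is_adaptive_delay: false     # default: true
```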
notifier: The parameters of the notifier that sends alerts.
  num_workers: The number of concurrent workers that send alerts. Default value: 5.
  worker_batch_size: The number of alerts processed by workers in one batch. Default value: 20.
  worker_interval: The polling interval at which workers check for new alerts in the repository database. Default value: 30s.
  backoff_base: The base duration for the exponential backoff calculation when resending a failed alert. The delay before resending the alert is calculated as backoff_base × (2^number_of_retry_attempts). Default value: 10s.
  max_retries: The maximum number of attempts to resend a failed alert. Default value: 3.
  notification_timeout: The maximum amount of time that the notifier waits for an alert to be sent before considering it failed. Default value: 20s.
  janitor_interval: The polling interval of the janitor worker that cleans up alerts stuck in the processing state. Default value: 1m.
  stale_processing_timeout: The amount of time after which alerts stuck in the processing state are considered stale and are reset by the janitor worker. Default value: 10m.
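To make the backoff formula concrete, the fragment below lists the notifier parameters with their default values and works out the resulting retry delays in comments. How PPEM numbers retry attempts is not stated here, so both readings are shown; treat the figures as arithmetic on the formula, not as measured behavior.

```yaml
# Default values, with the retry delays they imply.
alerts:
  notifier:
    num_workers: 5
    worker_batch_size: 20
    worker_interval: 30s
    backoff_base: 10s
    max_retries: 3
    # Retry delay = backoff_base x 2^n, where n is the retry attempt number.
    # Assuming n starts at 1 (an assumption), the three retries wait:
    #   retry 1: 10s x 2 = 20s
    #   retry 2: 10s x 4 = 40s
    #   retry 3: 10s x 8 = 80s
    # If n starts at 0 instead, the delays would be 10s, 20s, and 40s.
    notification_timeout: 20s
    janitor_interval: 1m
    stale_processing_timeout: 10m
```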
email: The parameters for sending alerts via email.
  is_enabled: Specifies whether alerts are sent via email. Possible values: true, false. If false is specified, alerts are logged instead of being sent via email. Default value: false.
  smtp: The parameters of the SMTP server used for sending alerts.
    host: The hostname or IP address of the SMTP server. Default value: localhost.
    port: The port number of the SMTP server. Default value: 25.
    username: The username for authenticating with the SMTP server. Default value: "".
    password: The password for authenticating with the SMTP server. Default value: "".
    from: The email address of the alert sender. Default value: admin@localdomain.local.
    timeout: The SMTP server connection timeout. Default value: 10s.
    use_starttls: Specifies whether the STARTTLS extension is used to secure the SMTP server connection. Possible values: true, false. Default value: false.
    use_ssl: Specifies whether the SSL/TLS protocol is used for the SMTP server connection. Possible values: true, false. Default value: false.
    tls: The TLS protocol parameters.
      insecure_skip_verify: Specifies whether the client skips verification of the certificate chain and hostname of the SMTP server. Possible values: true, false. Default value: false.
        Important: Setting this parameter to true is a security risk. Use it only for testing purposes or on trusted networks.
      root_ca_path: The path to the CA certificate used for verifying the certificate of the SMTP server. Default value: "".
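A possible email configuration that sends alerts through an authenticated SMTP server over STARTTLS might look as follows. The hostname, account, sender address, and certificate path are hypothetical placeholders, and port 587 is the conventional mail submission port rather than a PPEM requirement. The nesting of tls under smtp is inferred from the parameter descriptions.

```yaml
# Hypothetical server, credentials, and paths; replace with your own.
alerts:
  email:
    is_enabled: true
    smtp:
      host: smtp.example.com
      port: 587                        # conventional submission port for STARTTLS
      username: ppem-alerts
      password: "change-me"
      from: ppem-alerts@example.com
      timeout: 10s
      use_starttls: true
      use_ssl: false
      tls:
        insecure_skip_verify: false    # keep certificate verification enabled
        root_ca_path: /etc/ssl/certs/example-root-ca.pem
```

With is_enabled left at false, the same alerts are written to the log instead of being sent, which can be a convenient way to check alert rules before configuring SMTP.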