45.1. Pre-Configuring Alerts

Before you can work with alerts, configure them in the ppem-manager.yml manager configuration file.

You can specify the following parameters:

alerts:
  metrics:
    request_chunk_size: number_of_instance_IDs
  cleanup_grace_period: alert_cleanup_interval_if_no_data_is_received
  scheduler:
    interval: interval_for_checking_new_alerts
    initial_delay: delay_for_starting_alert_scheduler
    timeout: timeout_for_updating_alert_trigger_rules
  delayed_data:
    is_enabled: true or false
    data_delay: default_data_arrival_delay_for_all_sources
    datasource_delays:
      metrics: delay_for_metrics_arrival
      logs: delay_for_log_arrival
    max_delay: maximum_allowed_data_arrival_delay
    is_adaptive_delay: true or false
  notifier:
    num_workers: number_of_concurrent_workers
    worker_batch_size: number_of_alerts_in_one_batch
    worker_interval: interval_for_checking_new_alerts
    backoff_base: exponential_backoff_calculation_duration
    max_retries: maximum_number_of_alert_attempts
    notification_timeout: alert_timeout
    janitor_interval: janitor_worker_polling_interval
    stale_processing_timeout: stale_alert_processing_timeout
  email:
    is_enabled: true or false
    smtp:
      host: SMTP_server_hostname_or_IP
      port: SMTP_server_port
      username: username_for_SMTP_server_authentication
      password: password_for_SMTP_server_authentication
      from: alert_sender_email
      timeout: SMTP_server_connection_timeout
      use_starttls: true or false
      use_ssl: true or false
      tls:
        insecure_skip_verify: true or false
        root_ca_path: path_to_root_CA

Where:

  • metrics: The parameters for sending requests to the metrics plugin.

    • request_chunk_size: The maximum number of instance IDs within one request.

      Default value: 100.

  • cleanup_grace_period: The interval after which alerts are cleaned up if no data is received.

    Default value: 6h.

  • scheduler: The parameters of the scheduler that updates alerts in the manager memory.

    • interval: The interval for the scheduler to check for new alerts to process.

      Default value: 50s.

    • initial_delay: The delay before the scheduler is first started after PPEM starts.

      Default value: 10s.

    • timeout: The scheduler timeout for updating alert trigger rules.

      Default value: 10m.

  • delayed_data: The parameters for managing delayed metrics and logs that arrive with an unknown delay.

    • is_enabled: Specifies whether delayed data management is enabled.

      Possible values:

      • true

      • false

      If true is specified, PPEM checks for delayed metrics and logs.

      Default value: false.

    • data_delay: The default data delay for all data sources when specific delays are not configured.

      Default value: 180s.

    • datasource_delays: The data delays for specific data sources. Use this parameter to set different delays for metrics and logs, which may arrive at different rates.

      Nested parameters:

      • metrics: The delay for metrics arrival, in seconds. Metrics typically have more consistent collection intervals but may be delayed due to network or processing issues.

      • logs: The delay for log arrival, in seconds. Logs may arrive more frequently but with higher variability in timing due to log rotation and processing.

    • max_delay: The maximum allowed data delay. Data older than this value is ignored to prevent false alerts from stale data.

      Default value: 600s.

    • is_adaptive_delay: Enables or disables the adaptive delay learning based on observed data arrival patterns.

      Possible values:

      • true

      • false

      When enabled, PPEM learns the actual delays from data timestamps and adjusts the lookback window dynamically.

      Default value: true.

  • notifier: The parameters of the notifier that sends alerts.

    • num_workers: The number of concurrent workers that send alerts.

      Default value: 5.

    • worker_batch_size: The number of alerts processed by workers in one batch.

      Default value: 20.

    • worker_interval: The polling interval for workers to check for new alerts in the repository database.

      Default value: 30s.

    • backoff_base: The base duration for the exponential backoff calculation when resending a failed alert.

      The delay for resending the alert is calculated as:

      backoff_base × 2^number_of_retry_attempts

      Default value: 10s.

    • max_retries: The maximum number of attempts to resend a failed alert.

      Default value: 3.

    • notification_timeout: The maximum amount of time for the notifier to wait for an alert to be sent before considering it failed.

      Default value: 20s.

    • janitor_interval: The polling interval for the janitor worker that cleans up alerts stuck in the processing state.

      Default value: 1m.

    • stale_processing_timeout: The amount of time after which alerts stuck in the processing state are considered stale and are reset by the janitor worker.

      Default value: 10m.

  • email: The parameters for sending alerts via email.

    • is_enabled: Specifies whether alerts are sent via email.

      Possible values:

      • true

      • false

      If false is specified, alerts are logged instead of being sent via email.

      Default value: false.

    • smtp: The parameters of the SMTP server used for sending alerts.

      • host: The hostname or IP address of the SMTP server.

        Default value: localhost.

      • port: The port number of the SMTP server.

        Default value: 25.

      • username: The username for authenticating with the SMTP server.

        Default value: "".

      • password: The password for authenticating with the SMTP server.

        Default value: "".

      • from: The email address of the alert sender.

        Default value: admin@localdomain.local.

      • timeout: The SMTP server connection timeout.

        Default value: 10s.

      • use_starttls: Specifies whether the STARTTLS extension is used for securing the SMTP server connection.

        Possible values:

        • true

        • false

        Default value: false.

      • use_ssl: Specifies whether the SSL/TLS protocol is used for the SMTP server connection.

        Possible values:

        • true

        • false

        Default value: false.

      • tls: The TLS protocol parameters.

        • insecure_skip_verify: Specifies whether the client skips the verification of the certificate chain and hostname of the SMTP server.

          Possible values:

          • true

          • false

          Default value: false.

          Important

          Setting this parameter to true is a security risk. Use it only for testing purposes or on trusted networks.

        • root_ca_path: The path to the CA certificate used for verifying the certificate of the SMTP server.

          Default value: "".
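
The following is a minimal sketch of a filled-in alerts section, not a recommended configuration. It keeps the documented defaults for the metrics, scheduler, and notifier parameters, and enables delayed data management and email delivery over STARTTLS. The SMTP host, port, credentials, sender address, and CA path are hypothetical placeholders chosen for illustration. The datasource_delays block is omitted, so data_delay applies to all data sources.

```yaml
# Illustrative alerts section for ppem-manager.yml.
# Values marked "hypothetical" or "assumed" are not from the PPEM
# documentation; replace them with values for your environment.
alerts:
  metrics:
    request_chunk_size: 100            # documented default
  cleanup_grace_period: 6h             # documented default
  scheduler:
    interval: 50s                      # documented default
    initial_delay: 10s                 # documented default
    timeout: 10m                       # documented default
  delayed_data:
    is_enabled: true                   # default is false
    data_delay: 180s                   # documented default, used for all sources here
    max_delay: 600s                    # documented default
    is_adaptive_delay: true            # documented default
  notifier:
    num_workers: 5                     # documented default
    worker_batch_size: 20              # documented default
    worker_interval: 30s               # documented default
    backoff_base: 10s                  # documented default
    max_retries: 3                     # documented default
    notification_timeout: 20s          # documented default
    janitor_interval: 1m               # documented default
    stale_processing_timeout: 10m      # documented default
  email:
    is_enabled: true                   # default is false (alerts are logged instead)
    smtp:
      host: smtp.example.com           # hypothetical
      port: 587                        # assumed: common submission port used with STARTTLS
      username: ppem-alerts            # hypothetical
      password: change-me              # hypothetical
      from: ppem-alerts@example.com    # hypothetical
      timeout: 10s                     # documented default
      use_starttls: true               # default is false
      use_ssl: false                   # documented default
      tls:
        insecure_skip_verify: false    # documented default; keep verification on
        root_ca_path: /etc/ssl/certs/example-root-ca.pem  # hypothetical
```

With backoff_base: 10s and max_retries: 3 as above, the resend formula gives a delay of 10s × 2^n for retry attempt n. If attempts are counted from 1, the three retries wait 20s, 40s, and 80s. If they are counted from 0, the retries wait 10s, 20s, and 40s. The description above does not state which counting is used.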