Hi,
In production environments, WAL receiver connection attempts to the
primary may fail for many reasons (primary down, broken network,
authentication token changes, primary_conninfo modifications, socket
errors and so on). Although we emit the error message to the server logs,
wouldn't it be useful to also show the last connection error message via
pg_stat_wal_receiver or pg_stat_get_wal_receiver? This would be very
helpful in production environments for analysing WAL receiver
issues, as accessing and sifting through server logs can be quite
cumbersome for end users.
Thoughts?
The attached patch can display last_conn_error only after the WAL
receiver is up, but it would be good to let pg_stat_wal_receiver emit
last_conn_error even before that. Imagine the WAL receiver is
continuously failing on the standby: if we let pg_stat_wal_receiver
report last_conn_error in that case, all other columns will show NULL.
I can change it this way if others are okay with it.
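As a rough illustration of the second option, a minimal standalone sketch
could keep the last error in a fixed-size buffer that stays readable while
no receiver is running, with the other columns reported as NULL. This is
not the attached patch: the struct, the buffer size and all function names
below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical size limit for the stored message; not taken from the patch. */
#define DEMO_MAX_CONN_ERROR_LEN 256

/* Hypothetical stand-in for the WAL receiver's shared state. */
typedef struct DemoWalRcvState
{
    bool receiver_running;  /* false while connection attempts keep failing */
    char last_conn_error[DEMO_MAX_CONN_ERROR_LEN]; /* empty = no error yet */
} DemoWalRcvState;

/* Remember the most recent connection error, truncating it if too long. */
static void
demo_record_conn_error(DemoWalRcvState *state, const char *msg)
{
    assert(state != NULL && msg != NULL);
    strncpy(state->last_conn_error, msg, DEMO_MAX_CONN_ERROR_LEN - 1);
    state->last_conn_error[DEMO_MAX_CONN_ERROR_LEN - 1] = '\0';
}

/*
 * Return the stored error, or NULL if none has been recorded.
 * This does not depend on receiver_running, so it works before the
 * receiver has ever come up.
 */
static const char *
demo_last_conn_error(const DemoWalRcvState *state)
{
    assert(state != NULL);
    return state->last_conn_error[0] != '\0' ? state->last_conn_error : NULL;
}

/* Whether the remaining (non-error) columns should be reported as NULL. */
static bool
demo_other_columns_null(const DemoWalRcvState *state)
{
    assert(state != NULL);
    return !state->receiver_running;
}
```

In this sketch a reader would call demo_last_conn_error() regardless of
receiver state and use demo_other_columns_null() to decide whether to
fill in the rest of the row.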
Regards,
Bharath Rupireddy.