On 11/12/18 4:00 μ.μ., Chris Withers wrote:
> Hi All,
>
> With a 9.4 cluster, what's the best way to find out what's generating the most WAL?
>
> I'm looking after a multi-tenant PG 9.4 cluster, and we've started getting alerts for the number of WALs on the
server.
> It'd be great to understand what's generating all that WAL and what's likely to be causing any problems.\
>
One way is to keep snapshots of pg_stat_user_tables and then try to identify spikes based on the various _tup fields.
Another way is to take a look in your archive (where you keep your archived wals), try to identify a period where
excessivewals were generated and then use
https://www.postgresql.org/docs/11/pgwaldump.html to see what's in there.
> More generally, what's number of WALs is "too much"? check_postgres.pl when used in nagios format only appears to be
ableto alert on absolute thresholds, does this always make sense? What's a good
> threshold to alert on?
>
Regarding you wals in pg_wal, a good threshold could be anything more than a e.g. 10% increase from wal_keep_segments
witha trend to go up. If this number goes up chances are something bad is happening.
> cheers,
>
> Chris
>
--
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt