Thread: finding out what's generating WALs

finding out what's generating WALs

From
Chris Withers
Date:
Hi All,

With a 9.4 cluster, what's the best way to find out what's generating 
the most WAL?

I'm looking after a multi-tenant PG 9.4 cluster, and we've started 
getting alerts for the number of WALs on the server.
It'd be great to understand what's generating all that WAL and what's 
likely to be causing any problems.

More generally, what number of WALs is "too much"? check_postgres.pl 
when used in nagios format only appears to be able to alert on absolute 
thresholds, does this always make sense? What's a good threshold to 
alert on?

cheers,

Chris


Re: finding out what's generating WALs

From
Achilleas Mantzios
Date:
On 11/12/18 4:00 p.m., Chris Withers wrote:
> Hi All,
>
> With a 9.4 cluster, what's the best way to find out what's generating the most WAL?
>
> I'm looking after a multi-tenant PG 9.4 cluster, and we've started getting alerts for the number of WALs on the server.
> It'd be great to understand what's generating all that WAL and what's likely to be causing any problems.
>

One way is to keep snapshots of pg_stat_user_tables and then try to identify spikes based on the various _tup fields.
Another way is to take a look in your archive (where you keep your archived wals), try to identify a period where excessive wals were generated and then use https://www.postgresql.org/docs/11/pgwaldump.html to see what's in there.
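
A minimal sketch of the snapshot approach (the snapshot table name and the comparison query are just an illustration, nothing built in):

create table wal_activity_snapshot as
select now() as snapped_at, schemaname, relname,
       n_tup_ins, n_tup_upd, n_tup_del
from pg_stat_user_tables;

-- some time later, diff a fresh reading against the snapshot to find the busiest tables
select c.schemaname, c.relname,
       (c.n_tup_ins - s.n_tup_ins) + (c.n_tup_upd - s.n_tup_upd) + (c.n_tup_del - s.n_tup_del) as tuple_changes
from pg_stat_user_tables c
join wal_activity_snapshot s using (schemaname, relname)
order by tuple_changes desc
limit 10;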

> More generally, what number of WALs is "too much"? check_postgres.pl when used in nagios format only appears to be able to alert on absolute thresholds, does this always make sense? What's a good
> threshold to alert on?
>

Regarding your WALs in pg_wal, a good threshold could be anything more than e.g. a 10% increase over wal_keep_segments with a trend to go up. If this number keeps going up, chances are something bad is happening.
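
If you want a number to alert on, something along these lines gives the current segment count (a sketch assuming superuser access and 9.4, where the directory is still pg_xlog; on 10+ use pg_wal instead):

select count(*) as wal_segments
from pg_ls_dir('pg_xlog') as t(name)
where name ~ '^[0-9A-F]{24}$';  -- only count actual WAL segment files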
 

> cheers,
>
> Chris
>


-- 
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt



why would postgres be throttling a streaming replication slot's sending?

From
Chris Withers
Date:
On 11/12/2018 14:48, Achilleas Mantzios wrote:
> On 11/12/18 4:00 μ.μ., Chris Withers wrote:
>>
>> I'm looking after a multi-tenant PG 9.4 cluster, and we've started 
>> getting alerts for the number of WALs on the server.
>> It'd be great to understand what's generating all that WAL and what's 
>> likely to be causing any problems.
> 
> Regarding you wals in pg_wal,  a good threshold could be anything more 
> than a e.g. 10% increase from wal_keep_segments with a trend to go up. 
> If this number goes up chances are something bad is happening.

wal_keep_segments is 0 here, so I went hunting in pg_replication_slots 
and found that it's the barman slot that's behind on the WAL.
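
For reference, a query along these lines (a sketch assuming 9.4, where the functions still carry the xlog names; the 16MB-per-segment division is only approximate) shows how far each slot is behind:

select slot_name, active,
       pg_xlog_location_diff(pg_current_xlog_location(), restart_lsn) as behind_bytes,
       ceil(pg_xlog_location_diff(pg_current_xlog_location(), restart_lsn) / (16 * 1024 * 1024.0)) as approx_segments
from pg_replication_slots;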

strace'ing on the barman receiving side shows it waiting on the socket, 
so no problem on that side.

on the sending side, it's a little confusing, since the server is 
basically idle, no cpu or disk activity, and yet strace'ing the sending 
process attached to the other end of the socket shows time being spent 
waiting on a poll which, while it includes the socket being sent to, 
also includes a bunch of pipes. I've attached a chunk of the output 
below in the hope that someone on this list could offer an explanation 
as to what might cause the WAL to be trickling over this port rather 
than catching up as fast as it can?

cheers,

Chris

strace output for the streaming_barman slot servicing the barman 
replication slot:

--- SIGUSR1 {si_signo=SIGUSR1, si_code=SI_USER, si_pid=3577, si_uid=26} ---
write(4, "\0", 1)                       = 1
rt_sigreturn({mask=[]})                 = -1 EINTR (Interrupted system call)
read(3, "\0", 16)                       = 1
read(6, 0x7fffffffdcd7, 1)              = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(10, 0x7fffffffdcaf, 1, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
read(5, "\3\1\0\0\7\326\355\2\343\0\0\0@\n\0\0\0\17\1\230\313\4\0\0)\347\372l\0\0\0\0"..., 4272) = 4272
sendto(10, "d\0\0\20\315w\0\0\4\313\230\1\17P\0\0\4\313\230\1 \0\0\2\37\321\343\345\306\20\3\1"..., 4302, 0, NULL, 0) = 4302
read(3, 0x7fffffffdc90, 16)             = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=10, events=POLLIN}, {fd=3, events=POLLIN}, {fd=6, events=POLLIN}], 3, 26076) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
--- SIGUSR1 {si_signo=SIGUSR1, si_code=SI_USER, si_pid=3577, si_uid=26} ---
write(4, "\0", 1)                       = 1
rt_sigreturn({mask=[]})                 = -1 EINTR (Interrupted system call)
read(3, "\0", 16)                       = 1
read(6, 0x7fffffffdcd7, 1)              = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(10, 0x7fffffffdcaf, 1, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
read(5, "~\320\5\0\3\0\0\0\0 \1\230\313\4\0\0\3\0\0\0\0\0\0\0\323\316\314\0\0\0\0\0"..., 8192) = 8192
sendto(10, "d\0\0 \35w\0\0\4\313\230\1 \0\0\0\4\313\230\1@\0\0\2\37\321\343\361\376\226~\320"..., 8222, 0, NULL, 0) = 8222
read(3, 0x7fffffffdc90, 16)             = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=10, events=POLLIN}, {fd=3, events=POLLIN}, {fd=6, events=POLLIN}], 3, 25275) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
--- SIGUSR1 {si_signo=SIGUSR1, si_code=SI_USER, si_pid=3577, si_uid=26} ---

Interesting numbers:
- process 3577 is the wal writer process
- fd 10 is the socket, 3 and 6 are pipes
- time is visibly spent sitting on that poll
- the network is only seeing 2Mbit/sec, which is nothing

cheers,

Chris


Re: why would postgres be throttling a streaming replication slot's sending?

From
Achilleas Mantzios
Date:
On 12/12/18 3:45 p.m., Chris Withers wrote:
> On 11/12/2018 14:48, Achilleas Mantzios wrote:
> On 11/12/18 4:00 p.m., Chris Withers wrote:
>>>
>>> I'm looking after a multi-tenant PG 9.4 cluster, and we've started getting alerts for the number of WALs on the server.
>>> It'd be great to understand what's generating all that WAL and what's likely to be causing any problems.
>>
>> Regarding your WALs in pg_wal, a good threshold could be anything more than e.g. a 10% increase over wal_keep_segments with a trend to go up. If this number keeps going up, chances are something bad is
>> happening.
>
> wal_keep_segments is 0 here, so I went hunting in pg_replication_slots and found that it's the barman slot that's behind on the WAL.
>

Then your number of WALs depends on the traffic generated since the previous checkpoint, the status of archiving and, IMO, the status of replication slots. The doc https://www.postgresql.org/docs/11/runtime-config-replication.html doesn't specifically imply this last one, but it is documented here: https://www.postgresql.org/docs/10/wal-configuration.html , and it has happened to me. If a replication client with a replication_slot dies, WALs accumulate till it wakes up again. (case in pgsql 10)
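
To rule out archiving as the cause, pg_stat_archiver (available since 9.4) is a quick check; a growing failed_count usually means WALs are piling up because archive_command keeps failing (just a suggested query, adapt as needed):

select archived_count, last_archived_wal, last_archived_time,
       failed_count, last_failed_wal, last_failed_time
from pg_stat_archiver;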
 

> strace'ing on the barman receiving side shows it waiting on the socket, so no problem on that side.
>
> on the sending side, it's a little confusing, since the server is basically idle, no cpu or disk activity, and yet strace'ing the sending process attached to the other end of the socket shows time
> being spent waiting on a poll which, while it includes the socket being sent to, also includes a bunch of pipes. I've attached a chunk of the output below in the hope that someone on this list could
> offer an explanation as to what might cause the WAL to be trickling over this port rather than catching up as fast as it can?

pg_stat_replication is what you should use for monitoring. IIRC you won't see some columns in the barman's row.
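
For example, something like this (a sketch; on 9.4 the columns are still the *_location variants, renamed to *_lsn in 10):

select pid, application_name, state, sync_state,
       sent_location, write_location, flush_location, replay_location
from pg_stat_replication;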
Also you could query pg_replication_slots as :

select pg_wal_lsn_diff(pg_current_wal_lsn(),confirmed_flush_lsn),pg_wal_lsn_diff(pg_current_wal_lsn(),restart_lsn) from pg_replication_slots where active_pid=<the pid reported above by pg_stat_replication>;
  pg_wal_lsn_diff | pg_wal_lsn_diff
-----------------+-----------------
              312 |         1917736
(1 row)

In the first column you should see diff values (in bytes) close to zero. In the second you should see the length (window) of your replication slot in bytes; converted to 16MB segments, this should indicate the excess number of wals retained because of the replication slot, in addition to the other factors (archiving, checkpoints). In my system it never reaches the size of a single wal.

>
> cheers,
>
> Chris
>
> strace output for the streaming_barman slot servicing the barman replication slot:

IMO no need to go deep into gdb and stack traces unless you are sure there is a problem further down.

>
> --- SIGUSR1 {si_signo=SIGUSR1, si_code=SI_USER, si_pid=3577, si_uid=26} ---
> write(4, "\0", 1)                       = 1
> rt_sigreturn({mask=[]})                 = -1 EINTR (Interrupted system call)
> read(3, "\0", 16)                       = 1
> read(6, 0x7fffffffdcd7, 1)              = -1 EAGAIN (Resource temporarily unavailable)
> recvfrom(10, 0x7fffffffdcaf, 1, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
> read(5, "\3\1\0\0\7\326\355\2\343\0\0\0@\n\0\0\0\17\1\230\313\4\0\0)\347\372l\0\0\0\0"..., 4272) = 4272
> sendto(10, "d\0\0\20\315w\0\0\4\313\230\1\17P\0\0\4\313\230\1 \0\0\2\37\321\343\345\306\20\3\1"..., 4302, 0, NULL, 0) = 4302
> read(3, 0x7fffffffdc90, 16)             = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=10, events=POLLIN}, {fd=3, events=POLLIN}, {fd=6, events=POLLIN}], 3, 26076) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
> --- SIGUSR1 {si_signo=SIGUSR1, si_code=SI_USER, si_pid=3577, si_uid=26} ---
> write(4, "\0", 1)                       = 1
> rt_sigreturn({mask=[]})                 = -1 EINTR (Interrupted system call)
> read(3, "\0", 16)                       = 1
> read(6, 0x7fffffffdcd7, 1)              = -1 EAGAIN (Resource temporarily unavailable)
> recvfrom(10, 0x7fffffffdcaf, 1, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
> read(5, "~\320\5\0\3\0\0\0\0 \1\230\313\4\0\0\3\0\0\0\0\0\0\0\323\316\314\0\0\0\0\0"..., 8192) = 8192
> sendto(10, "d\0\0 \35w\0\0\4\313\230\1 \0\0\0\4\313\230\1@\0\0\2\37\321\343\361\376\226~\320"..., 8222, 0, NULL, 0) = 8222
> read(3, 0x7fffffffdc90, 16)             = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=10, events=POLLIN}, {fd=3, events=POLLIN}, {fd=6, events=POLLIN}], 3, 25275) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
> --- SIGUSR1 {si_signo=SIGUSR1, si_code=SI_USER, si_pid=3577, si_uid=26} ---
>
> Interesting numbers:
> - process 3577 is the wal writer process
> - fd 10 is the socket, 3 and 6 are pipes
> - time is visibly spent sitting on that poll
> - the network is only seeing 2Mbit/sec, which is nothing
>
> cheers,
>
> Chris
>


-- 
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt