Home > mailing lists

log_checkpoint's "WAL file(s) added" is misleading to the point of uselessness - Mailing list pgsql-hackers

From	Andres Freund
Subject	log_checkpoint's "WAL file(s) added" is misleading to the point of uselessness
Date	July 24, 2021 22:50:36
Msg-id	20210724225036.fm4etupdtggurehf@alap3.anarazel.de Whole thread Raw
Responses	Re: log_checkpoint's "WAL file(s) added" is misleading to the point of uselessness Re: log_checkpoint's "WAL file(s) added" is misleading to the point of uselessness
List	pgsql-hackers

Tree view

Hi,

I've been repeatedly confused by the the number of WAL files supposedly
added. Even when 100s of new WAL files are created the relevant portion
of log_checkpoints will only ever list zero or one added WAL file.

The reason for that is that CheckpointStats.ckpt_segs_added is only
incremented in PreallocXlogFiles(). Which has the following comment:
 * XXX this is currently extremely conservative, since it forces only one
 * future log segment to exist, and even that only if we are 75% done with
 * the current one.  This is only appropriate for very low-WAL-volume systems.

Whereas in real workloads WAL files are almost exclusively created via
XLogWrite()->XLogFileInit().

I think we should consider just removing that field. Or, even better, show
something accurate instead.

As an example, here's the log output of a workload that has a replication slot
preventing WAL files from being recycled (and too small max_wal_size):

2021-07-24 15:47:42.524 PDT [2251649][checkpointer][:0] LOG:  checkpoint complete: wrote 55767 buffers (42.5%); 0 WAL
file(s)added, 0 removed, 0 recycled; write=3.914 s, sync=0.041 s, total=3.972 s; sync files=10, longest=0.010 s,
average=0.005s; distance=540578 kB, estimate=540905 kB
 
2021-07-24 15:47:46.721 PDT [2251649][checkpointer][:0] LOG:  checkpoint complete: wrote 55806 buffers (42.6%); 1 WAL
file(s)added, 0 removed, 0 recycled; write=3.855 s, sync=0.028 s, total=3.928 s; sync files=8, longest=0.008 s,
average=0.004s; distance=540708 kB, estimate=540886 kB
 
2021-07-24 15:47:51.004 PDT [2251649][checkpointer][:0] LOG:  checkpoint complete: wrote 55850 buffers (42.6%); 1 WAL
file(s)added, 0 removed, 0 recycled; write=3.895 s, sync=0.034 s, total=3.974 s; sync files=9, longest=0.009 s,
average=0.004s; distance=540657 kB, estimate=540863 kB
 
2021-07-24 15:47:55.231 PDT [2251649][checkpointer][:0] LOG:  checkpoint complete: wrote 55879 buffers (42.6%); 0 WAL
file(s)added, 0 removed, 0 recycled; write=3.878 s, sync=0.026 s, total=3.944 s; sync files=9, longest=0.007 s,
average=0.003s; distance=540733 kB, estimate=540850 kB
 
2021-07-24 15:47:59.462 PDT [2251649][checkpointer][:0] LOG:  checkpoint complete: wrote 55847 buffers (42.6%); 1 WAL
file(s)added, 0 removed, 0 recycled; write=3.882 s, sync=0.027 s, total=3.952 s; sync files=9, longest=0.008 s,
average=0.003s; distance=540640 kB, estimate=540829 kB
 

So it's 3 new WAL files in that timeframe, one might think?

A probe instrumenting xlog file creation shows something very different:
perf probe -x /home/andres/build/postgres/dev-assert/vpath/src/backend/postgres -a XLogFileInitInternal:39
(39 is the O_CREAT BasicOpenFile(), not the recycle path).

perf stat -a -e probe_postgres:XLogFileInitInternal_L39 -I 1000
     1.001030943                  9      probe_postgres:XLogFileInitInternal_L39
     2.002998896                  8      probe_postgres:XLogFileInitInternal_L39
     3.005098857                  8      probe_postgres:XLogFileInitInternal_L39
     4.007000426                  6      probe_postgres:XLogFileInitInternal_L39
     5.008423119                  9      probe_postgres:XLogFileInitInternal_L39
     6.013427568                  8      probe_postgres:XLogFileInitInternal_L39
     7.015087613                  8      probe_postgres:XLogFileInitInternal_L39
     8.017393339                  8      probe_postgres:XLogFileInitInternal_L39
     9.018425526                  7      probe_postgres:XLogFileInitInternal_L39
    10.020398520                  5      probe_postgres:XLogFileInitInternal_L39

Greetings,

Andres Freund

pgsql-hackers by date:

From: Bharath Rupireddy
Date: 24 July 2021, 16:26:40
Subject: Re: CREATE SEQUENCE with RESTART option

From: Andres Freund
Date: 24 July 2021, 23:02:07
Subject: Re: log_checkpoint's "WAL file(s) added" is misleading to the point of uselessness

log_checkpoint's "WAL file(s) added" is misleading to the point of uselessness - Mailing list pgsql-hackers

Previous

Next