Pre-allocating WAL files - Mailing list pgsql-hackers

From: Andres Freund
Subject: Pre-allocating WAL files
Msg-id: 20201225200953.jjkrytlrzojbndh5@alap3.anarazel.de
List: pgsql-hackers

Hi,

When running write-heavy transactional workloads I've repeatedly
observed that benchmarks need to run for quite a while before they
reach steady-state performance. The most significant reason is that
initially WAL files do not get recycled, but have to be freshly
initialized. That's 16MB of writes that must finish synchronously
before a small write transaction can even start to be written out...
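
To make that cost concrete, here is a minimal sketch of what "freshly
initializing" a segment amounts to: create the file, fill it with zeros,
and fsync it before anyone can use it. This is an illustration only, not
PostgreSQL's actual code; init_segment, the 8kB write size, and the
16MB default are assumptions chosen for the example.

```python
import os
import tempfile

WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # 16MB, as in the text above
BLOCK_SIZE = 8192                    # assumed write granularity


def init_segment(path, size=WAL_SEGMENT_SIZE):
    """Create a zero-filled file of `size` bytes and fsync it.

    Hypothetical helper: the caller is blocked until every byte has
    been written and flushed, which is the stall described above.
    """
    zeros = bytes(BLOCK_SIZE)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        remaining = size
        while remaining > 0:
            chunk = min(remaining, BLOCK_SIZE)
            # os.write may write fewer bytes than requested
            remaining -= os.write(fd, zeros[:chunk])
        os.fsync(fd)  # the synchronous part
    finally:
        os.close(fd)
    return os.path.getsize(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(init_segment(os.path.join(d, "segment")))
```

Pre-allocation would move this work out of the path of the transaction
that happens to cross into a segment that does not exist yet.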

I think there are two useful things we could do:

1) Add pg_wal_preallocate(uint64 bytes) that ensures (bytes +
   segment_size - 1) / segment_size WAL segments exist from the current
   point in the WAL. Perhaps with the number of bytes defaulting to
   min_wal_size if not explicitly specified?

2) Have checkpointer occasionally check if WAL files need to be
   pre-allocated (checkpointer rather than walwriter, because we want
   walwriter to run with low latency to flush out async commits etc).

   Checkpointer already tracks the amount of WAL that's expected to be
   generated until the end of the checkpoint, so it seems like a pretty
   good candidate for this.

   To keep checkpointer pre-allocating when idle we could signal it
   whenever a record has crossed a segment boundary.
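
The segment arithmetic in 1) can be sketched as follows. This is only an
illustration of the rounding-up division, not a proposed implementation:
segments_needed and segments_to_create are hypothetical names, and
counting the current segment as the first of the N segments is an
assumption, since the proposal leaves that open.

```python
SEGMENT_SIZE = 16 * 1024 * 1024  # 16MB segments, as in the text above


def segments_needed(nbytes, segment_size=SEGMENT_SIZE):
    """Round nbytes up to whole segments:
    (bytes + segment_size - 1) / segment_size."""
    if nbytes < 0 or segment_size <= 0:
        raise ValueError("nbytes must be >= 0 and segment_size > 0")
    return (nbytes + segment_size - 1) // segment_size


def segments_to_create(current_segno, nbytes, existing,
                       segment_size=SEGMENT_SIZE):
    """Return the segment numbers that still have to be created.

    Assumption: the range starts at the current segment. `existing` is
    the set of segment numbers that are already present on disk.
    """
    count = segments_needed(nbytes, segment_size)
    wanted = range(current_segno, current_segno + count)
    return [segno for segno in wanted if segno not in existing]


if __name__ == "__main__":
    # 80MB (the default min_wal_size) rounds to 5 segments of 16MB
    print(segments_needed(80 * 1024 * 1024))
    # segments 10 and 12 already exist, so only 11 is missing
    print(segments_to_create(10, 48 * 1024 * 1024, {10, 12}))
```

A periodic check as in 2) could run the same computation with the
amount of WAL expected until the end of the checkpoint as nbytes.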


With a plain pgbench run I see a 2.5x reduction in throughput in the
periods where we initialize WAL files.

Greetings,

Andres Freund


