Re: Improve WALRead() to suck data directly from WAL buffers when possible - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Improve WALRead() to suck data directly from WAL buffers when possible
Date
Msg-id CALj2ACVS4NFxcLf_dupBVtKT0Z7J4tyUVqMxbH36_P69CzUf+A@mail.gmail.com
Whole thread Raw
In response to Re: Improve WALRead() to suck data directly from WAL buffers when possible  (Nitin Jadhav <nitinjadhavpostgres@gmail.com>)
Responses Re: Improve WALRead() to suck data directly from WAL buffers when possible  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
List pgsql-hackers
On Sun, Mar 12, 2023 at 12:52 AM Nitin Jadhav
<nitinjadhavpostgres@gmail.com> wrote:
>
> I went through the v8 patch.

Thanks for looking at it. Please post the responses in-line, not above
the entire previous message for better readability.

> Following are my thoughts to improve the
> WAL buffer hit ratio.

Note that the motive of this patch is to read WAL from WAL buffers
*when possible* without affecting concurrent WAL writers.

> Currently the no-longer-needed WAL data present in WAL buffers gets
> cleared in XLogBackgroundFlush() which is called based on the
> wal_writer_delay config setting. Once the data is flushed to the disk,
> it is treated as no-longer-needed and it will be cleared as soon as
> possible based on some config settings.

Being opportunistic in pre-initializing as many possible WAL buffer
pages as is there for a purpose. There's an illuminating comment [1],
so that's done for a purpose, so removing it fully is a no-go IMO. For
instance, it'll make WAL buffer pages available for concurrent writers
so there will be less work for writers in GetXLogBuffer. I'm sure
removing the opportunistic pre-initialization of the WAL buffer pages
will hurt performance in a highly concurrent-write workload.

    /*
     * Great, done. To take some work off the critical path, try to initialize
     * as many of the no-longer-needed WAL buffers for future use as we can.
     */
    AdvanceXLInsertBuffer(InvalidXLogRecPtr, insertTLI, true);

> Second, In WALRead(), we try to read the data from disk whenever we
> don't find the data from WAL buffers. We don't store this data in the
> WAL buffer. We just read the data, use it and leave it. If we store
> this data to the WAL buffer, then we may avoid a few disk reads.

Again this is going to hurt concurrent writers. Note that wal_buffers
aren't used as full cache per-se, there'll be multiple writers to it,
*when possible* readers will try to read from it without hurting
writers.

> The patch attached takes care of this.

Please post the new proposal as a text file (not a .patch file) or as
a plain text in the email itself if the change is small or attach all
the patches if the patch is over-and-above the proposed patches.
Attaching a single over-and-above patch will make CFBot unhappy and
will force authors to repost the original patches. Typically, we
follow this. Having said, I have some review comments to fix on
v8-0001, so, I'll be sending out v9 patch-set soon.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Implement IF NOT EXISTS for CREATE PUBLICATION AND CREATE SUBSCRIPTION
Next
From: Stéphane Tachoires
Date:
Subject: Re: [Proposal] Allow pg_dump to include all child tables with the root table