Re: Improve WALRead() to suck data directly from WAL buffers when possible - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Improve WALRead() to suck data directly from WAL buffers when possible
Date
Msg-id CALj2ACU3ZYzjOv4vZTR+LFk5PL4ndUnbLS6E1vG2dhDBjQGy2A@mail.gmail.com
Whole thread Raw
In response to Re: Improve WALRead() to suck data directly from WAL buffers when possible  (Nathan Bossart <nathandbossart@gmail.com>)
List pgsql-hackers
On Tue, Mar 7, 2023 at 11:14 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Tue, Mar 07, 2023 at 12:39:13PM +0530, Bharath Rupireddy wrote:
> > On Tue, Mar 7, 2023 at 3:30 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
> >> Is it possible to memcpy more than a page at a time?
> >
> > It would complicate things a lot there; the logic to figure out the
> > last page bytes that may or may not fit in the whole page gets
> > complicated. Also, the logic to verify each page's header gets
> > complicated. We might lose out if we memcpy all the pages at once and
> > start verifying each page's header in another loop.
>
> Doesn't the complicated logic you describe already exist to some extent in
> the patch?  You are copying a page at a time, which involves calculating
> various addresses and byte counts.

Okay here I am with the v10 patch set attached that avoids multiple
memcpy calls which must benefit the callers who want to read more than
1 WAL buffer page (streaming replication WAL sender for instance).

> >> +    elog(DEBUG1, "read %zu bytes out of %zu bytes from WAL buffers for given LSN %X/%X, Timeline ID %u",
> >> +         *read_bytes, count, LSN_FORMAT_ARGS(startptr), tli);
> >>
> >> I definitely don't think we should put an elog() in this code path.
> >> Perhaps this should be guarded behind WAL_DEBUG.
> >
> > Placing it behind WAL_DEBUG doesn't help users/developers. My
> > intention was to let users know that the WAL read hit the buffers,
> > it'll help them report if any issue occurs and also help developers to
> > debug that issue.
>
> I still think an elog() is mighty expensive for this code path, even when
> it doesn't actually produce any messages.  And when it does, I think it has
> the potential to be incredibly noisy.

Well, my motive was to have a way for the user to know WAL buffer hits
and misses to report any found issues. However, I have a plan later to
add WAL buffer stats (hits/misses). I understand that even if someone
enables DEBUG1, this message can bloat server log files and make
recovery slower, especially on a standby. Hence, I agree to keep these
logs behind the WAL_DEBUG macro like others and did so in the attached
v10 patch set.

Please review the attached v10 patch set further.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

pgsql-hackers by date:

Previous
From: Jim Jones
Date:
Subject: Re: [PATCH] Add CANONICAL option to xmlserialize
Next
From: Amit Kapila
Date:
Subject: Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher