Re: Attempt to consolidate reading of XLOG page - Mailing list pgsql-hackers

From Antonin Houska
Subject Re: Attempt to consolidate reading of XLOG page
Date
Msg-id 88183.1574261429@antos
Whole thread Raw
In response to Re: Attempt to consolidate reading of XLOG page  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: Attempt to consolidate reading of XLOG page  (Michael Paquier <michael@paquier.xyz>)
Re: Attempt to consolidate reading of XLOG page  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers
Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

> On 2019-Nov-12, Antonin Houska wrote:
> 
> > ok, the next version uses explicit lseek(). Maybe the fact that XLOG is mostly
> > read sequentially (i.e. without frequent seeks) is the reason pread() has't
> > been adopted so far.
> 
> I don't quite understand why you backed off from switching to pread.  It
> seemed a good change to me.

I agreed with Michael that it makes comparison of the old and new code more
difficult, and I also thought that his arguments about performance might be
worthwhile because WAL reading is mostly sequential and does not require many
seeks. However things appear to be more complex, see below.

> Here's a few edits on top of your latest.

> ...

I agree with your renamings.

> Change xlr_seg to be a struct rather than pointer to struct.  It seems a
> bit dangerous to me to return a pointer that we don't know is going to
> be valid at raise-error time.  Struct assignment works fine for the
> purpose.

ok

> I would only like to switch this back to pg_pread() (from seek/read) and
> I'd be happy to commit this.

I realized that, starting from commit 709d003fbd98b975a4fbcb4c5750fa6efaf9ad87
we use the WALOpenSegment.ws_off field incorrectly in
walsender.c:XLogRead(). In that commit we used this field to replace
XLogReaderState.readOff:

@@ -156,10 +165,9 @@ struct XLogReaderState
        char       *readBuf;
        uint32          readLen;
 
-       /* last read segment, segment offset, TLI for data currently in readBuf */
-       XLogSegNo       readSegNo;
-       uint32          readOff;
-       TimeLineID      readPageTLI;
+       /* last read XLOG position for data currently in readBuf */
+       WALSegmentContext segcxt;
+       WALOpenSegment seg;
 
        /*
         * beginning of prior page read, and its TLI.  Doesn't necessarily

Thus we cannot use it in XLogRead() to track the current position in the
segment file. Although walsender.c:XLogRead() misses this point, it's not
broken because walsender.c does not use XLogReaderState at all.

So if explicit lseek() should be used, another field should be added to
WALOpenSegment. I failed to do so when removing the pg_pread() call from the
patch, and that was the reason for the problem reported here:

https://www.postgresql.org/message-id/20191117042221.GA16537%40alvherre.pgsql
https://www.postgresql.org/message-id/20191120083802.GB47145@paquier.xyz

Thus the use of pg_pread() makes the code quite a bit simpler, so I
re-introduced it. If you decide that an explicit lseek() should be used yet,
just let me know.

> What is logical_read_local_xlog_page all about?  Seems useless.  Let's
> get rid of it.

It seems so. Should I post a patch for that?

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com


Attachment

pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: logical decoding : exceeded maxAllocatedDescs for .spill files
Next
From: Dilip Kumar
Date:
Subject: Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions