Re: Possible bug in logical replication. - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: Possible bug in logical replication.
Date
Msg-id 20180719024527.spsjqg3ttudpfv6s@alvherre.pgsql
Whole thread Raw
In response to Re: Possible bug in logical replication.  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Possible bug in logical replication.  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
On 2018-Jul-19, Michael Paquier wrote:

> On Wed, Jul 18, 2018 at 02:30:53PM -0400, Alvaro Herrera wrote:
> > In the immortal words of Julian Bream: "yeah, I didn't like any of
> > that".
> 
> One wikipedia lookup later, I still don't know where this quote comes
> from, but at least I understand who the man is.

https://twitter.com/alvherre/status/1019652397306703873

OK, maybe not that "immortal" after all.

> I may be missing something, but I cannot apply your patch on HEAD so I
> have not tested it.  Anyway, I read through it and the thing does not
> look logically wrong.

Sorry, I forgot to mention this -- it applies to 11.

> > I also moved some assignments from the declaration section to the code
> > section, so that I could attach proper comments to each, to improve
> > clarity of *why* we do those things.
> 
> To be pedantic here, you could move the declarations of startlsn,
> old_resowner and ctx directly inside the PG_TRY block.

Good idea, thanks.

> > I then noticed that we get a XLogRecord from XLogReadRecord, but then
> > fail to do anything with it, so I changed the code to use a bool
> > instead, which I think is clearer.
> 
> Matter of taste perhaps, I was fine with just manipulating the record
> pointer.

Yeah, it works out to the same thing really.  Maybe it's just me being
pedantic and annoyed when I realized that logical decoding does not
operate on the record itself but on the xlogreader struct instead.

TBH what I was actually doing at first was just casting the result of
XLogReadRecord to void and not doing anything with it, until I realized
that the logical decoding call below was important because of
side-effects.  I thought since fast_forward mode was used,
XLogReadRecord would not return anything anyway -- but I added an assert
and realized that was not so.

> > I think the proposed comment before the LogicalDecodingProcessRecord
> > call failed to convey the important ideas, so I rewrote that one also.
> > 
> > There is no struct member called confirmed_flush_lsn anywhere.
> 
> This is referring to the system catalog field in pg_replication_slots.

Yeah, I think that's a bit misleading.  (I very frequently do a tag-jump
on identifiers in comments, and it's uncomfortable that in this case it
jumps to the Docbook source rather than to the struct declaration.)

> >      PG_TRY();
> >      {
> > -        /* restart at slot's confirmed_flush */
> > +        /*
> > +         * Create our decoding context in fast_forward mode, passing start_lsn
> > +         * as Invalid, so that we start processing from confirmed_flush.
> > +         */
> 
> I'd rather mention InvalidXLogRecPtr directly here.  Invalid alone makes
> no real sense.

OK.  I was of two minds about that.

> > +      gotrecord = XLogReadRecord(ctx->reader, startlsn, &errm) != NULL;
> 
> I would put parenthesis for clarity.

Or just put it back as a record pointer.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Add SKIP LOCKED to VACUUM and ANALYZE
Next
From: Michael Paquier
Date:
Subject: Re: More consistency for some file-related error message