Re: [HACKERS] logical replication busy-waiting on a lock - Mailing list pgsql-hackers

From Andres Freund
Subject Re: [HACKERS] logical replication busy-waiting on a lock
Date
Msg-id 20170613073240.zhoglwbieffehxie@alap3.anarazel.de
Whole thread Raw
In response to Re: [HACKERS] logical replication busy-waiting on a lock  (Petr Jelinek <petr.jelinek@2ndquadrant.com>)
Responses Re: [HACKERS] logical replication busy-waiting on a lock  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On 2017-06-09 22:28:00 +0200, Petr Jelinek wrote:
> On 09/06/17 03:20, Petr Jelinek wrote:
> > On 09/06/17 01:06, Andres Freund wrote:
> >> Hi,
> >>
> >> On 2017-06-03 04:50:00 +0200, Petr Jelinek wrote:
> >>> One thing I don't like is the GetLastLocalTransactionId() that I had to
> >>> add because we clear the proc->lxid before we get to AtEOXact_Snapshot()
> >>> but I don't have better solution there.
> >>
> >> I dislike that quite a bit - how about we instead just change the
> >> information we keep in exportedSnapshots?  I.e. store a pointer to an
> >> struct ExportedSnapshot
> >> {
> >>     char *exportedAs;
> >>     Snapshot snapshot;
> >> }
> >>
> >> Then we can get rid of that relatively ugly list_length() business too.
> >>
> > 
> > That sounds reasonable, I'll do that.
> > 
> 
> And here it is, seems better (the 0002 is same as before).

I spent a chunk of time with this.  One surprising, but easy to fix issue was
that this patch had the exported file name as:

>      /*
> +     * Generate file path for the snapshot.  We start numbering of snapshots
> +     * inside the transaction from 1.
> +     */
> +    snprintf(path, sizeof(path), SNAPSHOT_EXPORT_DIR "/%08X-%d",
> +             topXid, list_length(exportedSnapshots) + 1);
> +

I.e. just as before, unlike the previous version of the patch which used
virtualxids.  Given that we don't require topXid to be set, this seems
to largely break the patch.

Secondly, because of

>      /*
> -     * This will assign a transaction ID if we do not yet have one.
> +     * Get our transaction ID if there is one, to include in the snapshot.
>       */
> -    topXid = GetTopTransactionId();
> +    topXid = GetTopTransactionIdIfAny();
>  

which is perfectly correct on its face, we actually added a substantial
feature: Previously exporting snapshots was only possible on primaries,
now it's also possible on standbys.  The only thing that made that error
out before was the check during xid assignment.  Because snapshots work
somewhat differntly on standbys, I spent a good chunk of time staring at
code trying to see whether it'd break anything.  Neither code-reading
nor testing seems to suggest any problems.   Were we earlier in the
release cycle I'd be perfectly fine with an accidental feature, but I'm
wondering a bit whether we should just make it error out at this point,
because it'd be a new feature.  I'm -0.5 on that, but I think it should
be raised.

- Andres



pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: [HACKERS] Refreshing subscription relation state inside a transaction block
Next
From: Michael Paquier
Date:
Subject: [HACKERS] Detection of IPC::Run presence in SSL TAP tests