Re: Improving connection scalability: GetSnapshotData() - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Improving connection scalability: GetSnapshotData()
Date
Msg-id 20200908041114.5woxns33iv765fgh@alap3.anarazel.de
Whole thread Raw
In response to Re: Improving connection scalability: GetSnapshotData()  (Ian Barwick <ian.barwick@2ndquadrant.com>)
Responses Re: Improving connection scalability: GetSnapshotData()  (Ian Barwick <ian.barwick@2ndquadrant.com>)
Re: Improving connection scalability: GetSnapshotData()  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
Hi,

On 2020-09-08 13:03:01 +0900, Ian Barwick wrote:
> On 2020/09/03 17:18, Michael Paquier wrote:
> > On Sun, Aug 16, 2020 at 02:26:57PM -0700, Andres Freund wrote:
> > > So we get some builfarm results while thinking about this.
> > 
> > Andres, there is an entry in the CF for this thread:
> > https://commitfest.postgresql.org/29/2500/
> > 
> > A lot of work has been committed with 623a9ba, 73487a6, 5788e25, etc.
> 
> I haven't seen it mentioned here, so apologies if I've overlooked
> something, but as of 623a9ba queries on standbys seem somewhat
> broken.
> 
> Specifically, I maintain some code which does something like this:
> 
> - connects to a standby
> - checks a particular row does not exist on the standby
> - connects to the primary
> - writes a row in the primary
> - polls the standby (using the same connection as above)
>   to verify the row arrives on the standby
> 
> As of recent HEAD it never sees the row arrive on the standby, even
> though it is verifiably there.

Ugh, that's not good.


> I've traced this back to 623a9ba, which relies on "xactCompletionCount"
> being incremented to determine whether the snapshot can be reused,
> but that never happens on a standby, meaning this test in
> GetSnapshotDataReuse():
> 
>     if (curXactCompletionCount != snapshot->snapXactCompletionCount)
>         return false;
> 
> will never return false, and the snapshot's xmin/xmax never get advanced.
> Which means the session on the standby is not able to see rows on the
> standby added after the session was started.
> 
> It's simple enough to prevent that being an issue by just never calling
> GetSnapshotDataReuse() if the snapshot was taken during recovery
> (though obviously that means any performance benefits won't be available
> on standbys).

Yea, that doesn't sound great. Nor is the additional branch welcome.


> I wonder if it's possible to increment "xactCompletionCount"
> during replay along these lines:
> 
>     *** a/src/backend/access/transam/xact.c
>     --- b/src/backend/access/transam/xact.c
>     *************** xact_redo_commit(xl_xact_parsed_commit *
>     *** 5915,5920 ****
>     --- 5915,5924 ----
>              */
>             if (XactCompletionApplyFeedback(parsed->xinfo))
>                     XLogRequestWalReceiverReply();
>     +
>     +       LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
>     +       ShmemVariableCache->xactCompletionCount++;
>     +       LWLockRelease(ProcArrayLock);
>       }
> 
> which seems to work (though quite possibly I've overlooked something I don't
> know that I don't know about and it will all break horribly somewhere,
> etc. etc.).

We'd also need the same in a few more places. Probably worth looking at
the list where we increment it on the primary (particularly we need to
also increment it for aborts, and 2pc commit/aborts).

At first I was very confused as to why none of the existing tests have
found this significant issue. But after thinking about it for a minute
that's because they all use psql, and largely separate psql invocations
for each query :(. Which means that there's no cached snapshot around...

Do you want to try to write a patch?

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Kyotaro Horiguchi
Date:
Subject: Re: Remove page-read callback from XLogReaderState.
Next
From: Ian Barwick
Date:
Subject: Re: Improving connection scalability: GetSnapshotData()