Re: Improving connection scalability: GetSnapshotData() - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Improving connection scalability: GetSnapshotData()
Msg-id 20200408124318.amzvnpsqhy747wqp@alap3.anarazel.de
In response to Re: Improving connection scalability: GetSnapshotData()  (Andres Freund <andres@anarazel.de>)
Responses Re: Improving connection scalability: GetSnapshotData()
List pgsql-hackers
Hi,

On 2020-04-07 05:15:03 -0700, Andres Freund wrote:
> SEE BELOW: What, and what not, to do for v13.
>
> [ description of changes ]
> 
> I think this is pretty close to being committable.
> 
> But: This patch came in very late for v13, and it took me much longer to
> polish it up than I had hoped (partially distraction due to various bugs
> I found (in particular snapshot_too_old), partially covid19, partially
> "hell if I know"). The patchset touches core parts of the system. While
> both Thomas and David have done some review, they haven't for the latest
> version (mea culpa).
> 
> In many other instances I would say that the above suggests slipping to
> v14, given the timing.
> 
> The main reason I am considering pushing is that I think this patchset
> addresses one of the most common critiques of postgres, as well as very
> common, hard to fix, real-world production issues. GetSnapshotData() has
> been a major bottleneck for about as long as I have been using postgres,
> and this addresses that to a significant degree.
> 
> A second reason I am considering it is that, in my opinion, the changes
> are not all that complicated and not even that large. At least not for a
> change to a problem that we've long tried to improve.
> 
> 
> Obviously we all have a tendency to think our own work is important, and
> that we deserve a bit more leeway than others. So take the above with a
> grain of salt.

I tried hard, but came up short. It's 5 AM, and I am still finding
comments that aren't quite right. For a while I thought I'd be pushing in
a few hours ...  And even if it were ready now: This is too large a patch
to push this tired (but damn, I'd love to).

Unfortunately addressing Robert's comments made me realize I didn't like
some of my own naming. In particular I started to dislike
InvisibleToEveryone, and some of the procarray.c variables around
"visible".  After trying about half a dozen schemes I think I found
something that makes some sense, although I am still not perfectly
happy.

I think the attached set of patches addresses most of Robert's review
comments, minus a few minor quibbles where I thought he was wrong
(fundamentally wrong of course). There are no *Copy fields in PGPROC
anymore, and there are a lot more comments above PROC_HDR (not duplicated
elsewhere). I've reduced the interspersed changes to GetSnapshotData()
so those can be done separately.

There are also somewhat meaningful commit messages now. But
    snapshot scalability: Move in-progress xids to ProcGlobal->xids[].
needs to be expanded to mention the changed locking requirements.


Realistically it still needs 2-3 hours of proof-reading.


This makes me sad :(
