Re: Improving connection scalability: GetSnapshotData() - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Improving connection scalability: GetSnapshotData()
Date
Msg-id 20200407194327.6u67yoqe3n4hdwmu@alap3.anarazel.de
Whole thread Raw
In response to Re: Improving connection scalability: GetSnapshotData()  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Hi,

On 2020-04-07 15:03:46 -0400, Robert Haas wrote:
> On Tue, Apr 7, 2020 at 1:51 PM Andres Freund <andres@anarazel.de> wrote:
> > > ComputedHorizons seems like a fairly generic name. I think there's
> > > some relationship between InvisibleToEveryoneState and
> > > ComputedHorizons that should be brought out more clearly by the naming
> > > and the comments.
> >
> > I don't like the naming of ComputedHorizons, ComputeTransactionHorizons
> > much... But I find it hard to come up with something that's meaningfully
> > better.
> 
> It would help to stick XID in there, like ComputedXIDHorizons. What I
> find really baffling is that you seem to have two structures in the
> same file that have essentially the same purpose, but the second one
> (ComputedHorizons) has a lot more stuff in it. I can't understand why.

ComputedHorizons are the various "accurate" horizons computed by
ComputeTransactionHorizons(). That's used to determine a horizon for
vacuuming (via GetOldestVisibleTransactionId()) and other similar use
cases.

The various InvisibleToEveryoneState variables contain the boundary
based horizons, and are updated / initially filled by
GetSnapshotData(). When the a tested value falls between the boundaries,
we update the approximate boundaries using
ComputeTransactionHorizons(). That briefly makes the boundaries in
the InvisibleToEveryoneState accurate - but future GetSnapshotData()
calls will increase the definitely_needed_bound (if transactions
committed since).

The ComputedHorizons fields could instead just be pointer based
arguments to ComputeTransactionHorizons(), but that seems clearly
worse.

I'll change ComputedHorizons's comment to say that it's the result of
ComputeTransactionHorizons(), not the "state".


> > What's the inconsistency? The dropped replication_ vs dropped FullXid
> > postfix?
> 
> Yeah, just having the member names be randomly different between the
> structs. Really harms greppability.

The long names make it hard to keep line lengths in control, in
particular when also involving the long macro names for TransactionId /
FullTransactionId comparators...


> > > Generally, heap_prune_satisfies_vacuum looks pretty good. The
> > > limited_oldest_committed naming is confusing, but the comments make it
> > > a lot clearer.
> >
> > I didn't like _committed much either. But couldn't come up with
> > something short. _relied_upon?
> 
> oldSnapshotLimitUsed or old_snapshot_limit_used, like currentCommandIdUsed?

Will go for old_snapshot_limit_used, and rename the other variables to
old_snapshot_limit_ts, old_snapshot_limit_xmin.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Dmitry Dolgov
Date:
Subject: Re: Index Skip Scan
Next
From: Dave Cramer
Date:
Subject: Re: Binary support for pgoutput plugin