Re: Speed up transaction completion faster after many relations are accessed in a transaction - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Speed up transaction completion faster after many relations are accessed in a transaction
Date
Msg-id 32310.1563990540@sss.pgh.pa.us
In response to Re: Speed up transaction completion faster after many relations are accessed in a transaction  (David Rowley <david.rowley@2ndquadrant.com>)
Responses Re: Speed up transaction completion faster after many relations are accessed in a transaction  (Thomas Munro <thomas.munro@gmail.com>)
Re: Speed up transaction completion faster after many relations are accessed in a transaction  (David Rowley <david.rowley@2ndquadrant.com>)
List pgsql-hackers
David Rowley <david.rowley@2ndquadrant.com> writes:
> Here's a more polished version with the debug code removed, complete
> with benchmarks.

A few gripes:

You're measuring the number of locks held at completion of the
transaction, which fails to account for locks transiently taken and
released along the way, so the actual peak usage will be higher.  I think
we mostly only take such transient locks for system catalog accesses, so
*maybe* the number of extra locks involved isn't very large, but that's a
very shaky assumption.

I don't especially like the fact that this seems to have a hard-wired
(and undocumented) assumption that buckets == entries, ie that the
fillfactor of the table is set at 1.0.  lock.c has no business knowing
that.  Perhaps instead of returning the raw bucket count, you could have
dynahash.c return bucket count times fillfactor, so that the number is in
the same units as the entry count.
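
Something along these lines inside dynahash.c, perhaps (a sketch only;
hash_get_expected_capacity is a made-up name, not a function dynahash.c
exports today):

        /*
         * Hypothetical helper: current bucket count scaled by the table's
         * fillfactor, so the result is directly comparable with
         * hash_get_num_entries().
         */
        long
        hash_get_expected_capacity(HTAB *hashp)
        {
            HASHHDR    *hctl = hashp->hctl;

            /* max_bucket is the highest bucket index in use, hence the +1 */
            return (long) ((hctl->max_bucket + 1) * hctl->ffactor);
        }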

This:
        running_avg_locks -= running_avg_locks / 10.0;
        running_avg_locks += numLocksHeld / 10.0;
seems like a weird way to do the calculation.  Personally I'd write
        running_avg_locks += (numLocksHeld - running_avg_locks) / 10.0;
which is more the way I'm used to seeing exponential moving averages
computed --- at least, it seems clearer to me why this will converge
towards the average value of numLocksHeld over time.  It also makes
it clear that it wouldn't be sane to use two different divisors,
whereas your formulation makes it look like maybe they could be
set independently.
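
To spell out why: with divisor 10 this is the standard exponential moving
average with alpha = 1/10,

        avg_new = avg_old + alpha * (x - avg_old)
                = (1 - alpha) * avg_old + alpha * x

so each transaction moves the estimate 10% of the remaining distance toward
the latest numLocksHeld, and the contribution of any single observation
decays by a factor of 0.9 per subsequent transaction.  Your two-line form
computes exactly the same thing, but only as long as both divisors stay equal.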

Your compiler might not complain that LockReleaseAll's numLocksHeld
is potentially uninitialized, but other people's compilers will.
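
For instance (assuming numLocksHeld is a plain local counter in
LockReleaseAll; the actual declaration is in your patch, not shown here):

        int         numLocksHeld = 0;   /* initialize to silence the warning */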


On the whole, I don't especially like this approach, because of the
confusion between peak lock count and end-of-xact lock count.  That
seems way too likely to cause problems.

            regards, tom lane


