Re: Speed up transaction completion faster after many relations are accessed in a transaction - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Speed up transaction completion faster after many relations are accessed in a transaction
Date
Msg-id 32310.1563990540@sss.pgh.pa.us
In response to Re: Speed up transaction completion faster after many relations are accessed in a transaction  (David Rowley <david.rowley@2ndquadrant.com>)
Responses Re: Speed up transaction completion faster after many relations are accessed in a transaction  (Thomas Munro <thomas.munro@gmail.com>)
Re: Speed up transaction completion faster after many relations are accessed in a transaction  (David Rowley <david.rowley@2ndquadrant.com>)
List pgsql-hackers
David Rowley <david.rowley@2ndquadrant.com> writes:
> Here's a more polished version with the debug code removed, complete
> with benchmarks.

A few gripes:

You're measuring the number of locks held at completion of the
transaction, which fails to account for locks transiently taken and
released along the way, so the actual peak usage will be higher.  I think
we mostly only take such transient locks for system catalog accesses, so
*maybe* the number of extra locks involved isn't very large, but that's a
very shaky assumption.

I don't especially like the fact that this seems to have a hard-wired
(and undocumented) assumption that buckets == entries, ie that the
fillfactor of the table is set at 1.0.  lock.c has no business knowing
that.  Perhaps instead of returning the raw bucket count, you could have
dynahash.c return bucket count times fillfactor, so that the number is in
the same units as the entry count.
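
Something along these lines inside dynahash.c, perhaps (a sketch only;
hash_get_expected_capacity is a made-up name, not a function dynahash.c
exports today):

        /*
         * Hypothetical helper: current bucket count scaled by the table's
         * fillfactor, so the result is directly comparable with
         * hash_get_num_entries().
         */
        long
        hash_get_expected_capacity(HTAB *hashp)
        {
            HASHHDR    *hctl = hashp->hctl;

            /* max_bucket is the highest bucket index in use, hence the +1 */
            return (long) ((hctl->max_bucket + 1) * hctl->ffactor);
        }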

This:
        running_avg_locks -= running_avg_locks / 10.0;
        running_avg_locks += numLocksHeld / 10.0;
seems like a weird way to do the calculation.  Personally I'd write
        running_avg_locks += (numLocksHeld - running_avg_locks) / 10.0;
which is more the way I'm used to seeing exponential moving averages
computed --- at least, it seems clearer to me why this will converge
towards the average value of numLocksHeld over time.  It also makes
it clear that it wouldn't be sane to use two different divisors,
whereas your formulation makes it look like maybe they could be
set independently.
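
To spell out why: with divisor 10 this is the standard exponential moving
average with alpha = 1/10,

        avg_new = avg_old + alpha * (x - avg_old)
                = (1 - alpha) * avg_old + alpha * x

so each transaction moves the estimate 10% of the remaining distance toward
the latest numLocksHeld, and the contribution of any single observation
decays by a factor of 0.9 per subsequent transaction.  Your two-line form
computes exactly the same thing, but only as long as both divisors stay equal.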

Your compiler might not complain that LockReleaseAll's numLocksHeld
is potentially uninitialized, but other people's compilers will.
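
For instance (assuming numLocksHeld is a plain local counter in
LockReleaseAll; the actual declaration is in your patch, not shown here):

        int         numLocksHeld = 0;   /* initialize to silence the warning */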


On the whole, I don't especially like this approach, because of the
confusion between peak lock count and end-of-xact lock count.  That
seems way too likely to cause problems.

            regards, tom lane


