Re: advance local xmin more aggressively - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: advance local xmin more aggressively
Date
Msg-id 54888970.5020807@vmware.com
Whole thread Raw
In response to Re: advance local xmin more aggressively  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: advance local xmin more aggressively  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 12/10/2014 06:56 PM, Robert Haas wrote:
> On Wed, Dec 10, 2014 at 9:49 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> I guess this bears some further thought.  I certainly don't like the
>> fact that this makes the whole system crap out at a lower number of
>> subtransactions than presently.  The actual performance numbers don't
>> bother me very much; I'm comfortable with the possibility that closing
>> a cursor will be some modest percentage slower if you've got thousands
>> of active savepoints.
>
> Here's a new version with two changes:
>
> 1. I changed the traversal of the resource owner tree to iterate
> instead of recurse.  It now does a depth-first, pre-order traversal of
> the tree; when we reach the last child of a node, we follow its parent
> pointer to get back to where we were.  That way, we don't need to keep
> anything on the stack.  That fixed the crash at 100k cursors, but it
> was still 4x slower.

Clever. Could we use that method in ResourceOwnerReleaseInternal and 
ResourceOwnerDelete, too? Might be best to have a 
ResourceOwnerWalk(resowner, callback) function for walking all resource 
owners in a tree, instead of one for walking the snapshots in them.

> 2. Instead of traversing the tree until we find an xmin equal to the
> one we're currently advertising, the code now traverses the entire
> tree each time it runs. However, it also keeps a record of how many
> times the oldest xmin occurred in the tree, which is decremented each
> time we unregister a snapshot with that xmin; the traversal doesn't
> run again until that count reaches 0.  That fixed the performance
> regression on your test case.
>
> With a million subtransactions:
>
> master 34.464s 33.742s 34.317s
> advance-xmin 34.516s 34.069s 34.196s

Well, you can still get the slowness back by running other stuff in the 
background. I admit that it's a very obscure case, probably fine in 
practice. I would still feel better if snapmgr.c did its own 
bookkeeping, from an aesthetic point of view. In a heap, or even just a 
linked list, if the performance characteristics of that is acceptable.

It occurs to me that the pairing heap I just posted in another thread 
(http://www.postgresql.org/message-id/54886BB8.9040000@vmware.com) would 
be a good fit for this. It's extremely cheap to insert to and to find 
the minimum item (O(1)), and the delete operation is O(log N), 
amortized. I didn't implement a delete operation, for removing a 
particular node, I only did delete-min, but it's basically the same 
code. Using a pairing heap for this might be overkill, but if we have 
that implementation anyway, the code in snapmgr.c to use it would be 
very simple, so I see little reason not to. It might even be simpler 
than your patch, because you wouldn't need to have the heuristics on 
whether to attempt updating the xmin; it would be cheap enough to always 
try it.

- Heikki




pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: logical column ordering
Next
From: Robert Haas
Date:
Subject: Re: On partitioning