Re: advance local xmin more aggressively - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: advance local xmin more aggressively
Date
Msg-id 54883DC8.7070508@vmware.com
Whole thread Raw
In response to Re: advance local xmin more aggressively  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: advance local xmin more aggressively  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 12/09/2014 10:35 PM, Robert Haas wrote:
> On Mon, Dec 8, 2014 at 9:31 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Mon, Dec 8, 2014 at 4:56 AM, Heikki Linnakangas
>> <hlinnakangas@vmware.com> wrote:
>>> I don't immediately see the problem either, but I have to say that
>>> grovelling through all the resource owners seems ugly anyway. Resource
>>> owners are not meant to be traversed like that. And there could be a lot of
>>> them, and you have to visit every one of them. That could get expensive if
>>> there are a lot of resource owners.
>>
>> 1. I don't really see why resource owners shouldn't be traversed like
>> that.  They are clearly intended to form a hierarchy, and there's
>> existing code that recurses through the hierarchy from a given level
>> downward.  What's ugly about that?

I can't exactly point my finger on it, but it just feels wrong from a
modularity point of view. Their purpose is to make sure that we don't
leak resources on abort, by allowing easy an "release everything"
operation. It's not designed for finding objects based on some other
criteria.

There is similar double bookkeeping of many other things that are
tracked by resource owners. Heavy-weight locks are tracked by LOCALLOCK
structs, buffer pins in PrivateRefCount array etc. Those things existed
before resource owners were invented, but if we were starting from
scratch, that design would still make sense, as different objects have
different access criteria. fd.c needs to be able to find the
least-recently-used open file, for example, and you need to find the
snapshot with the lowest xmin.

> Upthread, I suggested keeping a tally of the number of snapshots with
> the advertised xmin and recomputing the xmin to advertise only when it
> reaches 0.  This patch doesn't implementation that optimization, but
> it does have code that aborts the traversal of the resource owner
> hierarchy as soon as we see an xmin that will preclude advancing our
> advertised xmin.  Releasing N resource owners could therefore cost
> O(N^2) in the worst case, but note that releasing N resource owners is
> *already* an O(N^2) operation in the worst case, because the list of
> children of a particular parent resource owner is singly linked, and
> thus deleting a resource owner is O(N).  It's been that way for an
> awfully long time without anyone complaining, probably because (a)
> it's not particularly common to have large numbers of cursors open
> simultaneously and (b) even if you do have that, the constant factor
> is pretty low.

I think you're confusing the N and the N above. It's true that deleting
a resource owner is O(M), where the M is the number of children of that
resource owner. It does not follow that releasing N resource owners is
O(N^2), where N is the number of resource owners released. Calling
ResourceOwnerDelete(x) will only visit each resource owner in that tree
once, so it's O(N), where N is the total number of resource owners in
the tree.

I did some testing of the worst case scenario. The attached script first
creates a lot of cursors, then a lot of savepoints, and finally closes
the cursors in FIFO order. When the number of savepoints is high enough,
this actually segfaults with your patch, because you run out of stack
space when recursing the subxact resource owners. That's hardly this
patch's fault, I'm actually surprised it doesn't crash without it,
because we recurse into all resource owners in ResourceOwnerRelease too.
Apparently the subxacts are closed in LIFO order at commit, but there
might be are other cases where you could trigger that. In any case, a
stack-depth check would be nice.

- Heikki


Attachment

pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: GSSAPI, SSPI - include_realm default
Next
From: Robert Haas
Date:
Subject: Re: logical column ordering