Re: Smoothing the subtrans performance catastrophe - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Smoothing the subtrans performance catastrophe
Msg-id CANbhV-EZwda3YVynHuNeY8vW6JTn58cM+Z1CMW9dyS-WDx6-nw@mail.gmail.com
In response to Re: Smoothing the subtrans performance catastrophe  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Wed, 3 Aug 2022 at 20:18, Andres Freund <andres@anarazel.de> wrote:

> On 2022-08-01 17:42:49 +0100, Simon Riggs wrote:
> > The reason for the slowdown is clear: when we overflow we check every
> > xid against subtrans, producing a large stream of lookups. Some
> > previous hackers have tried to speed up subtrans - this patch takes a
> > different approach: remove as many subtrans lookups as possible. (So
> > it is not competing with those other solutions.)
> >
> > Attached patch improves on the situation, as also shown in the attached diagram.
>
> I think we should consider redesigning subtrans more substantially - even with
> the changes you propose here, there are still plenty of ways to hit really bad
> performance. And there's only so much we can do about that without more
> fundamental design changes.

I completely agree - you will be glad to hear that I've been working
on a redesign of the subtrans module.

But we should be clear that redesigning subtrans has nothing to do
with this patch; they are separate ideas and this patch relates to
XidInMVCCSnapshot(), an important caller of subtrans.
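
To make the relationship concrete, here is a minimal toy model (Python, not PostgreSQL source) of how a snapshot check ends up calling into subtrans once the snapshot has overflowed. The names ToySubtrans, ToySnapshot and xid_in_snapshot are hypothetical, xids are plain integers with no wraparound handling, and the sketch shows the existing overflow behaviour only, not what the patch changes.

```python
from dataclasses import dataclass, field


class ToySubtrans:
    """Toy stand-in for pg_subtrans: subxid -> parent xid, with a lookup counter."""

    def __init__(self):
        self.parent = {}
        self.lookups = 0

    def set_parent(self, subxid, parent_xid):
        self.parent[subxid] = parent_xid

    def topmost(self, xid):
        # Each call models one trip into subtrans, which is the cost at issue.
        self.lookups += 1
        while xid in self.parent:
            xid = self.parent[xid]
        return xid


@dataclass
class ToySnapshot:
    xmin: int                                  # xids below this have finished
    xmax: int                                  # xids at or above this are treated as running
    xip: set = field(default_factory=set)      # running top-level xids
    subxip: set = field(default_factory=set)   # running subxids, complete only if not overflowed
    suboverflowed: bool = False


def xid_in_snapshot(xid, snap, subtrans):
    """Return True if xid counts as still running for this snapshot."""
    if xid < snap.xmin:
        return False
    if xid >= snap.xmax:
        return True
    if snap.suboverflowed:
        # The subxid list is incomplete, so map the xid to its top-level parent.
        xid = subtrans.topmost(xid)
        if xid < snap.xmin:
            return False
    elif xid in snap.subxip:
        return True
    return xid in snap.xip
```

In this model every xid that falls between xmin and xmax costs one subtrans lookup when the snapshot has overflowed, and none when it has not; the lookups counter makes that difference visible.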

I will post my patch, when complete, in a different thread.

> One way to fix a lot of the issues around pg_subtrans would be to remove the
> pg_subtrans SLRU and replace it with a purely in-memory hashtable. IMO there's
> really no good reason to use an SLRU for it (anymore).
>
> In contrast to e.g. clog or multixact we don't need to access a lot of old
> entries, we don't need persistency etc. Nor is it a good use of memory and IO
> to have loads of pg_subtrans pages that don't point anywhere, because the xid
> is just a "normal" xid.
>
> While we can't put a useful hard cap on the number of potential subtrans
> entries (we can only throw subxid->parent mappings away once no existing
> snapshot might need them), saying that there can't be more subxids "considered
> running" at a time than can fit in memory doesn't seem like a particularly
> problematic restriction.

I do agree that sometimes it is easier to impose restrictions than to
try to provide unbounded resources.

Having said that, I can't see an easy way of making that work well in
practice for this case. Making write transactions suddenly stop
working at some point doesn't sound good for availability, especially
since it would happen sporadically and unpredictably, whenever
long-running transactions appear alongside users of subtransactions.
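
The trade-off can be illustrated with a small toy model (Python, not a proposed design and not the dshash API). BoundedSubxidMap, SubxidMapFull and the capacity parameter are hypothetical names for illustration. The sketch assumes a fixed-size subxid->parent map whose entries can be dropped only once no snapshot might still need them.

```python
class SubxidMapFull(Exception):
    """Raised when the fixed-size map cannot accept another subxid."""


class BoundedSubxidMap:
    """Toy fixed-capacity, in-memory subxid -> parent xid mapping."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.parent = {}

    def assign(self, subxid, parent_xid):
        # A writer creating a new subtransaction needs a free slot.
        if subxid not in self.parent and len(self.parent) >= self.capacity:
            raise SubxidMapFull(f"no room for subxid {subxid}")
        self.parent[subxid] = parent_xid

    def topmost(self, xid):
        # Follow parent links until reaching an xid with no recorded parent.
        while xid in self.parent:
            xid = self.parent[xid]
        return xid

    def truncate(self, oldest_needed_xid):
        # Entries go away only once no snapshot can still need them,
        # modelled here as "subxid is older than oldest_needed_xid".
        self.parent = {
            sub: par for sub, par in self.parent.items()
            if sub >= oldest_needed_xid
        }
```

In this model a long-running transaction keeps oldest_needed_xid from advancing, so truncate() frees nothing and assign() starts failing for writers as soon as the map is full, which is the availability concern described above.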

> So, why don't we use a dshash table with some amount of statically allocated
> memory for the mapping? In common cases that will *reduce* memory usage
> (because we don't need to reserve space for [as many] subxids in snapshots /
> procarray anymore) and IO (no mostly-zeroes pg_subtrans).

I considered this and have ruled it out, but as I said above, we can
discuss that on a different thread.

-- 
Simon Riggs                http://www.EnterpriseDB.com/
