Re: [Patch] Temporary tables that do not bloat pg_catalog (a.k.a fast temp tables) - Mailing list pgsql-hackers

From Claudio Freire
Subject Re: [Patch] Temporary tables that do not bloat pg_catalog (a.k.a fast temp tables)
Date
Msg-id CAGTBQpY+A=DeYHRh6ZfekdNKX=KZsMjQjStLfqEyZLNEL622=w@mail.gmail.com
Whole thread Raw
In response to Re: [Patch] Temporary tables that do not bloat pg_catalog (a.k.a fast temp tables)  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: [Patch] Temporary tables that do not bloat pg_catalog (a.k.a fast temp tables)  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers
On Tue, Aug 23, 2016 at 9:12 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> On 08/24/2016 12:38 AM, Claudio Freire wrote:
>>
>> On Tue, Aug 23, 2016 at 7:25 PM, Tomas Vondra
>> <tomas.vondra@2ndquadrant.com> wrote:
>>>>>
>>>>> Could someone please explain how the unlogged tables are supposed to
>>>>> fix
>>>>> the
>>>>> catalog bloat problem, as stated in the initial patch submission? We'd
>>>>> still
>>>>> need to insert/delete the catalog rows when creating/dropping the
>>>>> temporary
>>>>> tables, causing the bloat. Or is there something I'm missing?
>>>>
>>>>
>>>>
>>>> Wouldn't more aggressive vacuuming of catalog tables fix the bloat?
>>>>
>>>> Perhaps reserving a worker or N to run only on catalog schemas?
>>>>
>>>> That'd be far simpler.
>>>
>>>
>>>
>>> Maybe, although IIRC the issues with catalog bloat were due to a
>>> combination
>>> of long queries and many temporary tables being created/dropped. In that
>>> case simply ramping up autovacuum (or even having a dedicated workers for
>>> catalogs) would not realy help due to the xmin horizon being blocked by
>>> the
>>> long-running queries.
>>>
>>> Maybe it's entirely crazy idea due to the wine I drank at the dinner, but
>>> couldn't we vacuum the temporary table records differently? For example,
>>> couldn't we just consider them removable as soon as the backend that owns
>>> them disappears?
>>
>>
>> Or perhaps go all the way and generalize that to rows that never
>> become visible outside their parent transaction.
>>
>> As in, delete of rows created by the deleting transaction could clean
>> up, carefully to avoid voiding indexes and all that, but more
>> aggressively than regular deletes.
>>
>
> Maybe, but I wouldn't be surprised if such generalization would be an order
> of magnitude more complicated - and even the vacuuming changes I mentioned
> are undoubtedly a fair amount of work.

After looking at it from a birdseye view, I agree it's conceptually
complex (reading HeapTupleSatisfiesSelf already makes one dizzy).

But other than that, the implementation seems rather simple. It seems
to me, if one figures out that it is safe to do so (a-priori, xmin not
committed, xmax is current transaction), it would simply be a matter
of chasing the HOT chain root, setting all LP except the first to
LP_UNUSED and the first one to LP_DEAD.

Of course I may be missing a ton of stuff.

> Sadly, I don't see how this might fix the other issues mentioned in this
> thread (e.g. impossibility to create temp tables on standbys),

No it doesn't :(



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Patch: initdb: "'" for QUOTE_PATH (non-windows)
Next
From: neha khatri
Date:
Subject: Re: New SQL counter statistics view (pg_stat_sql)