Re: hyrax vs. RelationBuildPartitionDesc - Mailing list pgsql-hackers

From Robert Haas
Subject Re: hyrax vs. RelationBuildPartitionDesc
Date
Msg-id CA+TgmoYg2NY3KTUFiuaCxDkW8iXD3o3H3wdXs6m-o-dGiYHD_Q@mail.gmail.com
Whole thread Raw
In response to Re: hyrax vs. RelationBuildPartitionDesc  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: hyrax vs. RelationBuildPartitionDesc
Re: hyrax vs. RelationBuildPartitionDesc
List pgsql-hackers
On Thu, Mar 14, 2019 at 12:36 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Hmm, I wonder why not.  I suppose the answer is that
> the leak is worse in HEAD than before, but how come?

I'd like to know that, too, but right now I don't.

> I followed your reference to 898e5e329, and I've got to say that
> the hunk it adds in relcache.c looks fishy as can be.  The argument
> that the rd_pdcxt "will get cleaned up eventually" reads to me like
> "this leaks memory like a sieve", especially in the repeated-rebuild
> scenario which is exactly what CLOBBER_CACHE_ALWAYS would provoke.
> Probably the only thing that keeps it from being effectively a
> session-lifespan leak is that CCA will generally result in relcache
> entries being flushed entirely as soon as their refcount goes to 0.
> Still, because of that, I wouldn't think it'd move the needle very
> much on a CCA animal; so my guess is that there's something else.

I'm a little uncertain of that logic, too, but keep in mind that if
keep_partdesc is true, we're just going to throw away the newly-build
data and keep the old stuff.  So I think that in order for this to
actually be a problem, you would have to have other sessions that
repeatedly alter the partitiondesc via ATTACH PARTITION, and at the
same time, the current session would have to keep on reading it.  But
you also have to never add a partition while the local session is
between queries, because if you do, rebuild will be false and we'll
blow away the old entry in its entirety and any old contexts that are
hanging off of it will be destroyed as well.  So it seems a little
ticklish to me to figure out a realistic scenario in which we actually
leak enough memory to matter here, but maybe you have an idea of which
I have not thought.

We could certainly do better - just refcount each PartitionDesc.  Have
the relcache entry hold a refcount on the PartitionDesc and have a
PartitionDirectory hold a refcount on the PartitionDesc; then, any
time the refcount goes to 0, you can immediately destroy the
PartitionDesc.  Or, simpler, arrange things so that when the
refcache's refcount goes to 0, any old PartitionDescs that are still
hanging off of the latest one get destroyed then, not later.  It's
just a question of whether it's really worth the code, or whether
we're fixing imaginary bugs by adding code that might have real ones.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: why doesn't DestroyPartitionDirectory hash_destroy?
Next
From: Robert Haas
Date:
Subject: Re: why doesn't DestroyPartitionDirectory hash_destroy?