Re: why doesn't DestroyPartitionDirectory hash_destroy? - Mailing list pgsql-hackers

From Robert Haas
Subject Re: why doesn't DestroyPartitionDirectory hash_destroy?
Date
Msg-id CA+TgmoYAaqrR7z6A2rrS847MvDqCgRPSRs9N2E1uvFagpncUHw@mail.gmail.com
Whole thread Raw
In response to Re: why doesn't DestroyPartitionDirectory hash_destroy?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: why doesn't DestroyPartitionDirectory hash_destroy?
List pgsql-hackers
On Thu, Mar 14, 2019 at 1:16 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Actually, now that I've absorbed a bit more about 898e5e329,
> I don't like very much about it at all.  I think having it
> try to hang onto pointers into the relcache is a completely
> wrongheaded design decision, and the right way for it to work
> is to just copy the PartitionDescs out of the relcache so that
> they're fully owned by the PartitionDirectory.  I don't see
> a CopyPartitionDesc function anywhere (maybe it's named something
> else?) but it doesn't look like it'd be hard to build; most
> of the work is in partition_bounds_copy() which does exist already.

Yeah, we could do that.  I have to admit that I don't necessarily
understand why trying to hang onto pointers into the relcache is a bad
idea.  It is a bit complicated, but the savings in both memory and CPU
time seem worth pursuing.  There are a lot of users who wish we scaled
to a million partitions rather than a hundred, and just copying
everything all over the place all the time won't get us closer to that
goal.

More generally, I think get_relation_info() is becoming an
increasingly nasty piece of work.  It copies more and more stuff
instead of just pointing to it, which is necessary mostly because it
closes the table instead of arranging to do that at the end of query
planning.  If it did the opposite, the refcount held by the planner
would make it unnecessary for the PartitionDirectory to hold one, and
I bet we could also just point to a bunch of the other stuff in this
function rather than copying that stuff, too.  As time goes by,
relcache entries are getting more and more complex, and the optimizer
wants to use more and more data from them for planning purposes, but,
probably partly because of inertia, we're clinging to an old design
where everything has to be copied.  Every time someone gets that
wrong, and it's happened a number of times, we yell at them and tell
them to copy more stuff instead of thinking up a design where stuff
doesn't need to be copied.  I think that's a mistake.  You have
previously disagreed with that position so you probably will now, too,
but I still think it.

> Also, at least so far as the planner's usage is concerned, claiming
> that we're saving something by not copying is completely bogus,
> because if we look into set_relation_partition_info, what do we
> find but a partition_bounds_copy call.  That wouldn't be necessary
> if we could rely on the PartitionDirectory to own the data structure.
> (Maybe it's not necessary today.  But given what a house of cards
> this is, I wouldn't propose ripping it out, just moving it into
> the PartitionDirectory code.)

Ugh, I didn't notice the partition_bounds_copy() call in that
function.  Oops.  Given the foregoing griping, you won't be surprised
to hear that I'd rather just remove the copying step.  However, it
sounds like we can do that no matter whether we stick with my design
or switch to yours, because PartitionDirectory holds a relcache refcnt
then the pointer will be stable, and if we deep-copy the whole data
structure then the pointer will also be stable.  Prior to the commit
at issue, we weren't doing either of those things, so that copy was
needed until very recently.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: why doesn't DestroyPartitionDirectory hash_destroy?
Next
From: Robert Haas
Date:
Subject: Re: why doesn't DestroyPartitionDirectory hash_destroy?