Re: hyrax vs. RelationBuildPartitionDesc - Mailing list pgsql-hackers

From Tom Lane
Subject Re: hyrax vs. RelationBuildPartitionDesc
Date
Msg-id 23284.1555263536@sss.pgh.pa.us
Whole thread Raw
In response to Re: hyrax vs. RelationBuildPartitionDesc  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Responses Re: hyrax vs. RelationBuildPartitionDesc
List pgsql-hackers
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:
> On 2019/04/10 15:42, Michael Paquier wrote:
>> It seems to me that Tom's argument to push in the way relcache
>> information is handled by copying its contents sounds sensible to me.
>> That's not perfect, but it is consistent with what exists (note
>> PublicationActions for a rather-still-not-much-complex example of
>> structure which gets copied from the relcache).

> I gave my vote for direct access of unchangeable relcache substructures
> (TupleDesc, PartitionDesc, etc.), because they're accessed during the
> planning of every user query and fairly expensive to copy compared to list
> of indexes or PublicationActions that you're citing.  To further my point
> a bit, I wonder why PublicationActions needs to be copied out of relcache
> at all?  Looking at its usage in execReplication.c, it seems we can do
> fine without copying, because we are holding both a lock and a relcache
> reference on the replication target relation during the entirety of
> apply_handle_insert(), which means that information can't change under us,
> neither logically nor physically.

So the point here is that that reasoning is faulty.  You *cannot* assume,
no matter how strong a lock or how many pins you hold, that a relcache
entry will not get rebuilt underneath you.  Cache flushes happen
regardless.  And unless relcache.c takes special measures to prevent it,
a rebuild will result in moving subsidiary data structures and thereby
breaking any pointers you may have pointing into those data structures.

For certain subsidiary structures such as the relation tupdesc,
we do take such special measures: that's what the "keep_xxx" dance in
RelationClearRelation is.  However, that's expensive, both in cycles
and maintenance effort: it requires having code that can decide equality
of the subsidiary data structures, which we might well have no other use
for, and which we certainly don't have strong tests for correctness of.
It's also very error-prone for callers, because there isn't any good way
to cross-check that code using a long-lived pointer to a subsidiary
structure is holding a lock that's strong enough to guarantee non-mutation
of that structure, or even that relcache.c provides any such guarantee
at all.  (If our periodic attempts to reduce lock strength for assorted
DDL operations don't scare the pants off you in this connection, you have
not thought hard enough about it.)  So I think that even though we've
largely gotten away with this approach so far, it's also a half-baked
kluge that we should be looking to get rid of, not extend to yet more
cases.

To my mind there are only two trustworthy solutions to the problem of
wanting time-extended usage of a relcache subsidiary data structure: one
is to copy it, and the other is to reference-count it.  I think that going
over to a reference-count-based approach for many of these structures
might well be something we should do in future, maybe even the very near
future.  In the mean time, though, I'm not really satisfied with inserting
half-baked kluges, especially not ones that are different from our other
half-baked kluges for similar purposes.  I think that's a path to creating
hard-to-reproduce bugs.

            regards, tom lane



pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: Should the docs have a warning about pg_stat_reset()?
Next
From: Laurenz Albe
Date:
Subject: Re: Identity columns should own only one sequence