Re: hyrax vs. RelationBuildPartitionDesc - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: hyrax vs. RelationBuildPartitionDesc |
Date | |
Msg-id | 23284.1555263536@sss.pgh.pa.us Whole thread Raw |
In response to | Re: hyrax vs. RelationBuildPartitionDesc (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>) |
Responses |
Re: hyrax vs. RelationBuildPartitionDesc
|
List | pgsql-hackers |
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes: > On 2019/04/10 15:42, Michael Paquier wrote: >> It seems to me that Tom's argument to push in the way relcache >> information is handled by copying its contents sounds sensible to me. >> That's not perfect, but it is consistent with what exists (note >> PublicationActions for a rather-still-not-much-complex example of >> structure which gets copied from the relcache). > I gave my vote for direct access of unchangeable relcache substructures > (TupleDesc, PartitionDesc, etc.), because they're accessed during the > planning of every user query and fairly expensive to copy compared to list > of indexes or PublicationActions that you're citing. To further my point > a bit, I wonder why PublicationActions needs to be copied out of relcache > at all? Looking at its usage in execReplication.c, it seems we can do > fine without copying, because we are holding both a lock and a relcache > reference on the replication target relation during the entirety of > apply_handle_insert(), which means that information can't change under us, > neither logically nor physically. So the point here is that that reasoning is faulty. You *cannot* assume, no matter how strong a lock or how many pins you hold, that a relcache entry will not get rebuilt underneath you. Cache flushes happen regardless. And unless relcache.c takes special measures to prevent it, a rebuild will result in moving subsidiary data structures and thereby breaking any pointers you may have pointing into those data structures. For certain subsidiary structures such as the relation tupdesc, we do take such special measures: that's what the "keep_xxx" dance in RelationClearRelation is. However, that's expensive, both in cycles and maintenance effort: it requires having code that can decide equality of the subsidiary data structures, which we might well have no other use for, and which we certainly don't have strong tests for correctness of. It's also very error-prone for callers, because there isn't any good way to cross-check that code using a long-lived pointer to a subsidiary structure is holding a lock that's strong enough to guarantee non-mutation of that structure, or even that relcache.c provides any such guarantee at all. (If our periodic attempts to reduce lock strength for assorted DDL operations don't scare the pants off you in this connection, you have not thought hard enough about it.) So I think that even though we've largely gotten away with this approach so far, it's also a half-baked kluge that we should be looking to get rid of, not extend to yet more cases. To my mind there are only two trustworthy solutions to the problem of wanting time-extended usage of a relcache subsidiary data structure: one is to copy it, and the other is to reference-count it. I think that going over to a reference-count-based approach for many of these structures might well be something we should do in future, maybe even the very near future. In the mean time, though, I'm not really satisfied with inserting half-baked kluges, especially not ones that are different from our other half-baked kluges for similar purposes. I think that's a path to creating hard-to-reproduce bugs. regards, tom lane
pgsql-hackers by date: