On Thu, Dec 26, 2019 at 12:46:39PM +0900, Kyotaro Horiguchi wrote:
> At Wed, 25 Dec 2019 16:15:21 -0800, Noah Misch <noah@leadboat.com> wrote in
> > Skip AssertPendingSyncs_RelationCache() at abort, like v24nm did. Making
> > that work no matter what does ereport(ERROR) would be tricky and low-value.
>
> Right about ereport, but I'm not sure about removing the whole assertion
> from abort.

You may think of a useful assert location that lacks the problems of asserting
at abort. For example, I considered asserting in PortalRunMulti() and
PortalRun(), just after each command, if still in a transaction.

> > - Reverted most post-v24nm changes to swap_relation_files(). Under
> > "-DRELCACHE_FORCE_RELEASE", relcache.c quickly discards the
> > rel1->rd_node.relNode update. Clearing rel2->rd_createSubid is not right if
> > we're running CLUSTER for the second time in one transaction. I used
>
> I don't agree with that. As I think I have mentioned upthread, rel2 is
> wrongly marked as "new in this transaction" at that time, which hinders
> the opportunity of removal, and such entries wrongly persist for the
> life of the backend and cause problems. (That was found by abort-time
> AssertPendingSyncs_RelationCache().)

I can't reproduce rel2's relcache entry wrongly persisting for the life of a
backend. If that were happening, I would expect repeating a CLUSTER command N
times to increase hash_get_num_entries(RelationIdCache) by at least N. I
tried that, but hash_get_num_entries(RelationIdCache) did not increase. In a
non-assert build, how can I reproduce problems caused by incorrect
rd_createSubid on rel2?
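
To make the reasoning behind that test concrete, here is a toy model of the
check, outside PostgreSQL. It is only a sketch: every name in it (ToyCache,
toy_cluster(), the leak_transient flag, and so on) is hypothetical and none of
it comes from relcache.c. It shows why I would expect the entry count to grow
by at least N after N repetitions if the transient entry really persisted, and
to stay flat otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_CACHE_MAX 64

/* Hypothetical stand-in for a relcache entry; not the real RelationData. */
typedef struct ToyEntry
{
    unsigned    relid;
    bool        in_use;
    bool        created_in_xact;    /* loosely models rd_createSubid being set */
} ToyEntry;

typedef struct ToyCache
{
    ToyEntry    slots[TOY_CACHE_MAX];
} ToyCache;

/* Count live entries; plays the role of hash_get_num_entries() in the test. */
static size_t
toy_num_entries(const ToyCache *cache)
{
    size_t      n = 0;

    for (size_t i = 0; i < TOY_CACHE_MAX; i++)
        if (cache->slots[i].in_use)
            n++;
    return n;
}

/* Insert an entry into the first free slot; returns NULL if the cache is full. */
static ToyEntry *
toy_insert(ToyCache *cache, unsigned relid, bool created_in_xact)
{
    for (size_t i = 0; i < TOY_CACHE_MAX; i++)
    {
        ToyEntry   *e = &cache->slots[i];

        if (!e->in_use)
        {
            e->relid = relid;
            e->in_use = true;
            e->created_in_xact = created_in_xact;
            return e;
        }
    }
    return NULL;
}

/*
 * Model one CLUSTER: a transient relation gets an entry, then is dropped.
 * leak_transient simulates the suspected bug, where the entry is never
 * removed because it stays marked as created in the current transaction.
 */
static void
toy_cluster(ToyCache *cache, unsigned transient_relid, bool leak_transient)
{
    ToyEntry   *e = toy_insert(cache, transient_relid, true);

    assert(e != NULL);
    if (e != NULL && !leak_transient)
        e->in_use = false;
}

/* Repeat the toy CLUSTER n times and report how much the entry count grew. */
static size_t
toy_growth_after(ToyCache *cache, unsigned n, bool leak_transient)
{
    size_t      before = toy_num_entries(cache);

    for (unsigned i = 0; i < n; i++)
        toy_cluster(cache, 1000u + i, leak_transient);
    return toy_num_entries(cache) - before;
}
```

Calling toy_growth_after() with leak_transient = false models what I observed
(no growth); with leak_transient = true it models what I would expect to see
if the entries persisted.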