Fix relid-set clobber during join removal.

Commit cfcd57111 et al fell over under Valgrind testing.
(It seems to be enough to #define USE_VALGRIND; you don't actually
need to run it under Valgrind to see failures.)  The cause is that
remove_rel_from_eclass updates each EquivalenceMember's em_relids,
and those can be aliases of the left_relids or right_relids of some
RestrictInfo in ec_sources. If the update made em_relids empty then
bms_del_member will have pfree'd the relid set, so that the subsequent
attempt to clean up ec_sources accesses already-freed memory.
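
To make the aliasing hazard concrete, here is a minimal standalone sketch.
It is not PostgreSQL code: ToySet and toy_del_member are hypothetical
stand-ins for Bitmapset and bms_del_member, and the "freed" flag replaces a
real free() so that the demo can inspect the stale pointer without invoking
undefined behavior.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a Bitmapset: one word of bits. */
typedef struct ToySet
{
    unsigned long bits;
    int           freed;    /* set instead of calling free(), so the demo
                             * can look at the stale pointer safely */
} ToySet;

static ToySet *
toy_make(unsigned long bits)
{
    ToySet *s = malloc(sizeof(ToySet));

    if (s == NULL)
        abort();
    s->bits = bits;
    s->freed = 0;
    return s;
}

/*
 * Modeled on the behavior described above: modify the set in place, and
 * if it becomes empty, release it and return NULL.
 */
static ToySet *
toy_del_member(ToySet *s, int x)
{
    if (s == NULL)
        return NULL;
    s->bits &= ~(1UL << x);
    if (s->bits == 0)
    {
        s->freed = 1;       /* a real implementation would free here */
        return NULL;
    }
    return s;
}

/* Returns 1 if the clause-side pointer is left referring to a freed set. */
int
toy_alias_is_dangling(void)
{
    ToySet *shared = toy_make(1UL << 3);    /* the set {3} */
    ToySet *left_relids = shared;           /* clause's view of the set */
    ToySet *em_relids = shared;             /* member's view: same object */
    int     dangling;

    em_relids = toy_del_member(em_relids, 3);
    dangling = (em_relids == NULL) && left_relids->freed;

    free(shared);           /* the real release, after inspection */
    return dangling;
}
```

Calling toy_alias_is_dangling() returns 1: the member's pointer was correctly
reset to NULL, but the clause's pointer still refers to the released set.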

We missed seeing ill effects before cfcd57111 because (a) if the
pfree happens then we will remove the EquivalenceMember altogether,
making the source RestrictInfo no longer of use, and (b) the
cleanup of ec_sources didn't touch left/right_relids before that.

I'm unclear though on how cfcd57111 managed to pass non-USE_VALGRIND
testing. Apparently we managed to store another Bitmapset into the
freed space before trying to access it, but you'd not think that would
happen 100% of the time. I think what USE_VALGRIND changes is that it
makes list.c much more memory-hungry, so that the freed space gets
claimed by some List node before a Bitmapset can be put there.

This failure can be seen in v16, v17, and master, but oddly enough not
v18.  That's because the SJE patch replaced the simple bms_del_member
calls used here with adjust_relid_set, which is careful not to
scribble on its input. But commit 20efbdffe just recently put back
the old coding and thus resurrected the problem.
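
For illustration, a removal routine that is "careful not to scribble on its
input" might look roughly like the sketch below.  This is a toy under stated
assumptions, not the actual adjust_relid_set: RelSet and relset_without are
hypothetical names, and the set is a single machine word.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a relid set: one word of bits. */
typedef struct RelSet
{
    unsigned long bits;
} RelSet;

static RelSet *
relset_make(unsigned long bits)
{
    RelSet *s = malloc(sizeof(RelSet));

    if (s == NULL)
        abort();
    s->bits = bits;
    return s;
}

/*
 * Non-destructive removal: the input is never modified or freed.
 * Returns the input itself if relid is absent, a fresh set without
 * relid otherwise, or NULL if the result would be empty.
 */
static RelSet *
relset_without(RelSet *in, int relid)
{
    unsigned long rest;

    if (in == NULL || (in->bits & (1UL << relid)) == 0)
        return in;
    rest = in->bits & ~(1UL << relid);
    return rest ? relset_make(rest) : NULL;
}

/* Returns 1 if the shared set survives removals done via an alias. */
int
relset_demo(void)
{
    RelSet *shared = relset_make((1UL << 3) | (1UL << 5));  /* {3, 5} */
    RelSet *left_relids = shared;       /* clause's view */
    RelSet *em_relids = shared;         /* member's view: same object */
    RelSet *tmp;
    int     ok;

    em_relids = relset_without(em_relids, 3);   /* fresh set {5} */
    tmp = em_relids;
    em_relids = relset_without(em_relids, 5);   /* NULL; tmp untouched */

    ok = (em_relids == NULL) &&
        (left_relids->bits == ((1UL << 3) | (1UL << 5)));

    /* Intermediate copies must be released by hand in this toy. */
    free(tmp);
    free(shared);
    return ok;
}
```

The cost of this style is an extra allocation per modified set, in exchange
for never invalidating a pointer that some other structure may still hold.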

Discussion: https://postgr.es/m/458729.1776724816@sss.pgh.pa.us
Backpatch-through: 16, 17, master

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/f0ac6d494b56b83cf49d328ee0c5dd20df937fce

Modified Files
--------------
src/backend/optimizer/plan/analyzejoins.c | 2 ++
1 file changed, 2 insertions(+)