On Tue, Jan 10, 2023 at 10:50 AM Robert Haas <robertmhaas@gmail.com> wrote:
> Look, I don't want to spend time arguing about what seem to me to be
> basic principles of good software engineering. When I don't put test
> cases into my patches, people complain at me and tell me that I'm a
> bad software engineer because I didn't include test cases. Your
> argument here seems to be that you're such a good software engineer
> that you don't need any test cases to know what the bug is or that
> you've fixed it correctly.
That's not what I said. This is a straw man.
What I actually said was that there is no reason to declare up front
that the only circumstances under which a fix could be committed is
when a clean repro is available. I never said that a test case has
little or no value, and I certainly didn't assert that we definitely
don't need a test case to proceed with a commit -- since I am not in
the habit of presumptuously attaching conditions to such things well
in advance.
> I don't particularly appreciate the implication that I either lack
> relevant or expertise or don't actually take time, either.
The implication was only that you didn't take the time. Clearly you
have the expertise. Obviously you're very far from stupid.
I have been unable to reproduce the problem, and think it's possible
that the issue cannot be triggered in practice. Though only through
sheer luck. Here's why that is:
While pruning will remove aborted dead tuples, freezing will not
remove the xmax of an aborted update unless the XID happens to be <
OldestXmin. With my problem scenario, the page will be all_visible in
prunestate, but not all_frozen -- so it dodges the relevant
visibilitymap_set() call site.
That just leaves inserts that abort, I think. An aborted insert will
be totally undone by pruning, but that does still leave behind an
LP_DEAD item that needs to be vacuumed in the second heap pass. This
means that we can only set the page all-visible/all-frozen in the VM
in the second heap pass, which also dodges the relevant
visibilitymap_set() call site.
In summary, I think that there is currently no way that we can have
the VM (or the PD_ALL_VISIBLE flag) concurrently unset, while leaving
the page all_frozen. It can happen and leave the page all_visible, but
not all_frozen, due to these very fine details. (Assuming I haven't
missed another path to the problem with aborted Multis or something,
but looks like I haven't.)
--
Peter Geoghegan