Re: Unexpected VACUUM FULL failure - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Unexpected VACUUM FULL failure
Date
Msg-id 22865.1186810957@sss.pgh.pa.us
Whole thread Raw
In response to Re: Unexpected VACUUM FULL failure  (Gregory Stark <stark@enterprisedb.com>)
List pgsql-hackers
Gregory Stark <stark@enterprisedb.com> writes:
> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>> The reason why XLogAsyncCommitFlush() is ugly is that it doesn't
>> actually accomplish a darn thing, as we now see from this failure:
>> it does not guarantee that hint bits will get set, because of the
>> inexact bookkeeping in clog.c.  It might marginally improve the
>> probability that they get set, but that's all.  The reason I want
>> to take it out is mainly that its very existence tempts people to make
>> the same mistake that was made here.

> I don't understand your reasoning here and I would like to understand it so if
> you don't mind I would like to see if I can work out what you're talking
> about.

Well, both Simon and I thought that XLogAsyncCommitFlush would allow
VACUUM FULL to assume that hint bits were good.  We were both wrong
about that, and in hindsight that goes against the whole trend of PG
development on this topic: hint bits are hints, full stop.  I think
that having XLogAsyncCommitFlush in the API will just encourage people
to use it when they shouldn't.

> As far as I understand the Xlog flush in combination with keeping an exclusive
> lock on table and always holding locks until the end of the transaction make
> forcing the hint bits entirely safe.

I don't currently see a hole in that line of reasoning, but it's a bit
longer chain of reasoning than I like for a critical correctness
property.  Especially when we have a boatload of code that violates the
last assumption, including deeply-embedded APIs (heap_close) that make
it easy to violate the assumption.  We could maybe go out next week and
fix all the core code to not release any locks early, but what of
third-party backend add-ons?  Worse, might we not regret the change
later due to increased deadlock risks?

> By "marginally improve the probability" you're making a judgement of the
> probability that programmers will manage to maintain all those properties
> consistently, not about the probability that the race will occur at run-time?

No, I was thinking about the latter.  The current CVS tip code doesn't
have a huge window between logically committing a transaction and being
able to set hint bits for it.  In situations where you think "hmm, I
just made a big update, maybe a VACUUM FULL would be a good idea",
you'll be quite safe by the time you've managed to type VACUUM FULL.
A V.F. that is automatically issued while a lot of other stuff is
going on will be at larger risk of having the defragment step disabled,
but at the same time it's not obvious that anyone will care much.
        regards, tom lane


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Unexpected VACUUM FULL failure
Next
From: "Pavel Stehule"
Date:
Subject: Re: regexp_matches and regexp_split are inconsistent