Re: Toast issues with OldestXmin going backwards - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Toast issues with OldestXmin going backwards
Msg-id CAA4eK1LW+a60=CPJHtWYBfa5ifX3FFiQz=AOQy7zn0Z-ytqmaQ@mail.gmail.com
In response to Re: Toast issues with OldestXmin going backwards  (Andrew Gierth <andrew@tao11.riddles.org.uk>)
List pgsql-hackers
On Mon, Apr 23, 2018 at 1:34 PM, Andrew Gierth
<andrew@tao11.riddles.org.uk> wrote:
>>>>>> "Amit" == Amit Kapila <amit.kapila16@gmail.com> writes:
>
>  >> Your patch would actually be needed if (and only if) autovacuum was
>  >> changed back to its old behavior of never vacuuming toast tables
>  >> independently, and if manual VACUUM pg_toast.*; was disabled. But in
>  >> the presence of either of those two possibilities, it does nothing
>  >> useful.
>
>  Amit> Yeah, right, I have missed the point that they can be vacuumed
>  Amit> separately, however, I think that decision is somewhat
>  Amit> questionable.
>
> Some previous discussion links for reference, for the background to the
> thread containing the patch:
>
> https://www.postgresql.org/message-id/flat/87y7gpiqx3.fsf%40oxford.xeocode.com
> https://www.postgresql.org/message-id/flat/20080608230348.GD11028%40alvh.no-ip.org
>

If I read correctly, one of the main reasons [1] was to save the extra
pass over the heap and improve the code.  Now, ideally, the extra pass
over the heap should also free up some space (occupied by the rows
containing old toast pointers whose toast-table rows we are about to
remove), but it is quite possible that the heap is already clean due
to a previous separate vacuum pass over it.  I think it is good to
save an extra pass over the heap that might not be as useful as we
expect, but doing so can cost us correctness in boundary cases such as
the one being discussed in this thread.

>  Amit> I think it would have been better if along with decoupling of
>  Amit> vacuum for main heap and toast tables, we would have come up with
>  Amit> a way to selectively remove the corresponding rows from the main
>  Amit> heap, say by just vacuuming heap pages/rows which have toast
>  Amit> pointers but maybe that is not viable or involves much more work
>  Amit> without equivalent benefit.
>
> It should be fairly obvious why this is unworkable - most toast-using
> tables will have toast pointers on every page, but without making a
> whole new index of toast pointer OIDs (unacceptable overhead), it would
> be impossible to find the toast pointers pointing to a specific item
> without searching the whole rel (in which case we might just as well
> have vacuumed it).
>

Okay, such an optimization might not be very effective, and it would
anyway lead to an extra pass over the heap; however, it would resolve
the correctness issue.  Now, I understand that it is not advisable to
go back to the previous behavior because of performance concerns, but
I think it would be better to find a bullet-proof way to fix this
problem, rather than just fixing the issue reported in this thread.
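
To make the cost argument quoted above concrete, here is a toy Python
sketch.  It is not PostgreSQL code: HeapTuple, toast_oid and
find_toast_references are made-up names, and the "heap" is just a list
of pages.  It only illustrates that, without an index on toast pointer
OIDs, locating the heap rows that reference a given set of toast values
means reading every heap page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HeapTuple:
    """Hypothetical stand-in for a heap row."""
    dead: bool
    toast_oid: Optional[int] = None  # None: no out-of-line (toasted) value


def find_toast_references(heap, target_oids):
    """Return (matches, pages_read) when no index on toast OIDs exists."""
    matches = []
    pages_read = 0
    for page_no, page in enumerate(heap):
        pages_read += 1  # nothing tells us this page can be skipped
        for slot, tup in enumerate(page):
            if tup.toast_oid in target_oids:
                matches.append((page_no, slot))
    return matches, pages_read


# Three pages; toast pointers are spread over all of them.
heap = [
    [HeapTuple(False, 100), HeapTuple(True, 101)],
    [HeapTuple(False, 102), HeapTuple(False)],
    [HeapTuple(True, 103), HeapTuple(False, 104)],
]

matches, pages_read = find_toast_references(heap, {101, 103})
```

Only two rows match, yet pages_read equals the number of pages in the
toy heap, which is the same I/O a plain vacuum of that heap would incur.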

[1] - From one of the emails: "We go certain lengths in autovacuum to
make sure tables are vacuumed when their toast table needs vacuuming
and the main table does not, which is all quite kludgy. So we already
look at their stats and make decisions about them. But what we do
after that is force a vacuum to the main table, even if that one does
not need any vacuuming, which is dumb."

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com