Re: [HACKERS] Re: pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold < - Mailing list pgsql-committers

From Andres Freund
Subject Re: [HACKERS] Re: pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold <
Date
Msg-id 20160616153745.upxilwj3spbi43dj@alap3.anarazel.de
Whole thread Raw
In response to Re: [HACKERS] Re: pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold <  (Kevin Grittner <kgrittn@gmail.com>)
Responses Re: [HACKERS] Re: pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold <  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-committers
On 2016-06-16 09:50:09 -0500, Kevin Grittner wrote:
> On Wed, Jun 15, 2016 at 8:57 PM, Andres Freund <andres@anarazel.de> wrote:
>
> > The simplest version of the scenario I'm concerned about is that a tuple
> > in a tuple is *not* vacuumed, even though it's elegible to be removed
> > due to STO. If that tuple has toasted columns, it could be the that the
> > toast table was independently vacuumed (autovac considers main/toast
> > tables separately,
>
> If that were really true, why would we not have the problem in
> current production versions that the toast table could be vacuumed
> before the heap, leading to exactly the issue you are talking
> about?

The issue isn't there without the feature, because we (should) never
access a tuple/detoast a column when it's invisible enough for the
corresponding toast tuple to be vacuumed away. But with
old_snapshot_timeout that's obviously (intentionally) not the case
anymore.  Due to old_snapshot_threshold we'll prune tuples which,
without it, would still be considered HEAPTUPLE_RECENTLY_DEAD.


> It seems to me that a simple test shows that it is not the
> case that the heap is vacuumed without considering toast:

That's why I mentioned autovacuum:
    /*
     * Scan pg_class to determine which tables to vacuum.
     *
     * We do this in two passes: on the first one we collect the list of plain
     * relations and materialized views, and on the second one we collect
     * TOAST tables. The reason for doing the second pass is that during it we
     * want to use the main relation's pg_class.reloptions entry if the TOAST
     * table does not have any, and we cannot obtain it unless we know
     * beforehand what's the main  table OID.
     *
     * We need to check TOAST tables separately because in cases with short,
     * wide tables there might be proportionally much more activity in the
     * TOAST table than in its parent.
     */
...
        tab->at_vacoptions = VACOPT_SKIPTOAST |
            (dovacuum ? VACOPT_VACUUM : 0) |
            (doanalyze ? VACOPT_ANALYZE : 0) |
            (!wraparound ? VACOPT_NOWAIT : 0);
(note the skiptoast)
...
    /*
     * Remember the relation's TOAST relation for later, if the caller asked
     * us to process it.  In VACUUM FULL, though, the toast table is
     * automatically rebuilt by cluster_rel so we shouldn't recurse to it.
     */
    if (!(options & VACOPT_SKIPTOAST) && !(options & VACOPT_FULL))
        toast_relid = onerel->rd_rel->reltoastrelid;
    else
        toast_relid = InvalidOid;
...
    if (toast_relid != InvalidOid)
        vacuum_rel(toast_relid, relation, options, params);


> > or the horizon could change between the two heap scans,
>
> Not a problem in current production why?

Because the horizon will never go to a value which allows "surely dead"
tuples to be read, thus we never detoast columns from a tuple for which
we'd removed toast data. That's why we're performing visibility tests
(hopefully) everywhere, before accessing tuple contents (as opposed to
inspecting the header).


> > or pins could prevent vacuuming of one page, or ...).
>
> Not a problem in current production why?

Same reason.


> So far I am not seeing any way for TOAST tuples to be pruned in
> advance of the referencing heap tuples, nor any way for the problem
> you describe to happen in the absence of that.

Didn't I just list three different ways, only one of which you doubted
the veracity of? Saying "Not a problem in current production why"
doesn't change it being a problem.


> > It seems the easiest way to fix this would be to make
> > TestForOldSnapshot() "accept" SnapshotToast as well.
>
> I don't think that would be appropriate without testing the
> characteristics of the underlying table to see whether it should be
> excluded.

You mean checking whether it's a toast table? We could check that, but
since we never use a toast scan outside of toast, it doesn't seem
necessary.


> But is the TOAST data ever updated without an update to
> the referencing heap tuple?

It shouldn't.


> If not, I don't see any benefit.

Huh?


Greetings,

Andres Freund


pgsql-committers by date:

Previous
From: Kevin Grittner
Date:
Subject: Re: [HACKERS] Re: pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold <
Next
From: Robert Haas
Date:
Subject: Re: [HACKERS] Re: pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold <