Thread: Pruning never visible changes

Pruning never visible changes

From

Simon Riggs

Date:

16 September 2022, 15:33:23

A user asked me whether we prune never visible changes, such as

BEGIN;
INSERT...
UPDATE.. (same row)
COMMIT;

Once committed, the original insert is no longer visible to anyone, so
"ought to be able to be pruned", sayeth the user. And they also say
that changing the app is much harder, as ever.

After some thought, Yes, we can prune, but not in all cases - only if
the never visible tuple is at the root end of the update chain. The
only question is can that be done cheaply enough to bother with. The
answer in one specific case is Yes, in other cases No.

This patch adds a new test for this use case, and code to remove the
never visible row when the changes are made by the same xid.

(I'm pretty sure there used to be a test for this some years back and
I'm guessing it was removed because it isn't always possible to remove
the tuple, which this new patch honours.)

Please let me know what you think.

--
Simon Riggs                http://www.EnterpriseDB.com/

Attachment

never_visible.v1.patch

Re: Pruning never visible changes

From

Tom Lane

Date:

16 September 2022, 17:26:29

Simon Riggs <simon.riggs@enterprisedb.com> writes:
> A user asked me whether we prune never visible changes, such as
> BEGIN;
> INSERT...
> UPDATE.. (same row)
> COMMIT;

Didn't we just have this discussion in another thread?  You cannot
do that, at least not without checking that the originating
transaction has no snapshots that could see the older row version.
I'm not sure whether or not snapmgr.c has enough information to
determine that, but in any case this formulation is surely
unsafe, because it isn't even checking whether that transaction is
our own, let alone asking snapmgr.c.

I'm dubious that a safe version would fire often enough to be
worth the cycles spent.

            regards, tom lane

Re: Pruning never visible changes

From

Simon Riggs

Date:

16 September 2022, 19:56:42

On Fri, 16 Sept 2022 at 15:26, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Simon Riggs <simon.riggs@enterprisedb.com> writes:
> > A user asked me whether we prune never visible changes, such as
> > BEGIN;
> > INSERT...
> > UPDATE.. (same row)
> > COMMIT;
>
> Didn't we just have this discussion in another thread?

Not that I was aware of, but it sounds like a different case anyway.

> You cannot
> do that, at least not without checking that the originating
> transaction has no snapshots that could see the older row version.

I'm saying this is possible only AFTER the end of the originating
xact, so there are no issues with additional snapshots.

i.e. the never visible row has to be judged RECENTLY_DEAD before we even check.

-- 
Simon Riggs                http://www.EnterpriseDB.com/

Re: Pruning never visible changes

From

Tom Lane

Date:

16 September 2022, 20:37:34

Simon Riggs <simon.riggs@enterprisedb.com> writes:
> On Fri, 16 Sept 2022 at 15:26, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> You cannot
>> do that, at least not without checking that the originating
>> transaction has no snapshots that could see the older row version.

> I'm saying this is possible only AFTER the end of the originating
> xact, so there are no issues with additional snapshots.

I see, so the point is just that we can prune even if the originating
xact hasn't yet passed the global xmin horizon.  I agree that's safe,
but will it fire often enough to be worth the trouble?  Also, why
does it need to be restricted to certain shapes of HOT chains ---
that is, why can't we do exactly what we'd do if the xact *were*
past the xmin horizon?

            regards, tom lane

Re: Pruning never visible changes

From

Laurenz Albe

Date:

16 September 2022, 23:07:48

On Fri, 2022-09-16 at 10:26 -0400, Tom Lane wrote:
> Simon Riggs <simon.riggs@enterprisedb.com> writes:
> > A user asked me whether we prune never visible changes, such as
> > BEGIN;
> > INSERT...
> > UPDATE.. (same row)
> > COMMIT;
> 
> Didn't we just have this discussion in another thread?

For reference: that was
https://postgr.es/m/f6a491b32cb44bb5daaafec835364f7149348fa1.camel@cybertec.at

Yours,
Laurenz Albe

Re: Pruning never visible changes

From

Simon Riggs

Date:

17 September 2022, 09:33:07

On Fri, 16 Sept 2022 at 21:07, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>
> On Fri, 2022-09-16 at 10:26 -0400, Tom Lane wrote:
> > Simon Riggs <simon.riggs@enterprisedb.com> writes:
> > > A user asked me whether we prune never visible changes, such as
> > > BEGIN;
> > > INSERT...
> > > UPDATE.. (same row)
> > > COMMIT;
> >
> > Didn't we just have this discussion in another thread?
>
> For reference: that was
> https://postgr.es/m/f6a491b32cb44bb5daaafec835364f7149348fa1.camel@cybertec.at

Thanks. I confirm I hadn't seen that, and indeed, I wrote the patch on
5 Sept before you posted.

-- 
Simon Riggs                http://www.EnterpriseDB.com/

Re: Pruning never visible changes

From

Simon Riggs

Date:

17 September 2022, 09:46:58

On Fri, 16 Sept 2022 at 18:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Simon Riggs <simon.riggs@enterprisedb.com> writes:
> > On Fri, 16 Sept 2022 at 15:26, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> You cannot
> >> do that, at least not without checking that the originating
> >> transaction has no snapshots that could see the older row version.
>
> > I'm saying this is possible only AFTER the end of the originating
> > xact, so there are no issues with additional snapshots.
>
> I see, so the point is just that we can prune even if the originating
> xact hasn't yet passed the global xmin horizon.  I agree that's safe,
> but will it fire often enough to be worth the trouble?

It is an edge case with limited utility, I agree, which is why I
looked for a cheap test (xmin == xmax only).

This additional test is also valid, but too expensive to apply:
TransactionIdGetTopmostTranactionId(xmax) ==
TransactionIdGetTopmostTranactionId(xmin)

> Also, why
> does it need to be restricted to certain shapes of HOT chains ---
> that is, why can't we do exactly what we'd do if the xact *were*
> past the xmin horizon?

I thought it important to maintain the integrity of the HOT chain, in
that the xmax of one tuple version matches the xmin of the next. So
pruning only from the root of the chain allows us to maintain that
validity check.

I'm on the fence myself, which is why I didn't post it immediately I
had written it.

If not, it would be useful to add a note in comments to the code to
explain why we don't do this.

-- 
Simon Riggs                http://www.EnterpriseDB.com/

Re: Pruning never visible changes

From

Greg Stark

Date:

19 September 2022, 02:16:00

On Fri, 16 Sept 2022 at 10:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Simon Riggs <simon.riggs@enterprisedb.com> writes:
> > A user asked me whether we prune never visible changes, such as
> > BEGIN;
> > INSERT...
> > UPDATE.. (same row)
> > COMMIT;
>
> Didn't we just have this discussion in another thread?

Well.....  not "just" :)

commit 44e4bbf75d56e643b6afefd5cdcffccb68cce414
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Fri Apr 29 16:29:42 2011 -0400

    Remove special case for xmin == xmax in HeapTupleSatisfiesVacuum().

    VACUUM was willing to remove a committed-dead tuple immediately if it was
    deleted by the same transaction that inserted it.  The idea is that such a
    tuple could never have been visible to any other transaction, so we don't
    need to keep it around to satisfy MVCC snapshots.  However, there was
    already an exception for tuples that are part of an update chain, and this
    exception created a problem: we might remove TOAST tuples (which are never
    part of an update chain) while their parent tuple stayed around (if it was
    part of an update chain).  This didn't pose a problem for most things,
    since the parent tuple is indeed dead: no snapshot will ever consider it
    visible.  But MVCC-safe CLUSTER had a problem, since it will try to copy
    RECENTLY_DEAD tuples to the new table.  It then has to copy their TOAST
    data too, and would fail if VACUUM had already removed the toast tuples.

    Easiest fix is to get rid of the special case for xmin == xmax.  This may
    delay reclaiming dead space for a little bit in some cases, but it's by far
    the most reliable way to fix the issue.

    Per bug #5998 from Mark Reid.  Back-patch to 8.3, which is the oldest
    version with MVCC-safe CLUSTER.

Re: Pruning never visible changes

From

Matthias van de Meent

Date:

19 September 2022, 18:29:14

On Mon, 19 Sept 2022 at 01:16, Greg Stark <stark@mit.edu> wrote:
>
> On Fri, 16 Sept 2022 at 10:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Simon Riggs <simon.riggs@enterprisedb.com> writes:
> > > A user asked me whether we prune never visible changes, such as
> > > BEGIN;
> > > INSERT...
> > > UPDATE.. (same row)
> > > COMMIT;
> >
> > Didn't we just have this discussion in another thread?
>
> Well.....  not "just" :)

This recent thread [0] mentioned the same, and I mentioned it in [1]
too last year.

Kind regards,

Matthias van de Meent

[0] https://www.postgresql.org/message-id/flat/2031521.1663076724%40sss.pgh.pa.us#01542683a64a312e5c21541fecd50e63
Subject: Re: Tuples inserted and deleted by the same transaction
Date: 2022-09-13 14:13:44

[1] https://www.postgresql.org/message-id/CAEze2Whjnhg96Wt2-DxtBydhmMDmVm2WfWOX3aGcB2C2Hbry0Q%40mail.gmail.com
Subject: Re: pg14b1 stuck in lazy_scan_prune/heap_page_prune of pg_statistic
Date: 2021-06-14 09:53:47
(in a thread about a PS comment)

Re: Pruning never visible changes

From

Simon Riggs

Date:

22 September 2022, 16:10:07

On Mon, 19 Sept 2022 at 00:16, Greg Stark <stark@mit.edu> wrote:
>
> On Fri, 16 Sept 2022 at 10:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Simon Riggs <simon.riggs@enterprisedb.com> writes:
> > > A user asked me whether we prune never visible changes, such as
> > > BEGIN;
> > > INSERT...
> > > UPDATE.. (same row)
> > > COMMIT;
> >
> > Didn't we just have this discussion in another thread?
>
> Well.....  not "just" :)
>
> commit 44e4bbf75d56e643b6afefd5cdcffccb68cce414
> Author: Tom Lane <tgl@sss.pgh.pa.us>
> Date:   Fri Apr 29 16:29:42 2011 -0400
>
>     Remove special case for xmin == xmax in HeapTupleSatisfiesVacuum().
>
>     VACUUM was willing to remove a committed-dead tuple immediately if it was
>     deleted by the same transaction that inserted it.  The idea is that such a
>     tuple could never have been visible to any other transaction, so we don't
>     need to keep it around to satisfy MVCC snapshots.  However, there was
>     already an exception for tuples that are part of an update chain, and this
>     exception created a problem: we might remove TOAST tuples (which are never
>     part of an update chain) while their parent tuple stayed around (if it was
>     part of an update chain).  This didn't pose a problem for most things,
>     since the parent tuple is indeed dead: no snapshot will ever consider it
>     visible.  But MVCC-safe CLUSTER had a problem, since it will try to copy
>     RECENTLY_DEAD tuples to the new table.  It then has to copy their TOAST
>     data too, and would fail if VACUUM had already removed the toast tuples.
>
>     Easiest fix is to get rid of the special case for xmin == xmax.  This may
>     delay reclaiming dead space for a little bit in some cases, but it's by far
>     the most reliable way to fix the issue.
>
>     Per bug #5998 from Mark Reid.  Back-patch to 8.3, which is the oldest
>     version with MVCC-safe CLUSTER.

Good research Greg, thank you. Only took 10 years for me to notice it
was gone ;-)

-- 
Simon Riggs                http://www.EnterpriseDB.com/

Re: Pruning never visible changes

From

Alvaro Herrera

Date:

22 September 2022, 17:15:50

On 2022-Sep-22, Simon Riggs wrote:

> On Mon, 19 Sept 2022 at 00:16, Greg Stark <stark@mit.edu> wrote:

> >     VACUUM was willing to remove a committed-dead tuple immediately if it was
> >     deleted by the same transaction that inserted it.  The idea is that such a
> >     tuple could never have been visible to any other transaction, so we don't
> >     need to keep it around to satisfy MVCC snapshots.  However, there was
> >     already an exception for tuples that are part of an update chain, and this
> >     exception created a problem: we might remove TOAST tuples (which are never
> >     part of an update chain) while their parent tuple stayed around (if it was
> >     part of an update chain).  This didn't pose a problem for most things,
> >     since the parent tuple is indeed dead: no snapshot will ever consider it
> >     visible.  But MVCC-safe CLUSTER had a problem, since it will try to copy
> >     RECENTLY_DEAD tuples to the new table.  It then has to copy their TOAST
> >     data too, and would fail if VACUUM had already removed the toast tuples.

> Good research Greg, thank you. Only took 10 years for me to notice it
> was gone ;-)

But this begs the question: is the proposed change safe, given that
ancient consideration?  I don't think TOAST issues have been mentioned
in this thread so far, so I wonder if there is a test case that verifies
that this problem doesn't occur for some reason.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/

Re: Pruning never visible changes

From

Simon Riggs

Date:

22 September 2022, 23:04:35

On Thu, 22 Sept 2022 at 15:16, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2022-Sep-22, Simon Riggs wrote:
>
> > On Mon, 19 Sept 2022 at 00:16, Greg Stark <stark@mit.edu> wrote:
>
> > >     VACUUM was willing to remove a committed-dead tuple immediately if it was
> > >     deleted by the same transaction that inserted it.  The idea is that such a
> > >     tuple could never have been visible to any other transaction, so we don't
> > >     need to keep it around to satisfy MVCC snapshots.  However, there was
> > >     already an exception for tuples that are part of an update chain, and this
> > >     exception created a problem: we might remove TOAST tuples (which are never
> > >     part of an update chain) while their parent tuple stayed around (if it was
> > >     part of an update chain).  This didn't pose a problem for most things,
> > >     since the parent tuple is indeed dead: no snapshot will ever consider it
> > >     visible.  But MVCC-safe CLUSTER had a problem, since it will try to copy
> > >     RECENTLY_DEAD tuples to the new table.  It then has to copy their TOAST
> > >     data too, and would fail if VACUUM had already removed the toast tuples.
>
> > Good research Greg, thank you. Only took 10 years for me to notice it
> > was gone ;-)
>
> But this begs the question: is the proposed change safe, given that
> ancient consideration?  I don't think TOAST issues have been mentioned
> in this thread so far, so I wonder if there is a test case that verifies
> that this problem doesn't occur for some reason.

Oh, completely agreed.

I will submit a modified patch that adds a test case and just a
comment to explain why we can't remove such rows.

-- 
Simon Riggs                http://www.EnterpriseDB.com/