Re: Inaccuracy in VACUUM's tuple count estimates - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Inaccuracy in VACUUM's tuple count estimates
Date
Msg-id 20140609165529.GE8406@alap3.anarazel.de
In response to Re: Inaccuracy in VACUUM's tuple count estimates  (Kevin Grittner <kgrittn@ymail.com>)
Responses Re: Inaccuracy in VACUUM's tuple count estimates  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Inaccuracy in VACUUM's tuple count estimates  (Kevin Grittner <kgrittn@ymail.com>)
List pgsql-hackers
On 2014-06-09 09:45:12 -0700, Kevin Grittner wrote:
> Andres Freund <andres@2ndquadrant.com> wrote:
> >     HEAPTUPLE_INSERT_IN_PROGRESS,    /* inserting xact is still in progress */
> >     HEAPTUPLE_DELETE_IN_PROGRESS    /* deleting xact is still in progress */
> > the current code will return INSERT_IN_PROGRESS even if the tuple has
> > *also* been deleted in another xact...
> > I think the problem here is that there's simply no way to really
> > represent that case accurately with the current API.
> 
> For purposes of predicate.c, if the "also deleted" activity might
> be rolled back without rolling back the insert, INSERT_IN_PROGRESS
> is the only correct value.  If they will either both commit or
> neither will commit, predicate.c would be more efficient if
> HEAPTUPLE_RECENTLY_DEAD was returned, but I think
> HEAPTUPLE_INSERT_IN_PROGRESS would be OK from a correctness PoV.

That's basically the argument for the new behaviour.

But I am not sure, given predicate.c's coding, how
HEAPTUPLE_DELETE_IN_PROGRESS could cause problems. Could you elaborate,
since that's the contentious point with Tom? Since 'both in progress'
can only happen if xmin and xmax are the same toplevel xid, and you
resolve subxids to toplevel xids, I think it should currently be safe
either way?

> >     HEAPTUPLE_RECENTLY_DEAD,    /* tuple is dead, but not deletable yet */
> > 1) xmin has committed, xmax has committed and wasn't only a locker. But
> > xmax doesn't precede OldestXmin.
> 
> For my purposes, it would be better if this also included:
>  2) xmin is in progress, xmax matches (or includes) xmin
> 
> ... but that would be only a performance tweak.

I don't see that happening, as there are several callers for which it is
important to know whether the xacts are still alive or not.

> >     HEAPTUPLE_DELETE_IN_PROGRESS    /* deleting xact is still in progress */
> > new:
> > 1) xmin has committed, xmax is in progress, xmax is not just a locker
> > 2) xmin is in progress, xmin is the current backend, xmax is not just a
> >   locker and in progress.
> 
> I'm not clear on how 2) could happen unless xmax is the current
> backend or a subtransaction thereof.  Could you clarify?
> 
> > old:
> > 1) xmin has committed, xmax is in progress, xmax is not just a locker
> > 2) xmin is in progress, xmax is set and not just a locker
> >
> > Note that the 2) case here never checked xmax's status.
> 
> Again, I'm not sure how 2) could happen unless they involve the
> same top-level transaction.  What am I missing?

Right, both can only happen if the tuple is created & deleted in the
same backend. Is that in contradiction to something you see?
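
To make the old/new comparison above concrete, here is a minimal sketch
of the two rule sets for HEAPTUPLE_DELETE_IN_PROGRESS as listed in this
mail. It is only a model, not the actual HeapTupleSatisfiesVacuum code:
TupleModel, XactState and both function names are made up for
illustration, and all other return values are ignored.

```c
#include <stdbool.h>

/* Hypothetical model of the facts the two rule sets look at.
 * None of these names exist in PostgreSQL. */
typedef enum { XACT_IN_PROGRESS, XACT_COMMITTED, XACT_ABORTED } XactState;

typedef struct {
    XactState xmin_state;
    bool      xmin_is_current_backend;
    bool      xmax_set;
    bool      xmax_is_only_locker;
    XactState xmax_state;            /* meaningful only if xmax_set */
} TupleModel;

/* xmax is set and is an actual deleter/updater, not just a locker */
static bool xmax_deletes(const TupleModel *t)
{
    return t->xmax_set && !t->xmax_is_only_locker;
}

/* Old rules: case 2) never looks at xmax's status. */
static bool old_delete_in_progress(const TupleModel *t)
{
    if (t->xmin_state == XACT_COMMITTED)
        return xmax_deletes(t) && t->xmax_state == XACT_IN_PROGRESS;
    if (t->xmin_state == XACT_IN_PROGRESS)
        return xmax_deletes(t);
    return false;
}

/* New rules: case 2) additionally requires that xmin belongs to the
 * current backend and that xmax is still in progress. */
static bool new_delete_in_progress(const TupleModel *t)
{
    if (t->xmin_state == XACT_COMMITTED)
        return xmax_deletes(t) && t->xmax_state == XACT_IN_PROGRESS;
    if (t->xmin_state == XACT_IN_PROGRESS)
        return t->xmin_is_current_backend
            && xmax_deletes(t)
            && t->xmax_state == XACT_IN_PROGRESS;
    return false;
}
```

In this model the two versions agree whenever xmin has committed. They
differ for a tuple whose xmin is in progress and whose xmax is set but no
longer in progress (for example an aborted deleter), or whose xmin does
not belong to the current backend: the old rules answer "delete in
progress", the new ones do not.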

Andres


