Re: Inaccuracy in VACUUM's tuple count estimates - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Inaccuracy in VACUUM's tuple count estimates
Date
Msg-id 20140611125411.GV8406@alap3.anarazel.de
In response to Re: Inaccuracy in VACUUM's tuple count estimates  (Kevin Grittner <kgrittn@ymail.com>)
Responses Re: Inaccuracy in VACUUM's tuple count estimates  (Kevin Grittner <kgrittn@ymail.com>)
List pgsql-hackers
On 2014-06-09 11:24:22 -0700, Kevin Grittner wrote:
> Andres Freund <andres@2ndquadrant.com> wrote:
> > On 2014-06-09 09:45:12 -0700, Kevin Grittner wrote:
> 
> > I am not sure, given predicate.c's coding, how
> > HEAPTUPLE_DELETE_IN_PROGRESS could cause problems. Could you elaborate,
> > since that's the contentious point with Tom? Since 'both in progress'
> > can only happen if xmin and xmax are the same toplevel xid and you
> > resolve subxids to toplevel xids I think it should currently be safe
> > either way?
> 
> The only way that it could be a problem is if the DELETE is in a
> subtransaction which might get rolled back without rolling back the
> INSERT.

The way I understand the code, in that case the subxid in xmax would have
been resolved to the toplevel xid:

    /*
     * Find top level xid.  Bail out if xid is too early to be a conflict, or
     * if it's our own xid.
     */
    if (TransactionIdEquals(xid, GetTopTransactionIdIfAny()))
        return;
    xid = SubTransGetTopmostTransaction(xid);
    if (TransactionIdPrecedes(xid, TransactionXmin))
        return;
    if (TransactionIdEquals(xid, GetTopTransactionIdIfAny()))
        return;

That should essentially make that case harmless, right? So it seems the
optimization (and pessimization in other cases) of only tracking
toplevel xids saves the day here?
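
To make the argument concrete, here is a toy model of that bail-out logic.
It is a sketch and not the predicate.c code: Xid, parent_of, topmost_xid and
would_check_conflict are made-up names, the parent table stands in for
pg_subtrans, and the plain integer comparison ignores xid wraparound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Xid;            /* hypothetical stand-in for TransactionId */
#define INVALID_XID 0
#define MAX_XID 16

/* parent_of[x] is the parent of subtransaction x, or INVALID_XID if x is toplevel. */
static Xid parent_of[MAX_XID];

/* Walk up the parent chain until a toplevel xid is reached. */
static Xid
topmost_xid(Xid xid)
{
    assert(xid < MAX_XID);
    while (parent_of[xid] != INVALID_XID)
        xid = parent_of[xid];
    return xid;
}

/*
 * Returns false where the real code would bail out early, true where it
 * would go on to look for a conflict.
 */
static bool
would_check_conflict(Xid tuple_xid, Xid my_top_xid, Xid xmin_horizon)
{
    if (tuple_xid == my_top_xid)
        return false;            /* our own toplevel xid */
    tuple_xid = topmost_xid(tuple_xid);
    if (tuple_xid < xmin_horizon)
        return false;            /* too early to be a conflict (no wraparound here) */
    if (tuple_xid == my_top_xid)
        return false;            /* one of our own subxids */
    return true;
}
```

With parent_of[5] = 3, a tuple whose xmax is subxid 5 is skipped by the
transaction whose toplevel xid is 3, whether or not that subtransaction
later rolls back, while a transaction with toplevel xid 4 still checks it.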

>  If we ignore the conflict because we assume the INSERT
> will be negated by the DELETE, and that doesn't happen, we would
> get false negatives which would compromise correctness.  If we
> assume that the DELETE might not happen when the DELETE is not in a
> separate subtransaction we might get a false positive, which would
> only be a performance hit.  If we know either is possible and have
> a way to check in predicate.c, it's fine to check it there.

Given the above, I don't think this can currently happen. Am I understanding
it correctly? If so, it certainly deserves a comment...

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


