Re: stats for failed transactions (was Re: [GENERAL] VACUUM - Mailing list pgsql-hackers

From Matthew T. O'Connor
Subject Re: stats for failed transactions (was Re: [GENERAL] VACUUM
Date
Msg-id 43DAF7C8.6030304@zeut.net
Whole thread Raw
In response to stats for failed transactions (was Re: [GENERAL] VACUUM Question)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: stats for failed transactions (was Re: [GENERAL] VACUUM Question)
List pgsql-hackers
Tom Lane wrote:
>> hmm... That's true.  I don't think autovacuum doesn't anything to account
>> for the concept of rolledback inserts.
>>     
>
> I think this is the fault of the stats system design.  AFAICT from a
> quick look at the code, inserted/updated/deleted tuples are reported
> to the collector in the same way regardless of whether the sending
> transaction committed or rolled back.  I think this is unquestionably
> a bug, at least for autovacuum's purposes --- though it might be OK
> for the original intent of the stats system, which was simply to track
> activity levels.
>
> Any thoughts about how it ought to work?

The fact that autovacuum bases it's decisions on info from the stats 
system is arguably an abuse of the original design.  However I don't 
know of a better source of information at the moment.  It has always 
been my vision that autovacuum will eventually incorporate additional 
information sources to make better informed decisions.  There has always 
been discussion of using the FSM to help clue us in as to when we need 
another vacuum.  Perhaps the addition of the vacuum space map that 
people are talking about will also help. 

None of this directly addresses the question of what the stats system 
*should* track, but perhaps it is wrongheaded to totally redesign the 
stats system for the purposes of autovacuum.  As a quick semi-fix, 
perhaps autovacuum should look at the number of rollbacks vs. commits in 
an attempt to determine the accuracy of the stats.  For example if 50% 
of the transactions are getting rolled back, then autovacuum might 
include 50% of the inserts in the count towards the vacuum threshold.  
Obviously this isn't perfect, but it probably gets us closer to reality 
with the information already available.

Thoughts?

Matt



pgsql-hackers by date:

Previous
From: Greg Stark
Date:
Subject: Re: Proposal: new pg_dump options --copy-delimiter and
Next
From: Tom Lane
Date:
Subject: Re: stats for failed transactions (was Re: [GENERAL] VACUUM Question)