Re: VACUUM/ANALYZE counting of in-doubt tuples - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: VACUUM/ANALYZE counting of in-doubt tuples
Date
Msg-id 20071119122020.GA11626@alvh.no-ip.org
Whole thread Raw
In response to VACUUM/ANALYZE counting of in-doubt tuples  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: VACUUM/ANALYZE counting of in-doubt tuples  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom Lane wrote:

> On further thought though, that's not the whole story, and in fact
> VACUUM itself isn't doing very well at accounting for in-doubt tuples.
> The current implementation is that whatever live and dead tuple totals
> are arrived at by a VACUUM or ANALYZE are sent to the stats collector
> and simply overwrite its counters on arrival.  Meanwhile, the
> transaction that is responsible for an in-doubt tuple will send a
> stats message to increment either the live-tuple or dead-tuple count
> as appropriate when it commits or aborts.  If that happens before
> the VACUUM or ANALYZE completes, the increment will get overwritten
> by VACUUM/ANALYZE's total; if afterwards, the increment will get
> added onto the total.  So ideally we'd count the state change as already
> done if we knew the other transaction would commit first, otherwise
> not.

How about this: let's have VACUUM send a message at the start of
processing the table.  pgstats saves the current counters for the table
somewhere and resets them to zero; and any transaction that sends
messages after that is counted to the new counter.

When vacuum finishes and commits, it sends another message and pgstats
forgets the counters it saved.  At this point, the count of dead tuples
will be correct.  (If during vacuum anyone retrieves the number of dead
tuples, the logical thing would be to report the saved counter).

If vacuum aborts, it sends a message saying so and pgstats restores the
saved counter, adding whatever has been accumulated on the other counter
during the vacuum.

If the system crashes there is no problem because the stats are reset
anyway.

-- 
Alvaro Herrera                 http://www.amazon.com/gp/registry/CTMLCN8V17R4
Este mail se entrega garantizadamente 100% libre de sarcasmo.


pgsql-hackers by date:

Previous
From: Zdenek Kotala
Date:
Subject: Re: Spinlock backoff algorithm
Next
From: Alvaro Herrera
Date:
Subject: Re: LDC - Load Distributed Checkpoints with PG8.3b2 on Solaris