Error correction for n_dead_tuples - Mailing list pgsql-hackers

From ITAGAKI Takahiro
Subject Error correction for n_dead_tuples
Date
Msg-id 20070516110750.5A5F.ITAGAKI.TAKAHIRO@oss.ntt.co.jp
Whole thread Raw
Responses Re: Error correction for n_dead_tuples  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> I'm concerned that this one creates
> an open-loop behavior in which the n_dead_tuples estimate will diverge
> arbitrarily far from reality over time.  I criticized the original
> proposal on that basis, and I'm not convinced this version fixes it,
> because of the fact that stats counter updates occur much later than the
> actions they count.  (My recent patch to rate-limit tabstat messages made
> that problem worse, but it existed anyway.)  What might make sense is for
> vacuum to count the number of dead-but-not-removable tuples it skips over,
> and apply that as the value of n_dead_tuples on receipt of the vacuum
> message (instead of setting to zero as now).  This is likely to be wrong
> with respect to the actions of transactions running concurrently with the
> vacuum, but I think so is the proposed patch; and at least in this form
> the error certainly cannot accumulate across vacuum cycles.

In my understanding, there are two proposal to change the way of updating
n_dead_tuples by vacuum presently:
 Set n_dead_tuples to the number of    1. unvacuumable tuples the vacuum has seen   2. dead tuples reported to stats
collectorafter the beginning of vacuum at the end of vacuum. 

Both methods don't accumulate errors across vacuum cycles, because dead
tuples statistics at the beginning of vacuum is cleared in both of them.
Also, if there is no background updates, the n_dead_tuples is certainly
set to zero.

I think the 2nd is better. If we update or delete tuples in the pages
that have been already scanned by vacuum, the vaccum cannot see the
newly created dead tuples. The vacuum could report fewer number as
the unvacuumable tuples.

The following is a test with a patch of the 2nd fix. The vacuum reports
'960 dead row versions cannot be removed yet', that is used in 1st method,
but the actual dead tuples are 2000, that is used in 2nd method.
...so I'll propose the 2nd method again.

Comments welcome.


# SELECT n_live_tup, n_dead_tup FROM pg_stat_user_tables WHERE relname = 'accounts';n_live_tup | n_dead_tup
------------+------------    100000 |          0

# vacuum verbose accounts; (and 'pgbench -n -N -t2000' concurrently)
INFO:  vacuuming "public.accounts"
INFO:  index "accounts_pkey" now contains 100778 row versions in 276 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.70 sec.
INFO:  "accounts": found 0 removable, 100778 nonremovable row versions in 1679 pages
DETAIL:  960 dead row versions cannot be removed yet.
There were 1626 unused item pointers.
876 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 5.78 sec.
VACUUM

# SELECT n_live_tup, n_dead_tup FROM pg_stat_user_tables WHERE relname = 'accounts';n_live_tup | n_dead_tup
------------+------------    100000 |       2000

# SELECT tuple_count, dead_tuple_count FROM pgstattuple('accounts');tuple_count | dead_tuple_count
-------------+------------------     100000 |             2000

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



pgsql-hackers by date:

Previous
From: Robert Treat
Date:
Subject: Re: Not ready for 8.3
Next
From: "Marc G. Fournier"
Date:
Subject: Re: 8.3 pending patch queue