Re: Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum |
Date | |
Msg-id | 4D710ABD.5090103@enterprisedb.com Whole thread Raw |
In response to | Re: [ADMIN] PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum (Fujii Masao <masao.fujii@gmail.com>) |
Responses |
Re: Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
|
List | pgsql-hackers |
On 04.03.2011 11:00, daveg wrote: > Thanks, I've applied both patches to one host. I'll probably have to back > down on the debugging logging soon, as the output is pretty voluminious, > it is producing 100MB of message log every few minutes. I'll try Merlins > patch to get the case setting the bit first though. > > Anyway, here is a snippit of log with the setting and unsetting of the > bit on one page. Unfortunately, we don't know which of a dozen odd databases > the page belongs to, but the timestamps are so close it seems likely to be > the same one. I've added dbname to the patch and will get that next time > I can switch binaries. > > 2011-03-03 23:28:34.170 PST 2039 WARNING: debugging: setting PD_ALL_VISIBLE in relation "pg_statistic" on page 5963(OldestXmin 331848998) > ... > /cv/logs/production_03-20110303_232519.log.gz:2011-03-03 23:29:34.194 PST 2115 WARNING: PD_ALL_VISIBLE flag was incorrectlyset in relation "pg_statistic" page 5963 (OldestXmin 331677178) > ... > 2011-03-03 23:42:38.323 PST 2808 WARNING: debugging: setting PD_ALL_VISIBLE in relation "pg_attribute" on page 5963(OldestXmin 331677178) Hmm, if these all came from the same database, then it looks OldestXmin has moved backwards. That would explain the warnings. First one vacuum determines that all the tuples are visible to everyone and sets the flag. Then another vacuum runs with an older OldestXmin, and determines that there's a tuple on the page with an xmin that is not yet visible to everyone, hence it thinks that the flag should not have been set yet. Looking at the code, I don't see how that situation could arise, though. The value calculated by GetOldestXmin() should never move backwards. And GetOldestXmin() is called in lazy_vacuum_rel(), after it has acquired a lock on the table, which should protect from a race condition where two vacuums could run on the table one after another, in a way where the later vacuum runs with an OldestXmin calculated before the first vacuum. Hmm, fiddling with vacuum_defer_cleanup_age on the fly could cause that, though. You don't do that, do you? > Also, I've attached the relevent page image. Thanks. There seems to be two tuples on the page, both of were HOT updated at some point, but now there's only one version of each left: postgres=# SELECT lp, lp_off, lp_flags, lp_len, t_xmin, t_xmax, t_ctid, t_infomask, t_infomask2, t_hoff FROM heap_page_items(loread(lo_open(29924, 262144), 8192)) WHERE lp_flags <> 0; lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_ctid | t_infomask | t_infomask2 | t_hoff ----+--------+----------+--------+-----------+--------+----------+------------+-------------+-------- 1 | 7608 | 1 | 580 | 331250141 | 0 | (5963,1) | 10499 | -32747 | 32 3 | 1 | 2 | 0 | | | | | | 4 | 7528 | 1 | 76 | 331735553 | 0 | (5963,4) | 10497 | -32747 | 32 19 | 4 | 2 | 0 | | | | | | (4 rows) Deciphering those infomasks, the first tuple at lp 1 has these flags set: HEAP_UPDATED | HEAP_XMAX_INVALID | HEAP_XMIN_COMMITTED | HEAP_HASNULL | HEAP_HASVARWIDTH And the 2nd one at lp 4: HEAP_UPDATED | HEAP_XMAX_INVALID | HEAP_XMIN_COMMITTED | HEAP_HASNULL So, both of those tuples are live. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
pgsql-hackers by date: