Thread: Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries.
Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries.
From
"Devrim GUNDUZ"
Date:
Hi, Does this mean we need to wrap new tarballs soon, because of the data loss bug? Regards, Devrim 14 Eylül 2012, Cuma, 4:42 pm tarihinde, Robert Haas yazmış: > Properly set relpersistence for fake relcache entries. > > This can result in buffers failing to be properly flushed at > checkpoint time, leading to data loss. > > Report, diagnosis, and patch by Jeff Davis. > > Branch > ------ > REL9_1_STABLE > > Details > ------- > http://git.postgresql.org/pg/commitdiff/fef2c17807e095a04441e4f1fe05f75d5578ead2 > > Modified Files > -------------- > src/backend/access/transam/xlogutils.c | 5 +++++ > src/backend/storage/buffer/bufmgr.c | 2 ++ > 2 files changed, 7 insertions(+), 0 deletions(-) > > > -- > Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-committers > -- Temporarily using a webmail program.
Re: Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries.
From
Tom Lane
Date:
"Devrim GUNDUZ" <devrim@gunduz.org> writes: > Does this mean we need to wrap new tarballs soon, because of the data loss > bug? I'll defer to Robert on whether this bug is serious enough to merit a near-term release on its own. But historically, we've wanted to push out a .1 update two or three weeks after a .0 release, to mop up whatever new bugs are nasty enough for people to trip over right away. Maybe we should be thinking in terms of a set of releases around the end of September. regards, tom lane
Re: Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries.
From
Simon Riggs
Date:
On 14 September 2012 17:28, Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Devrim GUNDUZ" <devrim@gunduz.org> writes: >> Does this mean we need to wrap new tarballs soon, because of the data loss >> bug? > > I'll defer to Robert on whether this bug is serious enough to merit a > near-term release on its own. But historically, we've wanted to push > out a .1 update two or three weeks after a .0 release, to mop up > whatever new bugs are nasty enough for people to trip over right away. > Maybe we should be thinking in terms of a set of releases around the > end of September. The bug itself is not major, but the extent and user impact is serious. In my opinion we should issue a new release of 9.1 immediately. 9.2 might have different treatment but does suffer the same problem. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries.
From
Robert Haas
Date:
On Sep 14, 2012, at 12:17 PM, Simon Riggs <simon@2ndQuadrant.com> wrote: > The bug itself is not major, but the extent and user impact is serious. I don't think I understand how you're using the word major there. I seem to recall some previous disputation between youand I about the use of that term, so maybe it would be good to get that cleared up. To me major and serious mean aboutthe same thing, so it can't for me be one but not the other. Definitions aside, I think it's a pretty scary issue. It basically means that if you have a recovery (crash or archive) duringwhich you read a buffer into memory, the buffer won't be checkpointed. So if, before the buffer is next evicted, youhave a crash, and if at least one checkpoint has intervened between the most recent WAL-logged operation on the bufferand the crash, you're hosed. That's not a terribly unlikely scenario. While I can't claim to understand exactly what our standards for forcing an immediate minor release are, I think this ispretty darn bad. I certainly don't want my customers running with this for a minute longer than necessary, and I feel reallybad for letting it get into a release, let alone go undetected for this long. :-( ...Robert
Re: Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries.
From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes: > Definitions aside, I think it's a pretty scary issue. It basically means that if you have a recovery (crash or archive)during which you read a buffer into memory, the buffer won't be checkpointed. So if, before the buffer is next evicted,you have a crash, and if at least one checkpoint has intervened between the most recent WAL-logged operation on thebuffer and the crash, you're hosed. That's not a terribly unlikely scenario. This is only an issue on standby slaves or when doing a PITR recovery, no? As far as I can tell from the discussion, it would *not* affect crash recovery, because we don't do restartpoints during crash recovery. > While I can't claim to understand exactly what our standards for forcing an immediate minor release are, I think this ispretty darn bad. I certainly don't want my customers running with this for a minute longer than necessary, and I feel reallybad for letting it get into a release, let alone go undetected for this long. :-( There's been some discussion about it among -core already. The earliest we could possibly do anything would be a release this coming week (that is, wrap Thursday for release Monday 9/24). However, considering that a lot of key people will be attending PG Open between now and Thursday, I'm not sure how practical that really would be. Waiting a week might be better, and it would give more time for initial bug reports against 9.2.0 to filter in. regards, tom lane
Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries.
From
Andres Freund
Date:
On Saturday, September 15, 2012 06:29:25 PM Tom Lane wrote: > Robert Haas <robertmhaas@gmail.com> writes: > > Definitions aside, I think it's a pretty scary issue. It basically means > > that if you have a recovery (crash or archive) during which you read a > > buffer into memory, the buffer won't be checkpointed. So if, before the > > buffer is next evicted, you have a crash, and if at least one checkpoint > > has intervened between the most recent WAL-logged operation on the > > buffer and the crash, you're hosed. That's not a terribly unlikely > > scenario. > > This is only an issue on standby slaves or when doing a PITR recovery, no? > As far as I can tell from the discussion, it would not affect crash > recovery, because we don't do restartpoints during crash recovery. I think unfortunately it does. At the end of recovery we perform a END_OF_RECOVERY checkpoint that seems to suffer from these issues. While CreateCheckPoint() itself treats that kind of checkpoint similarly to a shutdown checkpoint it doesn't pass that similarity to BufferSync (via CheckPointGuts->CheckPointBuffers). I hope I missed something ... Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries.
From
Robert Haas
Date:
On Sep 15, 2012, at 11:29 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> Definitions aside, I think it's a pretty scary issue. It basically means that if you have a recovery (crash or archive)during which you read a buffer into memory, the buffer won't be checkpointed. So if, before the buffer is next evicted,you have a crash, and if at least one checkpoint has intervened between the most recent WAL-logged operation on thebuffer and the crash, you're hosed. That's not a terribly unlikely scenario. > > This is only an issue on standby slaves or when doing a PITR recovery, no? > As far as I can tell from the discussion, it would *not* affect crash > recovery, because we don't do restartpoints during crash recovery. No, I think it does affect crash recovery. Whether or not restartspoints happen during recovery doesn't matter; what doesmatter is that after recovery there may be shared buffers that are erroneously not marked as permanent. Such bufferswon't be checkpointed except at shutdown time, which is wrong. ...Robert
Re: Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries.
From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes: > On Sep 15, 2012, at 11:29 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> This is only an issue on standby slaves or when doing a PITR recovery, no? >> As far as I can tell from the discussion, it would *not* affect crash >> recovery, because we don't do restartpoints during crash recovery. > No, I think it does affect crash recovery. Whether or not restartspoints happen during recovery doesn't matter; what doesmatter is that after recovery there may be shared buffers that are erroneously not marked as permanent. Such bufferswon't be checkpointed except at shutdown time, which is wrong. Right, but we do a shutdown checkpoint at the end of crash recovery. I could believe that this case gets missed, but it's not clear from what's been said --- and if that case *is* broken, there should have been a whole lot more corruption reports recently than what we've seen. regards, tom lane
Re: Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries.
From
Robert Haas
Date:
On Sep 15, 2012, at 11:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Right, but we do a shutdown checkpoint at the end of crash recovery. Yes, but that only writes the buffers that are dirty. It doesn't fix the lack of a BM_PERMANENT flag on a buffer that oughtto have had one. So that page can now get modified AGAIN, after recovery, and not checkpointed. > I could believe that this case gets missed, but it's not clear from > what's been said --- and if that case *is* broken, there should have > been a whole lot more corruption reports recently than what we've seen. It's pretty clear from Jeff's example at the top of the thread, which involves two crashes and no archive recovery. ...Robert
Re: Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries.
From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes: > On Sep 15, 2012, at 11:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Right, but we do a shutdown checkpoint at the end of crash recovery. > Yes, but that only writes the buffers that are dirty. It doesn't fix the lack of a BM_PERMANENT flag on a buffer that oughtto have had one. So that page can now get modified AGAIN, after recovery, and not checkpointed. Ugh. Yeah, we need to fix this ASAP. I've notified pgsql-packagers to expect a release this week. regards, tom lane
Re: Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries.
From
Andres Freund
Date:
On Monday, September 17, 2012 07:35:06 AM Tom Lane wrote: > Robert Haas <robertmhaas@gmail.com> writes: > > On Sep 15, 2012, at 11:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> Right, but we do a shutdown checkpoint at the end of crash recovery. (as noted somewhere else and tackled by Simon, a END_OF_RECOVERY didn't sync those before) > > Yes, but that only writes the buffers that are dirty. It doesn't fix the > > lack of a BM_PERMANENT flag on a buffer that ought to have had one. So > > that page can now get modified AGAIN, after recovery, and not > > checkpointed. > > Ugh. Yeah, we need to fix this ASAP. I've notified pgsql-packagers to > expect a release this week. Btw, I played with this some more on Saturday and I think, while definitely a bad bug, the actual consequences aren't as bad as at least I initially feared. Fake relcache entries are currently set in 3 scenarios during recovery: 1. removal of ALL_VISIBLE in heapam.c 2. incomplete splits and incomplete deletions in nbtxlog.c 3. incomplete splits in ginxlog.c As Jeff nicely showed its easy to corrupt the visibilitymap with this. Fortunately in < 9.2 that doesn't have too bad consequences and will be fixed by the next vacuum. In 9.2 that obviously can result in wrong query results but will still be fixed by a later (probably full table) vacuum. To hit 2) and 3) the server needs to have crashed (or a strange recovery_target_* set) while doing a multilevel operation. I have only cursorily looked at gin but it looks to me like in both, nbtree and gin, the window during logging the multiple steps in a split/deletion is fairly short and its not likely that we crashed exactly during that. Unless we haven't read the complete multistep operation during recovery we won't ever create fake relcache entries for those relations/buffers! And even if that edge case is hit it seems somewhat likely that the few pages that are read with the fake entry are still in the cache (the incomplete operation has to have been soon before) and thus won't get the bad relpersistence flag set. So I think while that bug had the possibility of being really bad we were pretty lucky... Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries.
From
Simon Riggs
Date:
On 17 September 2012 07:44, Andres Freund <andres@2ndquadrant.com> wrote: > So I think while that bug had the possibility of being really bad we were > pretty lucky... Yes, agreed. The impact is not as severe as I originally thought. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries.
From
Tom Lane
Date:
Andres Freund <andres@2ndquadrant.com> writes: > Btw, I played with this some more on Saturday and I think, while definitely a > bad bug, the actual consequences aren't as bad as at least I initially feared. > Fake relcache entries are currently set in 3 scenarios during recovery: > 1. removal of ALL_VISIBLE in heapam.c > 2. incomplete splits and incomplete deletions in nbtxlog.c > 3. incomplete splits in ginxlog.c > [ #1 doesn't really hurt in 9.1, and the others are low probability ] OK, that explains why we've not seen a blizzard of trouble reports. Still seems like a good idea to fix it ASAP, though. regards, tom lane
Re: Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries.
From
Tom Lane
Date:
BTW, what should our advice be for recovering from corruption due to this bug? As far as the btree and GIN problems go, we can tell people that REINDEX will fix it. And in 9.1, you don't really need to worry about the visibility map being bad. But what do you do in 9.2, if you have a bad visibility map? Is there any fix short of VACUUM FULL? regards, tom lane
Re: Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries.
From
Simon Riggs
Date:
On 19 September 2012 18:47, Tom Lane <tgl@sss.pgh.pa.us> wrote: > BTW, what should our advice be for recovering from corruption due to > this bug? As far as the btree and GIN problems go, we can tell people > that REINDEX will fix it. And in 9.1, you don't really need to worry > about the visibility map being bad. But what do you do in 9.2, if > you have a bad visibility map? Is there any fix short of VACUUM FULL? SET vacuum_freeze_table_age = 0; VACUUM; I'm writing the full notes out now. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries.
From
Andres Freund
Date:
On Monday, September 17, 2012 03:58:37 PM Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > Btw, I played with this some more on Saturday and I think, while > > definitely a bad bug, the actual consequences aren't as bad as at least > > I initially feared. > > > > Fake relcache entries are currently set in 3 scenarios during recovery: > > 1. removal of ALL_VISIBLE in heapam.c > > 2. incomplete splits and incomplete deletions in nbtxlog.c > > 3. incomplete splits in ginxlog.c > > [ #1 doesn't really hurt in 9.1, and the others are low probability ] > > OK, that explains why we've not seen a blizzard of trouble reports. > Still seems like a good idea to fix it ASAP, though. Btw, I think RhodiumToad/Andrew Gierth and I some time ago helped a user in the IRC Channel that had symptoms matching this bug. Situation was that he started to get very high IO and xid wraparound shutdown warnings due to never finishing and not canceleable autovacuums. After some investigation it turned out that btree indexes were processed at that time. We found they had cyclic btpo_next pointers leading to an endless loop in _bt_pagedel. We solved the issue by forcing leftsib = P_NONE inside the while (P_ISDELETED(opaque) || opaque->btpo_next != target) which let a queue DROP INDEX get the necessary locks. Unfortuantely this was on a busy production system with a nearing shutdown, so not much was kept for further diagnosis. After this bug was discovered I asked the user and indeed they previously shutdown the database twice in quick succession during heavy activity with -m immediate which could exactly lead to such a problem due to incompletely processed page splits. Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries.
From
Marko Tiikkaja
Date:
On 9/20/12 11:55 PM, Andres Freund wrote: > On Monday, September 17, 2012 03:58:37 PM Tom Lane wrote: >> OK, that explains why we've not seen a blizzard of trouble reports. >> Still seems like a good idea to fix it ASAP, though. > Btw, I think RhodiumToad/Andrew Gierth and I some time ago helped a user in the > IRC Channel that had symptoms matching this bug. Another such user reporting in. :-( Our slave started accumulating WAL files and ran out of disk space yesterday. After investigation from Andres and Andrew, it turns out that we were most likely hit by this very same bug. Here's what they have to say: "If the db crashes between logging the split and the parent-node insert, then in recovery, since relpersistence is not initialized correctly, when the recovery process tries to complete the operation, no xlog record is written for the insert. If there's a slave server, then the missing xlog record for the insert means that the slave's incomplete_actions queue never becomes empty, therefore the slave can no longer do recovery restartpoints." Some relevant information: [cur:92/314BC870, xid:76872047, rmid:10(Heap), len/tot_len:91/123, info:0, prev:92/314BB890] insert: s/d/r:1663/408841/415746 blk/off:13904/65 header: t_infomask2 8 t_infomask 2050 t_hoff 24 [cur:92/314BC8F0, xid:76872047, rmid:11(Btree), len/tot_len:702/734, info:64, prev:92/314BC870] split_r: s/d/r:1663/408841/475676 leftsib 2896 [cur:92/314BCBD0, xid:0, rmid:0(XLOG), len/tot_len:56/88, info:0, prev:92/314BC8F0] checkpoint: redo 146/314BCBD0; tli 1; nextxid 76872048; nextoid 764990; nextmulti 62062; nextoffset 132044; shutdown at 2012-09-11 14:26:26 CEST 2012-09-11 14:26:26.719 CEST,,,44620,,504f2df2.ae4c,5,,2012-09-11 14:26:26 CEST,,0,LOG,00000,"redo done at 92/314BC8F0",,,,,,,,"StartupXLOG, xlog.c:6641","" And apparently the relpersistence check in RelationNeedsWAL() call in _bt_insertonpg had a role in this as well. Regards, Marko Tiikkaja
Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries.
From
Andres Freund
Date:
On Friday, September 21, 2012 03:30:31 PM Marko Tiikkaja wrote: > On 9/20/12 11:55 PM, Andres Freund wrote: > > On Monday, September 17, 2012 03:58:37 PM Tom Lane wrote: > >> OK, that explains why we've not seen a blizzard of trouble reports. > >> Still seems like a good idea to fix it ASAP, though. > > > > Btw, I think RhodiumToad/Andrew Gierth and I some time ago helped a user > > in the IRC Channel that had symptoms matching this bug. > > Another such user reporting in. :-( > > Our slave started accumulating WAL files and ran out of disk space > yesterday. After investigation from Andres and Andrew, it turns out > that we were most likely hit by this very same bug. > > Here's what they have to say: > "If the db crashes between logging the split and the parent-node insert, > then in recovery, since relpersistence is not initialized correctly, > when the recovery process tries to complete the operation, no xlog > record is written for the insert. If there's a slave server, then the > missing xlog record for the insert means that the slave's > incomplete_actions queue never becomes empty, therefore the slave can no > longer do recovery restartpoints." > > Some relevant information: > > [cur:92/314BC870, xid:76872047, rmid:10(Heap), ... insert: ... > [cur:92/314BC8F0, xid:76872047, rmid:11(Btree), ... split_r: ... > [cur:92/314BCBD0, xid:0, rmid:0(XLOG), len/tot_len:56/88, info:0, > prev:92/314BC8F0] checkpoint: redo 146/314BCBD0; ... shutdown > ... "redo done at 92/314BC8F0",,,,,,,,"StartupXLOG, xlog.c:6641","" Which means that an insert into the heap, triggered a btree split. At that point the database crashed. During recovery the split was supposed to be finished by the btree cleanup code. > And apparently the relpersistence check in RelationNeedsWAL() call in > _bt_insertonpg had a role in this as well. When detecting an incomplete split the nbtree cleanup code calls _bt_insert_parent, which calls _bt_insertonpg. Which finishes the split. BUT: it doesn't log that it finished because RelationNeedsWal() says it doesn't need to. That means: * indexes on stanbys will *definitely* be corrupted * a standby won't perform any restartpoints anymore till restarted * if the primary crashes corruption is likely. Hrm. I retract my earlier statement about the low likelihood of corruption due to this. Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries.
From
Robert Haas
Date:
On Fri, Sep 21, 2012 at 10:41 AM, Andres Freund <andres@2ndquadrant.com> wrote: > Hrm. I retract my earlier statement about the low likelihood of corruption due > to this. Yeah. :-( We've recently had at least one report of autovacuum failing to terminate due to a series of index pages forming a circular loop, and at least one case where it appears that the data became not-unique on a column upon which a unique index existed, in releases that contain this bug. It seems therefore that REINDEX + VACUUM with vacuum_freeze_table_age=0 is not quite sufficient to recover from this problem. If your index has come to contain a circularity, vacuum will fail to terminate, and you'll need to drop it completely to recover. And if you were relying on your index to enforce a unique constraint and it didn't, you'll need to do manual data repair before it will be possible to rebuild or replace that index. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries.
From
Виктор Егоров
Date:
I'm afraid I'm exactly in this situation now.
--
Victor Y. Yegorov
Last entry from the 9.1.6 recommended VACUUM (FREEZE, VERBOSE, ANALYZE) was:
INFO: "meta_version_chunks": found 55363 removable, 32566245 nonremovable row versions in 450292 out of 450292 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 588315 unused item pointers.
0 pages are entirely empty.
CPU 2.44s/5.77u sec elapsed 2150.18 sec.
INFO: vacuuming "pg_toast.pg_toast_16582"
And here're are the locks held by the VACCUM backend:
select oid,relname,relkind,relpages,reltuples::numeric(15,0),reltoastrelid,reltoastidxid
from pg_class
where oid in (select relation from pg_locks where pid = 1380);
oid | relname | relkind | relpages | reltuples | reltoastrelid | reltoastidxid
-------+----------------------+---------+----------+-----------+---------------+---------------
16585 | pg_toast_16582 | t | 16460004 | 58161600 | 0 | 16587
16587 | pg_toast_16582_index | i | 188469 | 58161600 | 0 | 0
16582 | meta_version_chunks | r | 450292 | 32566200 | 16585 | 0
I will not touch anything and would like to get some recommendations on how to proceed.
2012/9/26 Robert Haas <robertmhaas@gmail.com>
On Fri, Sep 21, 2012 at 10:41 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> Hrm. I retract my earlier statement about the low likelihood of corruption due
> to this.
Yeah. :-(
We've recently had at least one report of autovacuum failing to
terminate due to a series of index pages forming a circular loop, and
at least one case where it appears that the data became not-unique on
a column upon which a unique index existed, in releases that contain
this bug.
It seems therefore that REINDEX + VACUUM with
vacuum_freeze_table_age=0 is not quite sufficient to recover from this
problem. If your index has come to contain a circularity, vacuum will
fail to terminate, and you'll need to drop it completely to recover.
And if you were relying on your index to enforce a unique constraint
and it didn't, you'll need to do manual data repair before it will be
possible to rebuild or replace that index.
Victor Y. Yegorov
Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries.
From
Виктор Егоров
Date:
Forget to mention, that: - VACUUM is running on the master; - current state is unchanged for 20 hours. 2012/9/26 Виктор Егоров <vyegorov@gmail.com>: > I'm afraid I'm exactly in this situation now. > > Last entry from the 9.1.6 recommended VACUUM (FREEZE, VERBOSE, ANALYZE) was: > INFO: "meta_version_chunks": found 55363 removable, 32566245 nonremovable > row versions in 450292 out of 450292 pages > DETAIL: 0 dead row versions cannot be removed yet. > There were 588315 unused item pointers. > 0 pages are entirely empty. > CPU 2.44s/5.77u sec elapsed 2150.18 sec. > INFO: vacuuming "pg_toast.pg_toast_16582" -- Victor Y. Yegorov
Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries.
From
Devrim GÜNDÜZ
Date:
Hi, On Tue, 2012-09-25 at 20:42 -0400, Robert Haas wrote: > It seems therefore that REINDEX + VACUUM with > vacuum_freeze_table_age=0 is not quite sufficient to recover from this > problem. If your index has come to contain a circularity, vacuum will > fail to terminate, and you'll need to drop it completely to recover. What is the difference between REINDEX and dropping and recreating index? (or say, creating the same index with another name, dropping old one, and renaming the new one with old one?) Regards, -- Devrim GÜNDÜZ Principal Systems Engineer @ EnterpriseDB: http://www.enterprisedb.com PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer Community: devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr http://www.gunduz.org Twitter: http://twitter.com/devrimgunduz
Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries.
From
Andres Freund
Date:
Hi, On Wednesday, September 26, 2012 07:57:06 AM Виктор Егоров wrote: > I'm afraid I'm exactly in this situation now. :( > Last entry from the 9.1.6 recommended VACUUM (FREEZE, VERBOSE, ANALYZE) It recommended doing a REINDEX first though? I guess you didn't do that? > was: INFO: "meta_version_chunks": found 55363 removable, 32566245 > nonremovable row versions in 450292 out of 450292 pages > DETAIL: 0 dead row versions cannot be removed yet. > There were 588315 unused item pointers. > 0 pages are entirely empty. > CPU 2.44s/5.77u sec elapsed 2150.18 sec. > INFO: vacuuming "pg_toast.pg_toast_16582" > > And here're are the locks held by the VACCUM backend: > select > oid,relname,relkind,relpages,reltuples::numeric(15,0),reltoastrelid,reltoas > tidxid from pg_class > where oid in (select relation from pg_locks where pid = 1380); > oid | relname | relkind | relpages | reltuples | > reltoastrelid | reltoastidxid > -------+----------------------+---------+----------+-----------+----------- > ----+--------------- 16585 | pg_toast_16582 | t | 16460004 | > 58161600 | > 0 | 16587 > 16587 | pg_toast_16582_index | i | 188469 | 58161600 | > 0 | 0 > 16582 | meta_version_chunks | r | 450292 | 32566200 | > 16585 | 0 > > I will not touch anything and would like to get some recommendations on how > to proceed. On Wednesday, September 26, 2012 08:12:37 AM Виктор Егоров wrote: > Forget to mention, that: > - VACUUM is running on the master; > - current state is unchanged for 20 hours. I guess you cannot cancel the vacuum? Last time it was in a cycle without checking interrupts inbetween. Can you restart the server? Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries.
From
Виктор Егоров
Date:
You're right, REINDEX was not done. I've stopped the VACUUM, did a proper server restart (pg_ctl -m fast -w restart) and will work on rebuilding relations. Seems like I have another issue with a bunch of bloated tables on my way also. Thanks for the support. 2012/9/26 Andres Freund <andres@2ndquadrant.com>: >> Last entry from the 9.1.6 recommended VACUUM (FREEZE, VERBOSE, ANALYZE) > It recommended doing a REINDEX first though? I guess you didn't do that? > > ... > > I guess you cannot cancel the vacuum? Last time it was in a cycle without > checking interrupts inbetween. > > Can you restart the server? -- Victor Y. Yegorov