Thread: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From
petr@petrovich.kiev.ua
Date:
The following bug has been logged on the website:

Bug reference:      8673
Logged by:          Serge Negodyuck
Email address:      petr@petrovich.kiev.ua
PostgreSQL version: 9.3.2
Operating system:   Linux 3.2.0-57-generic amd64, Ubuntu 12.04.3

Description:

I'm using postgresql 9.3.0 with WAL streaming replication, i.e.:

postgresql.conf:
hot_standby = on

recovery.conf:
standby_mode = 'on'
primary_conninfo = 'host=master port=5432 user=postgres password=password'

I've installed a new slave database on the 6th of December. Since there were no packages on apt.postgresql.org with postgresql 9.3.0, I've set up postgresql 9.3.2.

I see a lot of messages in the slave server log like:

2013-12-09 10:10:24 EET 172.18.10.45 main ERROR: could not access status of transaction 24568845
2013-12-09 10:10:24 EET 172.18.10.45 main DETAIL: Could not open file "pg_multixact/members/CD8F": No such file or directory.
--
2013-12-09 10:11:59 EET 172.18.10.30 main ERROR: could not access status of transaction 31724554
2013-12-09 10:11:59 EET 172.18.10.30 main DETAIL: Could not open file "pg_multixact/members/F615": No such file or directory.
--
2013-12-09 10:12:20 EET 172.18.10.7 main ERROR: could not access status of transaction 25496296
2013-12-09 10:12:20 EET 172.18.10.7 main DETAIL: Could not open file "pg_multixact/members/D31E": No such file or directory.

My next step was to upgrade to postgresql 9.3.2 on the master and to do the initial sync from scratch.
It does not help. I still have the same error.

I think it may be tied to this commit:
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=215ac4ad6589e0f6a31cc4cd867aedba3cd42924

Next question: why were binary packages for 9.3.0/9.3.1 removed from apt.postgresql.org, so that I cannot downgrade?
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From
Andres Freund
Date:
Hi,

On 2013-12-09 13:47:34 +0000, petr@petrovich.kiev.ua wrote:
> PostgreSQL version: 9.3.2

> I've installed a new slave database on the 6th of December. Since there were no
> packages on apt.postgresql.org with postgresql 9.3.0, I've set up postgresql
> 9.3.2.

> 2013-12-09 10:10:24 EET 172.18.10.45 main ERROR: could not access status of
> transaction 24568845
> 2013-12-09 10:10:24 EET 172.18.10.45 main DETAIL: Could not open file
> "pg_multixact/members/CD8F": No such file or directory.

> My next step was to upgrade to postgresql 9.3.2 on the master and to do the initial
> sync from scratch.
> It does not help. I still have the same error.

Could you post, as close as possible to the next occurrence of that error:
* pg_controldata output from the primary
* pg_controldata output from the standby
* SELECT datfrozenxid, datminmxid FROM pg_database;

Do you frequently VACUUM FREEZE on the primary? Have you modified any of the vacuum_freeze_* parameters?

> I think it may be tied to this commit:
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=215ac4ad6589e0f6a31cc4cd867aedba3cd42924

Only incidentally, I think - we didn't properly maintain it during recovery before at all.

Greetings,

Andres Freund

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From
Andres Freund
Date:
On 2013-12-09 16:55:21 +0200, Serge Negodyuck wrote:
> 2013/12/9 Andres Freund <andres@2ndquadrant.com>:
> > On 2013-12-09 13:47:34 +0000, petr@petrovich.kiev.ua wrote:
> >> PostgreSQL version: 9.3.2
> >
> >> I've installed a new slave database on the 6th of December. Since there were no
> >> packages on apt.postgresql.org with postgresql 9.3.0, I've set up postgresql
> >> 9.3.2.
> >
> >> 2013-12-09 10:10:24 EET 172.18.10.45 main ERROR: could not access status of
> >> transaction 24568845
> >> 2013-12-09 10:10:24 EET 172.18.10.45 main DETAIL: Could not open file
> >> "pg_multixact/members/CD8F": No such file or directory.
> >
> >> My next step was to upgrade to postgresql 9.3.2 on the master and to do the initial
> >> sync from scratch.
> >> It does not help. I still have the same error.
> >
> > Could you post, as close as possible to the next occurrence of that
> > error:
> > * pg_controldata output from the primary
> > * pg_controldata output from the standby
>
> Sorry, I've just downgraded the whole cluster to 9.3.0, and this error
> disappeared.
> I can provide the output right now, if it makes any sense.

Yes, that'd be helpful. Could you also provide ls -l pg_multixact/ on both primary and standby?

Greetings,

Andres Freund

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From
Serge Negodyuck
Date:
2013/12/9 Andres Freund <andres@2ndquadrant.com>:
> Hi,
>
> On 2013-12-09 13:47:34 +0000, petr@petrovich.kiev.ua wrote:
>> PostgreSQL version: 9.3.2
>
>> I've installed a new slave database on the 6th of December. Since there were no
>> packages on apt.postgresql.org with postgresql 9.3.0, I've set up postgresql
>> 9.3.2.
>
>> 2013-12-09 10:10:24 EET 172.18.10.45 main ERROR: could not access status of
>> transaction 24568845
>> 2013-12-09 10:10:24 EET 172.18.10.45 main DETAIL: Could not open file
>> "pg_multixact/members/CD8F": No such file or directory.
>
>> My next step was to upgrade to postgresql 9.3.2 on the master and to do the initial
>> sync from scratch.
>> It does not help. I still have the same error.
>
> Could you post, as close as possible to the next occurrence of that
> error:
> * pg_controldata output from the primary
> * pg_controldata output from the standby

Sorry, I've just downgraded the whole cluster to 9.3.0, and this error disappeared.
I can provide the output right now, if it makes any sense.
But the same query with the same id *always* gave the same error, regardless of replication state.

One more thing: I had a look at the previous day's logs and found the following:

2013-12-08 21:17:15 EET LOG: could not truncate directory "pg_multixact/members": apparent wraparound
2013-12-08 21:22:20 EET LOG: could not truncate directory "pg_multixact/members": apparent wraparound
2013-12-08 21:27:26 EET LOG: could not truncate directory "pg_multixact/members": apparent wraparound

At some point in time there was no "apparent wraparound" message anymore, but just "Could not open file".

> * SELECT datfrozenxid, datminmxid FROM pg_database;

 datfrozenxid | datminmxid
--------------+------------
          710 |          1
          710 |          1
          710 |          1
          710 |          1

> Do you frequently VACUUM FREEZE on the primary? Have you modified any of
> the vacuum_freeze_* parameters?

I never run VACUUM FREEZE manually.
The only changed vacuum_* parameter is vacuum_cost_delay = 10.

>> I think it may be tied to this commit:
>> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=215ac4ad6589e0f6a31cc4cd867aedba3cd42924
>
> Only incidentally, I think - we didn't properly maintain it during
> recovery before at all.
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From
Andres Freund
Date:
On 2013-12-09 17:49:34 +0200, Serge Negodyuck wrote:
> Latest checkpoint's NextXID: 0/90546484
> Latest checkpoint's NextOID: 6079185
> Latest checkpoint's NextMultiXactId: 42049949
> Latest checkpoint's NextMultiOffset: 55384024
> Latest checkpoint's oldestXID: 710
> Latest checkpoint's oldestXID's DB: 1
> Latest checkpoint's oldestActiveXID: 90546475
> Latest checkpoint's oldestMultiXid: 1
> Latest checkpoint's oldestMulti's DB: 1

So, the oldest multi is 1, thus the truncation code wouldn't have been able to remove any. So I think this really is an independent problem from the truncation patch. But a problem nonetheless.

> > Could you also provide ls -l pg_multixact/ on both primary and standby?
>
> Did you mean pg_multixact/members/ ?

From both members and offsets. It'd be great if you could attach a ls -lR of pg_multixact/.

> It's not possible on the slave right now, since I had to re-sync these
> files from the master. Maybe that was not a good idea, but it helped.
>
> On master there are files from 0000 to 14078.
>
> On slave there were absent files from A1xx to FFFF.
> They were the oldest ones. (October, November)

Hm. Were there files from before A1xx? I am not sure how to parse your answer.

Greetings,

Andres Freund

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From
Serge Negodyuck
Date:
2013/12/9 Andres Freund <andres@2ndquadrant.com>:
> On 2013-12-09 16:55:21 +0200, Serge Negodyuck wrote:
>> 2013/12/9 Andres Freund <andres@2ndquadrant.com>:
>> > On 2013-12-09 13:47:34 +0000, petr@petrovich.kiev.ua wrote:
>> >> PostgreSQL version: 9.3.2
>> >
>> >> I've installed a new slave database on the 6th of December. Since there were no
>> >> packages on apt.postgresql.org with postgresql 9.3.0, I've set up postgresql
>> >> 9.3.2.
>> >
>> >> 2013-12-09 10:10:24 EET 172.18.10.45 main ERROR: could not access status of
>> >> transaction 24568845
>> >> 2013-12-09 10:10:24 EET 172.18.10.45 main DETAIL: Could not open file
>> >> "pg_multixact/members/CD8F": No such file or directory.
>> >
>> >> My next step was to upgrade to postgresql 9.3.2 on the master and to do the initial
>> >> sync from scratch.
>> >> It does not help. I still have the same error.
>> >
>> > Could you post, as close as possible to the next occurrence of that
>> > error:
>> > * pg_controldata output from the primary
>> > * pg_controldata output from the standby
>>
>> Sorry, I've just downgraded the whole cluster to 9.3.0, and this error
>> disappeared.
>> I can provide the output right now, if it makes any sense.
>
> Yes, that'd be helpful.

master:

pg_control version number: 937
Catalog version number: 201306121
Database system identifier: 5928279687159054327
Database cluster state: in production
pg_control last modified: Mon 09 Dec 2013 05:29:53 PM EET
Latest checkpoint location: 3D4/76E97DA0
Prior checkpoint location: 3D4/6E768638
Latest checkpoint's REDO location: 3D4/76925C18
Latest checkpoint's REDO WAL file: 00000001000003D400000076
Latest checkpoint's TimeLineID: 1
Latest checkpoint's PrevTimeLineID: 1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID: 0/90546484
Latest checkpoint's NextOID: 6079185
Latest checkpoint's NextMultiXactId: 42049949
Latest checkpoint's NextMultiOffset: 55384024
Latest checkpoint's oldestXID: 710
Latest checkpoint's oldestXID's DB: 1
Latest checkpoint's oldestActiveXID: 90546475
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 1
Time of latest checkpoint: Mon 09 Dec 2013 05:29:44 PM EET
Fake LSN counter for unlogged rels: 0/1
Minimum recovery ending location: 0/0
Min recovery ending loc's timeline: 0
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
Current wal_level setting: hot_standby
Current max_connections setting: 1000
Current max_prepared_xacts setting: 0
Current max_locks_per_xact setting: 64
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Date/time type storage: 64-bit integers
Float4 argument passing: by value
Float8 argument passing: by value
Data page checksum version: 0

slave:

pg_control version number: 937
Catalog version number: 201306121
Database system identifier: 5928279687159054327
Database cluster state: in archive recovery
pg_control last modified: Mon 09 Dec 2013 05:25:22 PM EET
Latest checkpoint location: 3D4/6E768638
Prior checkpoint location: 3D4/66F14C60
Latest checkpoint's REDO location: 3D4/6E39F9E8
Latest checkpoint's REDO WAL file: 00000001000003D40000006E
Latest checkpoint's TimeLineID: 1
Latest checkpoint's PrevTimeLineID: 1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID: 0/90546484
Latest checkpoint's NextOID: 6079185
Latest checkpoint's NextMultiXactId: 42046170
Latest checkpoint's NextMultiOffset: 55058098
Latest checkpoint's oldestXID: 710
Latest checkpoint's oldestXID's DB: 1
Latest checkpoint's oldestActiveXID: 90541410
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 1
Time of latest checkpoint: Mon 09 Dec 2013 05:24:44 PM EET
Fake LSN counter for unlogged rels: 0/1
Minimum recovery ending location: 3D4/7884BB68
Min recovery ending loc's timeline: 1
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
Current wal_level setting: hot_standby
Current max_connections setting: 1000
Current max_prepared_xacts setting: 0
Current max_locks_per_xact setting: 64
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Date/time type storage: 64-bit integers
Float4 argument passing: by value
Float8 argument passing: by value
Data page checksum version: 0

> > Could you also provide ls -l pg_multixact/ on both primary and standby?

Did you mean pg_multixact/members/ ?

It's not possible on the slave right now, since I had to re-sync these files from the master. Maybe that was not a good idea, but it helped.

On master there are files from 0000 to 14078.

On slave there were absent files from A1xx to FFFF.
They were the oldest ones. (October, November)
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From
Andres Freund
Date:
Hi,

On 2013-12-09 17:49:34 +0200, Serge Negodyuck wrote:
> On master there are files from 0000 to 14078.
>
> On slave there were absent files from A1xx to FFFF.
> They were the oldest ones. (October, November)

Some analysis later, I am pretty sure that the origin is a longstanding problem and not connected to 9.3.[01] vs 9.3.2.

The above referenced 14078 file is exactly the last segment before a members wraparound:

(gdb) p/x (1L<<32)/(MULTIXACT_MEMBERS_PER_PAGE * SLRU_PAGES_PER_SEGMENT)
$10 = 0x14078

So, what happened is that enough multixacts were created that the members SLRU wrapped around. It's not unreasonable for the members SLRU to wrap around faster than the offsets one - after all, we create at least two entries in members for every offsets entry. Also, in 9.3+ more xids fit on an offsets page than on a members page.

When truncating, we first read the offset, to know where we currently are in members, and then truncate both from their respective point. Since we've wrapped around in members we very well might remove content we actually need.

I've recently remarked that I find it dangerous that we only do anti-wraparound stuff for pg_multixact/offsets, not for /members. So, here we have the proof that that's bad.

This is an issue in <9.3 as well. It might, in some sense, even be worse there, because we never vacuum old multis away. But on the other hand, the growth of multis is slower there and we look into old multis less frequently.

The only reason that you saw the issue on the standby first is that the truncation code is called more frequently there. Afaics it will happen, sometime in the future, on the master as well.

I think problems should be preventable if you issue a systemwide VACUUM FREEZE, but please let others chime in before you execute it.

Greetings,

Andres Freund

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
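[For reference, the 0x14078 figure can be reproduced outside gdb with a small standalone C sketch. The constants below are stated here as assumptions about the 9.3 layout with the default BLCKSZ of 8192 (four-byte member xids plus one flag byte each, packed in groups of four, and 32 SLRU pages per segment); they are not taken from the actual headers.]

    #include <stdio.h>
    #include <stdint.h>

    int
    main(void)
    {
        /* Assumed 9.3 layout, default BLCKSZ = 8192 */
        const uint64_t blcksz = 8192;
        const uint64_t group_size = 4 * 4 + 4;  /* 4 xids of 4 bytes + 4 flag bytes */
        const uint64_t members_per_page = (blcksz / group_size) * 4;   /* 1636 */
        const uint64_t pages_per_segment = 32;

        /* Segment holding the very last member offset before 2^32 wraps */
        uint64_t last_segment = (1ULL << 32) / (members_per_page * pages_per_segment);

        printf("members per page: %llu\n", (unsigned long long) members_per_page);
        printf("last segment before wraparound: %llX\n", (unsigned long long) last_segment);
        /* prints 1636 and 14078, matching the gdb output above */
        return 0;
    }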
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From
Alvaro Herrera
Date:
Andres Freund wrote:
> Hi,
>
> On 2013-12-09 17:49:34 +0200, Serge Negodyuck wrote:
> > On master there are files from 0000 to 14078.
> >
> > On slave there were absent files from A1xx to FFFF.
> > They were the oldest ones. (October, November)
>
> Some analysis later, I am pretty sure that the origin is a longstanding
> problem and not connected to 9.3.[01] vs 9.3.2.
>
> The above referenced 14078 file is exactly the last segment before a
> members wraparound:
> (gdb) p/x (1L<<32)/(MULTIXACT_MEMBERS_PER_PAGE * SLRU_PAGES_PER_SEGMENT)
> $10 = 0x14078

As a note, the SlruScanDirectory code has a flaw because it only looks at four-digit files; the reason only files up to 0xFFFF are missing and not the following ones is that those got ignored. This needs a fix as well.

> So, what happened is that enough multixacts were created that the
> members SLRU wrapped around. It's not unreasonable for the members SLRU
> to wrap around faster than the offsets one - after all, we create at
> least two entries in members for every offsets entry. Also, in 9.3+
> more xids fit on an offsets page than on a members page.
> When truncating, we first read the offset, to know where we currently
> are in members, and then truncate both from their respective
> point. Since we've wrapped around in members we very well might remove
> content we actually need.

Yeah, on 9.3 each member Xid occupies five bytes in pg_multixact/members, whereas each offset only occupies four bytes in pg_multixact/offsets. It's rare that a multixact only contains one member; typically they will have at least two (so for each multixact we would have 4 bytes in offsets and a minimum of 10 bytes in members). So wrapping around is easy for members, even with the protections we have in place for offsets.

> I've recently remarked that I find it dangerous that we only do
> anti-wraparound stuff for pg_multixact/offsets, not for /members. So,
> here we have the proof that that's bad.

It's hard to see how to add this post facto, though. I mean, I am thinking we would need some additional pg_control info etc. We'd better figure out a way to add such controls without having to add that.

> This is an issue in <9.3 as well. It might, in some sense, even be worse
> there, because we never vacuum old multis away. But on the other hand,
> the growth of multis is slower there and we look into old multis less
> frequently.
>
> The only reason that you saw the issue on the standby first is that the
> truncation code is called more frequently there. Afaics it will happen,
> sometime in the future, on the master as well.
>
> I think problems should be preventable if you issue a systemwide VACUUM
> FREEZE, but please let others chime in before you execute it.

I wouldn't freeze anything just yet, at least until the patch to fix multixact freezing is in.

--
Álvaro Herrera                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
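[A rough number can be put on that growth ratio with a back-of-the-envelope C sketch. The 4- and 5-byte figures come from the paragraph above; the two-members-per-multixact average is only an assumption.]

    #include <stdio.h>

    int
    main(void)
    {
        /* As described above: 4 bytes of offsets space per multixact,
         * 4-byte xid + 1 flag byte per member in the members space. */
        const double offset_bytes_per_multi = 4.0;
        const double member_bytes = 5.0;
        const double avg_members = 2.0;     /* assumed average; often higher */

        double members_bytes_per_multi = member_bytes * avg_members;

        /* How much faster the members area grows relative to offsets */
        printf("members/offsets growth ratio: %.1fx\n",
               members_bytes_per_multi / offset_bytes_per_multi);  /* 2.5x */

        /* A single very wide multixact consumes proportionally more,
         * e.g. one with 8684 members uses about 43 kB of members space. */
        printf("bytes for an 8684-member multixact: %.0f\n", member_bytes * 8684);
        return 0;
    }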
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From
Andres Freund
Date:
On 2013-12-09 16:00:32 -0300, Alvaro Herrera wrote:
> As a note, the SlruScanDirectory code has a flaw because it only looks
> at four-digit files; the reason only files up to 0xFFFF are missing and
> not the following ones is that those got ignored. This needs a fix
> as well.

While I agree it's a bug, I don't think it's relevant for the case at hand. For offsets there's no following page (or exactly 1, not sure about the math offhand), and we only use SlruScanDirectory() for offsets, not for members.

> > I've recently remarked that I find it dangerous that we only do
> > anti-wraparound stuff for pg_multixact/offsets, not for /members. So,
> > here we have the proof that that's bad.
>
> It's hard to see how to add this post facto, though. I mean, I am
> thinking we would need some additional pg_control info etc. We'd better
> figure out a way to add such controls without having to add that.

Couldn't we just get the oldest multi, check where in offsets it points to, and compare that with nextOffset? That should be doable without additional data.

> > I think problems should be preventable if you issue a systemwide VACUUM
> > FREEZE, but please let others chime in before you execute it.
>
> I wouldn't freeze anything just yet, at least until the patch to fix
> multixact freezing is in.

Well, it seems better than getting errors because of multixact members that are gone.
Maybe PGOPTIONS='-c vacuum_freeze_table_age=0 -c vacuum_freeze_min_age=1000000' vacuumdb -a
- that ought not to cause problems with current data and should freeze enough to get rid of problematic multis?

Greetings,

Andres Freund

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
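[A minimal sketch of that check, written as standalone C rather than actual PostgreSQL source; the variable names and the idea of a member-space "distance" are assumptions based on the description above, not the eventual patch.]

    #include <stdio.h>
    #include <stdint.h>

    /*
     * Sketch: with 32-bit member offsets that wrap around, the amount of
     * members space currently in use is the wraparound-aware distance from
     * the offset the oldest surviving multixact points at up to nextOffset.
     * If that distance approaches 2^32, new members would start overwriting
     * the members of still-live multixacts.
     */
    static uint32_t
    members_in_use(uint32_t oldest_multi_offset, uint32_t next_offset)
    {
        /* unsigned subtraction wraps modulo 2^32, which is what we want */
        return next_offset - oldest_multi_offset;
    }

    int
    main(void)
    {
        /* Example: oldest multi's members sit near the end of the space,
         * nextOffset has already wrapped past zero. */
        uint32_t oldest = 4294000000u;
        uint32_t next = 1500000u;

        printf("members in use: %u\n", members_in_use(oldest, next));
        /* An anti-wraparound vacuum/freeze trigger could fire once this
         * exceeds some safe fraction of 2^32. */
        return 0;
    }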
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From
Alvaro Herrera
Date:
Andres Freund wrote:
> On 2013-12-09 16:00:32 -0300, Alvaro Herrera wrote:
> > As a note, the SlruScanDirectory code has a flaw because it only looks
> > at four-digit files; the reason only files up to 0xFFFF are missing and
> > not the following ones is that those got ignored. This needs a fix
> > as well.
>
> While I agree it's a bug, I don't think it's relevant for the case at
> hand. For offsets there's no following page (or exactly 1, not sure
> about the math offhand), and we only use SlruScanDirectory() for
> offsets, not for members.

Sure we do for members, through SimpleLruTruncate, which calls SlruScanDirectory underneath.

> > > I've recently remarked that I find it dangerous that we only do
> > > anti-wraparound stuff for pg_multixact/offsets, not for /members. So,
> > > here we have the proof that that's bad.
> >
> > It's hard to see how to add this post facto, though. I mean, I am
> > thinking we would need some additional pg_control info etc. We'd better
> > figure out a way to add such controls without having to add that.
>
> Couldn't we just get the oldest multi, check where in offsets it points
> to, and compare that with nextOffset? That should be doable without
> additional data.

Hmm, that seems a sensible approach ...

> > > I think problems should be preventable if you issue a systemwide VACUUM
> > > FREEZE, but please let others chime in before you execute it.
> >
> > I wouldn't freeze anything just yet, at least until the patch to fix
> > multixact freezing is in.
>
> Well, it seems better than getting errors because of multixact members
> that are gone.
> Maybe PGOPTIONS='-c vacuum_freeze_table_age=0 -c vacuum_freeze_min_age=1000000' vacuumdb -a
> - that ought not to cause problems with current data and should freeze
> enough to get rid of problematic multis?

TBH I don't feel comfortable with predicting what it will freeze with the broken code.

--
Álvaro Herrera                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From
Serge Negodyuck
Date:
> From both members and offsets. It'd be great if you could attach a ls -lR
> of pg_multixact/.

I've set up another slave (not in production mode) to reproduce this bug and provide more details if necessary. There are no "apparent wraparound" messages in the logs yet, probably because I cannot send the same SQL queries to the non-production server.

Just to provide exact information: these are links to ls -lR pg_multixact/ of both the master and the slave server.
https://drive.google.com/file/d/0B5sVzeuionyGVER6dnA3a0NkOU0/
https://drive.google.com/file/d/0B5sVzeuionyGUlFibC1jZEVvQzA/
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From
Serge Negodyuck
Date:
2013/12/10 Alvaro Herrera <alvherre@2ndquadrant.com>:
> Andres Freund wrote:
>
>> > > I think problems should be preventable if you issue a systemwide VACUUM
>> > > FREEZE, but please let others chime in before you execute it.
>> >
>> > I wouldn't freeze anything just yet, at least until the patch to fix
>> > multixact freezing is in.
>>
>> Well, it seems better than getting errors because of multixact members
>> that are gone.
>> Maybe PGOPTIONS='-c vacuum_freeze_table_age=0 -c vacuum_freeze_min_age=1000000' vacuumdb -a
>> - that ought not to cause problems with current data and should freeze
>> enough to get rid of problematic multis?
>
> TBH I don't feel comfortable with predicting what it will freeze with
> the broken code.

You guys were right. After a week this issue occurred again on almost all slave servers.

slave:

2013-12-17 14:21:20 MSK CONTEXT: xlog redo delete: index 1663/16516/5320124; iblk 8764, heap 1663/16516/18816;
2013-12-17 14:21:20 MSK LOG: file "pg_clog/0370" doesn't exist, reading as zeroes
2013-12-17 14:21:20 MSK FATAL: MultiXactId 1819308905 has not been created yet -- apparent wraparound
2013-12-17 14:21:20 MSK CONTEXT: xlog redo delete: index 1663/16516/5320124; iblk 8764, heap 1663/16516/18816;
2013-12-17 14:21:20 MSK LOG: startup process (PID 13622) exited with exit code 1

I had to fix something on the master since all slaves were affected. So the only idea was to perform VACUUM FREEZE on the master. I believe that was not a good idea. I suppose "vacuum freeze" led to the following errors on the master:

2013-12-17 13:15:34 EET 172.18.10.44 ruprom ERROR: could not access status of transaction 8407326
2013-12-17 13:15:34 EET 172.18.10.44 ruprom DETAIL: Could not open file "pg_multixact/members/A458": No such file or directory.

The only way out was to perform a full backup/restore, which did not succeed, failing with the same error (could not access status of transaction xxxxxxx).

A very ugly hack was to copy pg_multixact/members/0000 -> pg_multixact/members/[ABCDF]xxx; it helped to do the full backup, but I am not sure about consistency of the data.

My question is: is there any quick-and-dirty solution to disable pg_multixact deletion? I understand it may lead to a waste of space.
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From
Andres Freund
Date:
Hi,

On 2013-12-18 16:20:48 +0200, Serge Negodyuck wrote:
> You guys were right. After a week this issue occurred again on almost
> all slave servers.

Not surprising.

> My question is: is there any quick-and-dirty solution to disable
> pg_multixact deletion? I understand it may lead to a waste of space.

Your problem isn't the deletion but that a) multixactid members wrapped around, partially overwriting old data, and b) some multixacts contain too-old members (c.f. 9.3.2).

> The only way out was to perform a full backup/restore, which did not
> succeed, failing with the same error (could not access status of transaction
> xxxxxxx).
> A very ugly hack was to copy pg_multixact/members/0000 ->
> pg_multixact/members/[ABCDF]xxx; it helped to do the full backup, but I am
> not sure about consistency of the data.

That will likely cause corruption. I think your best bet is to hack GetMultiXactIdMembers to return -1 for old multixacts, on the theory that those rows would have been removed by VACUUM if they had a valid xmax.

Greetings,

Andres Freund

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
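[A minimal sketch of that idea in standalone C rather than a patch against multixact.c; the function and variable names here are illustrative assumptions, and the real GetMultiXactIdMembers' exact signature and locking are not reproduced.]

    #include <stdint.h>

    typedef uint32_t MultiXactId;

    /* wraparound-aware "a precedes b" comparison */
    static int
    multi_precedes(MultiXactId a, MultiXactId b)
    {
        return (int32_t) (a - b) < 0;
    }

    /*
     * Sketch of the suggested hack: before reading any pg_multixact SLRU
     * page, bail out for multixacts older than the oldest one the cluster
     * still tracks, reporting "no members" (-1) instead of erroring out on
     * a truncated or overwritten segment.
     */
    static int
    get_members_sketch(MultiXactId multi, MultiXactId oldest_tracked_multi)
    {
        if (multi_precedes(multi, oldest_tracked_multi))
            return -1;          /* treat as having no members */

        /* ... normal lookup in pg_multixact/offsets and /members goes here ... */
        return 0;
    }

    int
    main(void)
    {
        /* 5 precedes 10, so the sketch reports "no members" */
        return get_members_sketch(5, 10) == -1 ? 0 : 1;
    }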
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From
Alvaro Herrera
Date:
Andres Freund wrote:
> On 2013-12-09 17:49:34 +0200, Serge Negodyuck wrote:
> > On master there are files from 0000 to 14078.
> >
> > On slave there were absent files from A1xx to FFFF.
> > They were the oldest ones. (October, November)
>
> Some analysis later, I am pretty sure that the origin is a longstanding
> problem and not connected to 9.3.[01] vs 9.3.2.
>
> The above referenced 14078 file is exactly the last segment before a
> members wraparound:
> (gdb) p/x (1L<<32)/(MULTIXACT_MEMBERS_PER_PAGE * SLRU_PAGES_PER_SEGMENT)
> $10 = 0x14078
>
> So, what happened is that enough multixacts were created that the
> members SLRU wrapped around. It's not unreasonable for the members SLRU
> to wrap around faster than the offsets one - after all, we create at
> least two entries in members for every offsets entry. Also, in 9.3+
> more xids fit on an offsets page than on a members page.
> When truncating, we first read the offset, to know where we currently
> are in members, and then truncate both from their respective
> point. Since we've wrapped around in members we very well might remove
> content we actually need.
>
> I've recently remarked that I find it dangerous that we only do
> anti-wraparound stuff for pg_multixact/offsets, not for /members. So,
> here we have the proof that that's bad.

I have applied three patches to deal with some of the problems reported here, and others discovered during the investigation of them.

One of them was about failing to truncate files beyond FFFF. That should work fine now -- that is, you would lose more data. Unless we consider a second fix, which is that files corresponding to data still in use are no longer truncated.

The third fix (to enable the system to wrap around sanely from file 14078 to 0000) was necessary so that I could reproduce the issues. In systems with assertions enabled, there is a crash at the point of overflow. I didn't try, but since your system appears to have wrapped around, I imagine it sort-of works in systems compiled without assertions (which is the recommended setting for production).

One thing not yet patched is overrun of the members SLRU: if you have enough live multixacts with enough members, creating a new one might overwrite the members area used by an older multixact. Freezing multis earlier would help with that. With the default settings, where multis are frozen when they are 50 million multis old and pages are 8kB long, there is room for 85 members per multi on average without such overrun[*]. I was able to observe this overrun by running Andres' pg_multixact_burn with each multixact having 100 members. I doubt it's common to have that many members in each multixact on average, but it's certainly a possibility.

[*] There are 82040 files, having 32 pages each; each page has room for 1636 members. (82040 * 32 * 1636) / 50000000 =~ 85.

One complaint somebody might rightly have about this is the space consumption by pg_multixact/ files. Perhaps instead of using Xid's freezing horizon verbatim, we should use ceil(min_freeze_age^0.8) or something like that; so for the default 50000000 Xid freezing age, we would freeze multis over 1.44 million multis old. (We could determine the ratio of xid to multi usage so that they would both freeze when the same time has elapsed, but this seems unnecessarily complicated.)

--
Álvaro Herrera                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
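[The footnote's arithmetic and the suggested freeze horizon can be re-done with a few lines of C; the inputs are the figures quoted in the footnote plus the default 50000000 freeze age mentioned above, taken here as assumptions.]

    #include <stdio.h>
    #include <math.h>

    int
    main(void)
    {
        /* Figures from the footnote: 82040 segments x 32 pages x 1636 members */
        double total_member_slots = 82040.0 * 32.0 * 1636.0;   /* roughly 2^32 */
        double freeze_age = 50000000.0;     /* default freeze horizon, in multis */

        printf("average members per multi before overrun: %.1f\n",
               total_member_slots / freeze_age);               /* ~85.9 */

        /* Proposed multixact freeze horizon: ceil(min_freeze_age^0.8) */
        printf("ceil(50000000^0.8) = %.0f\n",
               ceil(pow(freeze_age, 0.8)));                    /* ~1.44 million */
        return 0;
    }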
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From
Serge Negodyuck
Date:
Hello,

I've upgraded postgresql to version 9.3.4 and did a fresh initdb and restored the database from an SQL backup.
According to the 9.3.4 changelog, the issue with multixact wraparound was fixed.
After a month it seems I have almost the same issue.

ls -lat pg_multixact/members
-rw------- 1 postgres postgres 262144 Jun 2 15:37 000D
-rw------- 1 postgres postgres 262144 Jun 2 14:46 000C
-rw------- 1 postgres postgres 262144 Jun 2 14:46 000B
-rw------- 1 postgres postgres 262144 Jun 2 14:46 000A
-rw------- 1 postgres postgres 262144 Jun 2 14:46 0009
-rw------- 1 postgres postgres 262144 Jun 2 13:22 0008
-rw------- 1 postgres postgres 262144 Jun 2 13:22 0007
-rw------- 1 postgres postgres 262144 Jun 2 12:54 0006
-rw------- 1 postgres postgres 262144 Jun 2 12:49 0005
-rw------- 1 postgres postgres 262144 Jun 2 12:43 0004
-rw------- 1 postgres postgres 262144 Jun 2 12:33 0003
-rw------- 1 postgres postgres 262144 Jun 2 12:23 0002
-rw------- 1 postgres postgres 262144 Jun 2 12:10 0001
-rw------- 1 postgres postgres 262144 Jun 2 10:37 0000
-rw------- 1 postgres postgres 262144 Jun 2 08:44 14078
-rw------- 1 postgres postgres 262144 Jun 2 08:44 14077
-rw------- 1 postgres postgres 262144 Jun 2 08:20 14076

/var/log/postgresql/postgresql.log:

2014-06-02 08:20:55 EEST 172.18.10.4 db PANIC: could not access status of transaction 2080547
2014-06-02 08:20:55 EEST 172.18.10.4 db DETAIL: Could not open file "pg_multixact/members/14078": No such file or directory.
2014-06-02 08:20:55 EEST 172.18.10.4 db CONTEXT: SQL statement " UPDATE ....."
2014-06-02 08:22:30 EEST LOG: server process (PID 51704) was terminated by signal 6: Aborted
.......
2014-06-02 08:22:30 EEST LOG: all server processes terminated; reinitializing
2014-06-02 08:22:30 EEST LOG: database system was interrupted; last known up at 2014-06-02 08:22:16 EEST
2014-06-02 08:22:30 EEST LOG: database system was not properly shut down; automatic recovery in progress
2014-06-02 08:22:30 EEST LOG: redo starts at 55/26D22820
2014-06-02 08:22:30 EEST 172.18.10.17 db FATAL: the database system is in recovery mode
2014-06-02 08:22:30 EEST FATAL: could not access status of transaction 2080547
2014-06-02 08:22:30 EEST DETAIL: Could not read from file "pg_multixact/members/14078" at offset 24576: Success.
2014-06-02 08:22:30 EEST CONTEXT: xlog redo create mxid 2080547 offset 4294961608 nmembers 8684: 6193231 (keysh) 6193233 (fornokeyupd) 6193234 (keysh) 6193235 (fornokeyupd) 6193236 (keysh) 6193237 (fornokeyupd) 6193238 (keysh) 6193239 (fornokeyupd) 6193240 (keysh) 6193241 (fornokeyupd) 6193242 (keysh) 6193243 (fornokeyupd) 6193244 (keysh) 6193245 (fornokeyupd) 6193246 (keysh) 6193247 (fornokeyupd) 6193248 (keysh) 6193249 (fornokeyupd) 6193250 (keysh) 6193251 (fornokeyupd) 6193252 (keysh) 6193253 (fornokeyupd) 6193254 (keysh) 6193255 (fornokeyupd) 6193256 (keysh) 6193257 .......
2014-06-02 08:27:40 EEST LOG: startup process (PID 55275) exited with exit code 1
2014-06-02 08:27:40 EEST LOG: aborting startup due to startup process failure
2014-06-02 08:32:46 EEST LOG: database system was interrupted while in recovery at 2014-06-02 08:27:39 EEST
2014-06-02 08:32:46 EEST HINT: This probably means that some data is corrupted and you will have to use the last backup for recovery.
2014-06-02 08:32:46 EEST LOG: database system was not properly shut down; automatic recovery in progress
2014-06-02 08:32:46 EEST LOG: redo starts at 55/26D22820
2014-06-02 08:32:46 EEST FATAL: could not access status of transaction 2080547
2014-06-02 08:32:46 EEST DETAIL: Could not read from file "pg_multixact/members/14078" at offset 24576: Success.
2014-06-02 08:32:46 EEST CONTEXT: xlog redo create mxid 2080547 offset 4294961608 nmembers 8684: 6193231 (keysh) ....
...

And so on.

An ugly hack "cp pg_multixact/members/14077 pg_multixact/members/14078" helped me to start master server in replica.

Then I did pg_basebackup to the slave database. It does not help:

2014-06-02 09:58:49 EEST 172.18.10.17 db2 DETAIL: Could not open file "pg_multixact/members/1112D": No such file or directory.
2014-06-02 09:58:49 EEST 172.18.10.18 db2 DETAIL: Could not open file "pg_multixact/members/11130": No such file or directory.
2014-06-02 09:58:51 EEST 172.18.10.34 db2 DETAIL: Could not open file "pg_multixact/members/11145": No such file or directory.
2014-06-02 09:58:51 EEST 172.18.10.38 db2 DETAIL: Could not open file "pg_multixact/members/13F76": No such file or directory

What additional information should I provide?
If I increase autovacuum_multixact_freeze_max_age, will it help? (Now I have the default value.)
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From
Alvaro Herrera
Date:
Serge Negodyuck wrote:
> Hello,
>
> I've upgraded postgresql to version 9.3.4 and did a fresh initdb and restored
> the database from an SQL backup.
> According to the 9.3.4 changelog, the issue with multixact wraparound was fixed.

Ouch. This is rather strange. First, I see the failing multixact has 8684 members, which is totally unusual. My guess is that you have code that creates lots of subtransactions, and perhaps does something to one tuple in a different subtransaction; doing something like that would be, I think, the only way to get subxacts that large. Does that sound right?

> An ugly hack "cp pg_multixact/members/14077 pg_multixact/members/14078"
> helped me to start master server in replica.

This is the second weird thing. How come you needed to create 14078 when you already had that file, according to the listing you pasted above? Or is the 14078 file present in the listing only because you copied it?

> Then I did pg_basebackup to the slave database. It does not help:
> 2014-06-02 09:58:49 EEST 172.18.10.17 db2 DETAIL: Could not open file
> "pg_multixact/members/1112D": No such file or directory.
> 2014-06-02 09:58:49 EEST 172.18.10.18 db2 DETAIL: Could not open file
> "pg_multixact/members/11130": No such file or directory.
> 2014-06-02 09:58:51 EEST 172.18.10.34 db2 DETAIL: Could not open file
> "pg_multixact/members/11145": No such file or directory.
> 2014-06-02 09:58:51 EEST 172.18.10.38 db2 DETAIL: Could not open file
> "pg_multixact/members/13F76": No such file or directory

Are these the only files missing? Are intermediate files there?

> What additional information should I provide?
> If I increase autovacuum_multixact_freeze_max_age, will it help? (Now I
> have the default value.)

No, you'd need to *decrease* the freezing age, so that things are cleaned up sooner. If you increase the freezing age, the problematic multis would persist longer, causing more trouble.

I think I will have to reproduce your scenario to be able to understand what is going on. If you can describe it in detail, that would make it easier for me.

--
Álvaro Herrera                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From
Serge Negodyuck
Date:
2014-06-02 17:10 GMT+03:00 Alvaro Herrera <alvherre@2ndquadrant.com>:
> Serge Negodyuck wrote:
> > Hello,
> >
> > I've upgraded postgresql to version 9.3.4 and did a fresh initdb and restored
> > the database from an SQL backup.
> > According to the 9.3.4 changelog, the issue with multixact wraparound was fixed.
>
> Ouch. This is rather strange. First, I see the failing multixact has
> 8684 members, which is totally unusual. My guess is that you have code
> that creates lots of subtransactions, and perhaps does something to one
> tuple in a different subtransaction; doing something like that would be,
> I think, the only way to get subxacts that large. Does that sound
> right?

It sounds like you are right. I've found a lot of inserts in the logs. Each insert causes a trigger to be performed. This trigger updates a counter in another table.
It is very possible this trigger tries to update the same counter for different inserts.

> > An ugly hack "cp pg_multixact/members/14077 pg_multixact/members/14078"
> > helped me to start master server in replica.
>
> This is the second weird thing. How come you needed to create 14078
> when you already had that file, according to the listing you pasted
> above? Or is the 14078 file present in the listing only because you
> copied it?

The 14078 file is present in the listing only because I copied it. It was the only corrupt file on the master server.

> > Then I did pg_basebackup to the slave database. It does not help:
> > 2014-06-02 09:58:49 EEST 172.18.10.17 db2 DETAIL: Could not open file
> > "pg_multixact/members/1112D": No such file or directory.
> > 2014-06-02 09:58:49 EEST 172.18.10.18 db2 DETAIL: Could not open file
> > "pg_multixact/members/11130": No such file or directory.
> > 2014-06-02 09:58:51 EEST 172.18.10.34 db2 DETAIL: Could not open file
> > "pg_multixact/members/11145": No such file or directory.
> > 2014-06-02 09:58:51 EEST 172.18.10.38 db2 DETAIL: Could not open file
> > "pg_multixact/members/13F76": No such file or directory
>
> Are these the only files missing? Are intermediate files there?

Only the 0000 - 001E files were present on the slave server.

> > What additional information should I provide?
> > If I increase autovacuum_multixact_freeze_max_age, will it help? (Now I
> > have the default value.)
>
> No, you'd need to *decrease* the freezing age, so that things are
> cleaned up sooner. If you increase the freezing age, the problematic
> multis would persist longer, causing more trouble.
>
> I think I will have to reproduce your scenario to be able to understand
> what is going on. If you can describe it in detail, that would make it
> easier for me.

Try this simplified schema:

CREATE TABLE company (
    id integer NOT NULL,
    num_products integer
);

CREATE TABLE product (
    id integer NOT NULL,
    company_id integer,
    -- the trigger below reads these two columns
    status integer DEFAULT 0,
    visible_on_site boolean DEFAULT true
);

ALTER TABLE ONLY company ADD CONSTRAINT company_pkey PRIMARY KEY (id);
ALTER TABLE ONLY product ADD CONSTRAINT product_pkey PRIMARY KEY (id);
ALTER TABLE ONLY product ADD CONSTRAINT product_company_id_fk
    FOREIGN KEY (company_id) REFERENCES company(id) ON DELETE CASCADE;

CREATE FUNCTION update_num_products() RETURNS trigger
    LANGUAGE plpythonu
    AS $$
old, new, event, when, table_name = TD["old"], TD["new"], TD["event"], TD["when"], TD["table_name"]

def update_company(company_id, delta):
    plpy.execute("""
        UPDATE company
        SET num_products = coalesce(num_products, 0) + %(delta)s
        WHERE id = %(company_id)s
    """ % {'company_id': company_id, 'delta': delta})

if table_name.lower() == 'product':
    if when == 'AFTER' and event == 'INSERT':
        if new['status'] == 0 and new['visible_on_site'] is not False:
            update_company(new['company_id'], 1)

return 'OK'
$$;

CREATE TRIGGER update_num_products
    AFTER INSERT OR UPDATE ON product
    FOR EACH ROW EXECUTE PROCEDURE update_num_products();

INSERT INTO company VALUES (0,1);

Do a lot of inserts into product with the same product.company_id.
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From
Alvaro Herrera
Date:
Serge Negodyuck wrote:
> 2014-06-02 17:10 GMT+03:00 Alvaro Herrera <alvherre@2ndquadrant.com>:
>
> > Serge Negodyuck wrote:
> > > Hello,
> > >
> > > I've upgraded postgresql to version 9.3.4 and did a fresh initdb and restored
> > > the database from an SQL backup.
> > > According to the 9.3.4 changelog, the issue with multixact wraparound was fixed.
> >
> > Ouch. This is rather strange. First, I see the failing multixact has
> > 8684 members, which is totally unusual. My guess is that you have code
> > that creates lots of subtransactions, and perhaps does something to one
> > tuple in a different subtransaction; doing something like that would be,
> > I think, the only way to get subxacts that large. Does that sound
> > right?
>
> It sounds like you are right. I've found a lot of inserts in the logs. Each
> insert causes a trigger to be performed. This trigger updates a counter in
> another table.
> It is very possible this trigger tries to update the same counter for
> different inserts.

I wasn't able to reproduce it that way, but I eventually figured out that if I altered the plpython function to grab a FOR NO KEY UPDATE lock first, insertion would grow the multixact beyond reasonable limits; see the attachment. If you then INSERT many tuples in "product" in a single transaction, the resulting xmax is a multixact that has as many members as there are inserts, plus one.

(One variation that causes even more bizarre results is dispensing with the plpy.subtransaction() in the function and instead setting a savepoint before each insert. In fact, given the multixact members shown in your log snippet, I think that's more similar to what your code does.)

> > > Then I did pg_basebackup to the slave database. It does not help:
> > > 2014-06-02 09:58:49 EEST 172.18.10.17 db2 DETAIL: Could not open file
> > > "pg_multixact/members/1112D": No such file or directory.
> > > 2014-06-02 09:58:49 EEST 172.18.10.18 db2 DETAIL: Could not open file
> > > "pg_multixact/members/11130": No such file or directory.
> > > 2014-06-02 09:58:51 EEST 172.18.10.34 db2 DETAIL: Could not open file
> > > "pg_multixact/members/11145": No such file or directory.
> > > 2014-06-02 09:58:51 EEST 172.18.10.38 db2 DETAIL: Could not open file
> > > "pg_multixact/members/13F76": No such file or directory
> >
> > Are these the only files missing? Are intermediate files there?
>
> Only the 0000 - 001E files were present on the slave server.

I don't understand how files can be missing in the replica. pg_basebackup simply copies all files it can find in the master to the replica, so if the 111xx files are present in the master they should certainly be present in the replica as well. I gave the pg_basebackup code a look just to be sure there is no 4-char pattern matching or something like that, and it doesn't look like it attempts to do that at all. I also asked Magnus just to be sure and he confirms this.

I'm playing a bit more with this test case, I'll let you know where it leads.

--
Álvaro Herrera                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From
Alvaro Herrera
Date:
Alvaro Herrera wrote: > I wasn't able to reproduce it that way, but I eventually figured out > that if I altered altered the plpython function to grab a FOR NO KEY > UPDATE lock first, insertion would grow the multixact beyond reasonable > limits; see the attachment. If you then INSERT many tuples in "product" > in a single transaction, the resulting xmax is a Multixact that has as > many members as inserts there are, plus one. Bah. I managed to submit a file with the lock_company() call commented out in the plpython function. That one doesn't do anything interesting; you need to uncomment that line. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From
Alvaro Herrera
Date:
Serge Negodyuck wrote:

> 2014-06-02 08:20:55 EEST 172.18.10.4 db PANIC: could not access status of
> transaction 2080547
> 2014-06-02 08:20:55 EEST 172.18.10.4 db DETAIL: Could not open file
> "pg_multixact/members/14078": No such file or directory.
> 2014-06-02 08:20:55 EEST 172.18.10.4 db CONTEXT: SQL statement " UPDATE
> ....."

So as it turns out, this was caused by the fact that the arithmetic to handle the wraparound case neglected to handle multixacts with more members than the number that fit in the last page(s) of the last segment, causing a number of pages in the 14078 segment (or whatever the last segment is for a given BLCKSZ) to fail to be initialized. This patch is a rework of that arithmetic, although it seems a little bit too obscure.

--
Álvaro Herrera                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From
Alvaro Herrera
Date:
Alvaro Herrera wrote:
> Serge Negodyuck wrote:
>
> > 2014-06-02 08:20:55 EEST 172.18.10.4 db PANIC: could not access status of
> > transaction 2080547
> > 2014-06-02 08:20:55 EEST 172.18.10.4 db DETAIL: Could not open file
> > "pg_multixact/members/14078": No such file or directory.
> > 2014-06-02 08:20:55 EEST 172.18.10.4 db CONTEXT: SQL statement " UPDATE
> > ....."
>
> So as it turns out, this was caused by the fact that the arithmetic to handle the
> wraparound case neglected to handle multixacts with more members than
> the number that fit in the last page(s) of the last segment, causing
> a number of pages in the 14078 segment (or whatever the last segment is
> for a given BLCKSZ) to fail to be initialized. This patch is a rework
> of that arithmetic, although it seems a little bit too obscure.

After some simplification I think it should be clearer. Thanks Andres for commenting offlist.

There is a different way to compute the "difference" proposed by Andres, without using if/else; the idea is to cast the values to uint64 and then clamp. It would be something like

    uint64 diff64;

    diff64 = Min((uint64) offset + nmembers,
                 (uint64) offset + MULTIXACT_MEMBERS_PER_PAGE);
    difference = (uint32) Min(diff64, MaxMultiXactOffset);

(There are other ways to formulate this, of course, but this seems to me to be the most readable one). I am undecided between this and the one I propose in the patch, so I've stuck with the patch.

--
Álvaro Herrera                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From
Tom Lane
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> After some simplification I think it should be clearer. Thanks Andres
> for commenting offlist.

I find this not only unreadable but a bit scary:

    ! if (offset + MAX_MEMBERS_IN_LAST_MEMBERS_PAGE < offset)

I'm worried whether some compilers might not decide that that's a can't-happen situation, and optimize the test away entirely. I think Andres' proposal is probably safer (though it needs comments).

			regards, tom lane
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From
Andres Freund
Date:
On 2014-06-06 13:04:29 -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > After some simplification I think it should be clearer. Thanks Andres
> > for commenting offlist.
>
> I find this not only unreadable but a bit scary:
>
>     ! if (offset + MAX_MEMBERS_IN_LAST_MEMBERS_PAGE < offset)
>
> I'm worried whether some compilers might not decide that that's a
> can't-happen situation, and optimize the test away entirely.

offset is (via typedefs) a uint32, and unsigned overflow is defined in the C standard, so that shouldn't be an allowed optimization. Don't we already have a fair amount of similar tests around? Around TransactionId wraparound, for example:

    xidWrapLimit = oldest_datfrozenxid + (MaxTransactionId >> 1);
    if (xidWrapLimit < FirstNormalTransactionId)
        xidWrapLimit += FirstNormalTransactionId;

and similar things.

Greetings,

Andres Freund

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
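[The distinction being discussed can be shown with a few self-contained lines of C: for unsigned types, wraparound is defined behaviour and the "a + b < a" pattern is a legitimate overflow check, whereas the same pattern on signed integers relies on undefined behaviour and may indeed be deleted by a compiler. The variable names below are illustrative only.]

    #include <stdio.h>
    #include <stdint.h>

    int
    main(void)
    {
        /* Unsigned: arithmetic is defined to wrap modulo 2^32, so this test
         * reliably detects the wrap.  This mirrors the pattern Tom quoted. */
        uint32_t offset = 4294967000u;          /* close to 2^32 - 1 */
        uint32_t max_members_in_last_page = 1000;

        if (offset + max_members_in_last_page < offset)
            printf("unsigned wraparound detected\n");

        /* Signed: the equivalent test on int32_t would rely on signed
         * overflow, which is undefined behaviour; a compiler may assume it
         * cannot happen and remove the branch.  That is the optimization
         * being worried about, and why these types must stay unsigned. */
        return 0;
    }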
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From
Alvaro Herrera
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > After some simplification I think it should be clearer. Thanks Andres
> > for commenting offlist.
>
> I find this not only unreadable but a bit scary:
>
>     ! if (offset + MAX_MEMBERS_IN_LAST_MEMBERS_PAGE < offset)
>
> I'm worried whether some compilers might not decide that that's a
> can't-happen situation, and optimize the test away entirely. I think
> Andres' proposal is probably safer (though it needs comments).

Well, somebody would have to work it out fully and test it -- I can't spend more time on this issue. If somebody is interested in doing that, I used this little program and made sure that it "zeroes" all pages, for various block sizes, initial offsets and "step sizes" (#members in each multi) -- I was able to reproduce Serge's issues with the original coding, and then watched issues fly by as I tried several approaches until I settled on the patch as submitted.

--
Álvaro Herrera                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From
Alvaro Herrera
Date:
Serge Negodyuck wrote:

Hi,

> /var/log/postgresql/postgresql.log:
>
> 2014-06-02 08:20:55 EEST 172.18.10.4 db PANIC: could not access status of
> transaction 2080547
> 2014-06-02 08:20:55 EEST 172.18.10.4 db DETAIL: Could not open file
> "pg_multixact/members/14078": No such file or directory.
> 2014-06-02 08:20:55 EEST 172.18.10.4 db CONTEXT: SQL statement " UPDATE
> ....."

Pushed a fix for this. Thanks for the report.

> 2014-06-02 08:22:30 EEST FATAL: could not access status of transaction
> 2080547
> 2014-06-02 08:22:30 EEST DETAIL: Could not read from file
> "pg_multixact/members/14078" at offset 24576: Success.
> 2014-06-02 08:22:30 EEST CONTEXT: xlog redo create mxid 2080547 offset
> 4294961608 nmembers 8684: 6193231 (keysh) 6193233 (fornokeyupd) 6193234
> (keysh) 6193235 (fornokeyupd) 6193236 (keysh) 6193237 (fornokeyupd) 6193238
> (keysh) 6193239 (fornokeyupd) 6193240 (keysh) 6193241 (fornokeyupd) 6193242
> (keysh) 6193243 (fornokeyupd) 6193244 (keysh) 6193245 (fornokeyupd) 6193246
> (keysh) 6193247 (fornokeyupd) 6193248 (keysh) 6193249 (fornokeyupd) 6193250
> (keysh) 6193251 (fornokeyupd) 6193252 (keysh) 6193253 (fornokeyupd) 6193254
> (keysh) 6193255 (fornokeyupd) 6193256 (keysh) 6193257 .......

I find this bit rather odd. Normally the system shouldn't create multixacts this large. I think we might be missing a trick here somewhere. I imagine inserting the last few items is slow, isn't it?

> An ugly hack "cp pg_multixact/members/14077 pg_multixact/members/14078"
> helped me to start master server in replica.
>
> Then I did pg_basebackup to the slave database. It does not help:
> 2014-06-02 09:58:49 EEST 172.18.10.17 db2 DETAIL: Could not open file
> "pg_multixact/members/1112D": No such file or directory.
> 2014-06-02 09:58:49 EEST 172.18.10.18 db2 DETAIL: Could not open file
> "pg_multixact/members/11130": No such file or directory.
> 2014-06-02 09:58:51 EEST 172.18.10.34 db2 DETAIL: Could not open file
> "pg_multixact/members/11145": No such file or directory.
> 2014-06-02 09:58:51 EEST 172.18.10.38 db2 DETAIL: Could not open file
> "pg_multixact/members/13F76": No such file or directory

This is strange also; if the files are present in master, how come they weren't copied to the replica? I think we need more info about this problem.

--
Álvaro Herrera                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From
Jeff Janes
Date:
On Fri, Jun 6, 2014 at 9:47 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Alvaro Herrera wrote:
> > Serge Negodyuck wrote:
> >
> > > 2014-06-02 08:20:55 EEST 172.18.10.4 db PANIC: could not access status of
> > > transaction 2080547
> > > 2014-06-02 08:20:55 EEST 172.18.10.4 db DETAIL: Could not open file
> > > "pg_multixact/members/14078": No such file or directory.
> > > 2014-06-02 08:20:55 EEST 172.18.10.4 db CONTEXT: SQL statement "UPDATE ....."
> >
> > So as it turns out, this was caused because the arithmetic to handle the
> > wraparound case neglected to handle multixacts with more members than
> > the number that fit in the last page(s) of the last segment, leading to
> > a number of pages in the 14078 segment (or whatever the last segment is
> > for a given BLCKSZ) failing to be initialized.  This patch is a rework
> > of that arithmetic, although it seems a little bit too obscure.
>
> After some simplification I think it should be clearer.  Thanks Andres
> for commenting offlist.
>
> There is a different way to compute the "difference" proposed by Andres,
> without using if/else; the idea is to cast the values to int64 and then
> clamp.  It would be something like
>
>     uint64  diff64;
>
>     diff64 = Min((uint64) offset + nmembers,
>                  (uint64) offset + MULTIXACT_MEMBERS_PER_PAGE);
>     difference = (uint32) Min(diff64, MaxMultiXactOffset);
>
> (There are other ways to formulate this, of course, but this seems to me
> to be the most readable one.)  I am undecided between this and the one I
> propose in the patch, so I've stuck with the patch.

This patch seems to solve a problem I've also been having with non-existent
"pg_multixact/members/13D35" files in my testing of torn page writes and fk
locks against recent 9.4.  However, sometimes I've found the problem even
before multixact wraparound occurred, and my reading of this thread suggests
that the current issue would not show up in that case, so I don't know how
much comfort to take in the knowledge that the problem seems to have gone
away -- as they say, things that go away by themselves without explanation
also tend to come back by themselves.

Anyone wishing to investigate further can find the testing harness and the
tarball of a bad data directory here:

https://drive.google.com/folderview?id=0Bzqrh1SO9FcEb1FNcm52aEMtWTA&usp=sharing

Starting up the bad data directory should nominally succeed, but the first
database-wide vacuum should then fail when it hits the bad tuple.

Recent changes to my testing harness were:

- Making some of the transactions roll back randomly in the test harness, to
  test transaction aborts not associated with server crashes.

- Making the p_id in the child table get changed only half the time, not all
  the time, so that a mixture of HOT updates and non-HOT updates gets tested.

- Adding a patch to allow fast-forwarding of multixacts, plus code in my
  harness to trigger that patch.  It is possible that this patch itself is
  causing the problem, but I suspect it is just accelerating its discovery.
  (By the way, I did this wrong for my original intent, as I had it create
  quite large multixacts when I should have had it create more of them but
  smaller, so that wraparound would occur more often.  But I don't want to
  change it now until the problem is resolved.)

- Adding a delay.pl which delivers SIGSTOP to random postgres processes and
  then waits a while before sending SIGCONT, to try to uncover unlikely
  races.  Surprisingly this does not slow things down by very much; I thought
  it would be frequent for processes to get interrupted while holding
  important locks, but that actually seems to be rare.

- Turning on archive_mode and setting wal_level to logical.

The problem is that I can reverse the last two of these (the delay.pl and the
archiving) and still see the problem, while the other three were extensively
tested elsewhere without the problem showing up, so I can't figure out what
the trigger is.  When the problem does show up, it takes anywhere from 1 to
13 hours to do so.

Anyway, if no one has opinions to the contrary, I think I will just assume
this is fixed now and move on to other tests.

Cheers,

Jeff
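For anyone following along, the 64-bit clamp quoted above can be tried out in
isolation with a sketch along these lines.  Min, MULTIXACT_MEMBERS_PER_PAGE
and MaxMultiXactOffset are stand-ins with assumed values (1636 members per
page corresponds to the default 8 kB BLCKSZ); the real definitions live in
multixact.c, and this is not necessarily the formulation that was committed.

    #include <stdint.h>
    #include <inttypes.h>
    #include <stdio.h>

    /* Stand-ins for the real multixact.c definitions -- assumed values only. */
    #define MULTIXACT_MEMBERS_PER_PAGE  1636u        /* default 8 kB BLCKSZ */
    #define MaxMultiXactOffset          0xFFFFFFFFu
    #define Min(a, b)                   ((a) < (b) ? (a) : (b))

    /*
     * How many member slots does this extension step cover?  Doing the
     * addition in 64 bits keeps "offset + nmembers" from silently wrapping
     * around the 32-bit offset space; the result is then clamped back into
     * range.
     */
    static uint32_t
    clamped_difference(uint32_t offset, uint32_t nmembers)
    {
        uint64_t    diff64;

        diff64 = Min((uint64_t) offset + nmembers,
                     (uint64_t) offset + MULTIXACT_MEMBERS_PER_PAGE);
        return (uint32_t) Min(diff64, (uint64_t) MaxMultiXactOffset);
    }

    int
    main(void)
    {
        /* The multixact from this report: offset 4294961608, 8684 members. */
        printf("%" PRIu32 "\n", clamped_difference(4294961608u, 8684u));
        return 0;
    }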
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From
Alvaro Herrera
Date:
Jeff Janes wrote:

> This patch seems to solve a problem I've also been having with non-existent
> "pg_multixact/members/13D35" files in my testing of torn page writes and fk
> locks against recent 9.4.  However, sometimes I've found the problem even
> before multixact wraparound occurred, and my reading of this thread suggests
> that the current issue would not show up in that case, so I don't know how
> much comfort to take in the knowledge that the problem seems to have gone
> away -- as they say, things that go away by themselves without explanation
> also tend to come back by themselves.

I think this problem is rather related to the code touched by commit
722acf51a0d074d19782ad7e97ebe3fdb63fbb87, which is about removing unneeded
files in pg_multixact/members.  I checked your files, and it seems that
pg_multixact/members files that should still exist are gone; for instance,
pg_controldata says that the oldest multi is 2586314407, but if I run

    select * from pg_get_multixact_members('that value')

I get an error saying that pg_multixact/members/218B does not exist.  It
seems that a lot of files were improperly removed.  I would expect the
existing files to be sequential, but there are a lot of holes in a listing
of pg_multixact/members.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
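As an aside, the holes mentioned above can be spotted by sorting the hex
segment names and looking for gaps.  The sketch below is a throwaway check,
not anything from the tree, and it assumes the live segments should form one
contiguous run of hex numbers (i.e. no wraparound currently in progress):

    #include <dirent.h>
    #include <stdio.h>
    #include <stdlib.h>

    static int
    cmp_ulong(const void *a, const void *b)
    {
        unsigned long x = *(const unsigned long *) a;
        unsigned long y = *(const unsigned long *) b;

        return (x > y) - (x < y);
    }

    int
    main(int argc, char **argv)
    {
        const char     *path = argc > 1 ? argv[1] : "pg_multixact/members";
        DIR            *dir = opendir(path);
        struct dirent  *de;
        unsigned long   segs[65536];    /* plenty for this illustration */
        size_t          n = 0;

        if (dir == NULL)
        {
            perror(path);
            return 1;
        }
        /* Collect every file whose name parses entirely as a hex number. */
        while ((de = readdir(dir)) != NULL)
        {
            char           *end;
            unsigned long   seg = strtoul(de->d_name, &end, 16);

            if (end != de->d_name && *end == '\0' && n < 65536)
                segs[n++] = seg;
        }
        closedir(dir);

        /* Sort and report any gap between consecutive segment numbers. */
        qsort(segs, n, sizeof(segs[0]), cmp_ulong);
        for (size_t i = 1; i < n; i++)
            if (segs[i] != segs[i - 1] + 1)
                printf("hole: %04lX .. %04lX missing\n",
                       segs[i - 1] + 1, segs[i] - 1);
        return 0;
    }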
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From
Alvaro Herrera
Date:
Serge Negodyuck wrote:

> 2014-06-09 22:49 GMT+03:00 Alvaro Herrera <alvherre@2ndquadrant.com>:
> > > An ugly hack "cp pg_multixact/members/14077 pg_multixact/members/14078"
> > > helped me to start master server in replica.
> > >
> > > Then, did pg_basebackup to slave database. It does not help
> > > 2014-06-02 09:58:49 EEST 172.18.10.17 db2 DETAIL: Could not open file
> > > "pg_multixact/members/1112D": No such file or directory.
> > > 2014-06-02 09:58:49 EEST 172.18.10.18 db2 DETAIL: Could not open file
> > > "pg_multixact/members/11130": No such file or directory.
> > > 2014-06-02 09:58:51 EEST 172.18.10.34 db2 DETAIL: Could not open file
> > > "pg_multixact/members/11145": No such file or directory.
> > > 2014-06-02 09:58:51 EEST 172.18.10.38 db2 DETAIL: Could not open file
> > > "pg_multixact/members/13F76": No such file or directory
> >
> > This is strange also; if the files are present on the master, how come
> > they weren't copied to the replica?  I think we need more info about
> > this problem.
>
> I've thoroughly looked through the logs once again and I have not
> found anything interesting.
> I just know there were very few pg_multixact/members files starting
> from 0000.  It was on both slave servers.  So I've observed this issue
> two times.

But it was OK on the master, right?  It had the full set of files and no
file was removed improperly?  (minus 14078, which wasn't created on time,
of course)

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From
Serge Negodyuck
Date:
2014-06-09 22:49 GMT+03:00 Alvaro Herrera <alvherre@2ndquadrant.com>:
>
> Pushed a fix for this.  Thanks for the report.

Thank you!

> > 2014-06-02 08:22:30 EEST FATAL: could not access status of transaction
> > 2080547
> > 2014-06-02 08:22:30 EEST DETAIL: Could not read from file
> > "pg_multixact/members/14078" at offset 24576: Success.
> > 2014-06-02 08:22:30 EEST CONTEXT: xlog redo create mxid 2080547 offset
> > 4294961608 nmembers 8684: 6193231 (keysh) 6193233 (fornokeyupd) 6193234
> > (keysh) 6193235 (fornokeyupd) 6193236 (keysh) 6193237 (fornokeyupd) 6193238
> > (keysh) 6193239 (fornokeyupd) 6193240 (keysh) 6193241 (fornokeyupd) 6193242
> > (keysh) 6193243 (fornokeyupd) 6193244 (keysh) 6193245 (fornokeyupd) 6193246
> > (keysh) 6193247 (fornokeyupd) 6193248 (keysh) 6193249 (fornokeyupd) 6193250
> > (keysh) 6193251 (fornokeyupd) 6193252 (keysh) 6193253 (fornokeyupd) 6193254
> > (keysh) 6193255 (fornokeyupd) 6193256 (keysh) 6193257 .......
>
> I find this bit rather odd.  Normally the system shouldn't create
> multixacts this large.  I think we might be missing a trick here somewhere.
> I imagine inserting the last few items is slow, isn't it?

Yes, the duration of inserts had been growing, up to 2.2 seconds before the
crash:

2014-06-02 08:20:11 EEST 172.18.10.4 db LOG: duration: 2213.361 ms statement:
INSERT INTO product (...) VALUES (...) RETURNING product.id

Normally inserts fit into 100 ms (log_min_duration_statement).
The same "xlog redo create mxid 2080547 ...." log line was present on the
master and on both replica servers.  Well, this sounds logical.

> > An ugly hack "cp pg_multixact/members/14077 pg_multixact/members/14078"
> > helped me to start master server in replica.
> >
> > Then, did pg_basebackup to slave database. It does not help
> > 2014-06-02 09:58:49 EEST 172.18.10.17 db2 DETAIL: Could not open file
> > "pg_multixact/members/1112D": No such file or directory.
> > 2014-06-02 09:58:49 EEST 172.18.10.18 db2 DETAIL: Could not open file
> > "pg_multixact/members/11130": No such file or directory.
> > 2014-06-02 09:58:51 EEST 172.18.10.34 db2 DETAIL: Could not open file
> > "pg_multixact/members/11145": No such file or directory.
> > 2014-06-02 09:58:51 EEST 172.18.10.38 db2 DETAIL: Could not open file
> > "pg_multixact/members/13F76": No such file or directory
>
> This is strange also; if the files are present on the master, how come they
> weren't copied to the replica?  I think we need more info about this
> problem.

I've thoroughly looked through the logs once again and I have not found
anything interesting.  I just know there were very few pg_multixact/members
files, starting from 0000.  It was like that on both slave servers, so I've
observed this issue two times.  To fix it I had to do pg_dumpall | pg_restore
on the master.  So, I'm sorry, I have no additional info about this problem.
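For context on why segment 14078 keeps coming up: assuming the default 8 kB
BLCKSZ layout (1636 member slots per page and 32 pages per SLRU segment; both
figures are assumptions here, derived from the multixact.c layout rather than
taken from the report), the offset in the redo record above lands in segment
0x14078, and adding its 8684 members runs past the 2^32 offset wraparound,
which is exactly the case the reworked arithmetic has to cover.  A throwaway
calculation:

    #include <stdint.h>
    #include <inttypes.h>
    #include <stdio.h>

    /* Assumed layout for the default 8 kB BLCKSZ: 409 groups * 4 members. */
    #define MULTIXACT_MEMBERS_PER_PAGE  1636u
    #define SLRU_PAGES_PER_SEGMENT      32u
    #define MEMBERS_PER_SEGMENT  (MULTIXACT_MEMBERS_PER_PAGE * SLRU_PAGES_PER_SEGMENT)

    int
    main(void)
    {
        /* Values from the "xlog redo create mxid 2080547 ..." context line. */
        uint32_t    offset = 4294961608u;
        uint32_t    nmembers = 8684u;

        uint32_t    segment = offset / MEMBERS_PER_SEGMENT;   /* 0x14078 */
        uint64_t    end = (uint64_t) offset + nmembers;       /* 4294970292 */

        printf("segment of first member: %" PRIX32 "\n", segment);
        printf("end of member range:     %" PRIu64 "\n", end);
        printf("crosses 2^32 wraparound: %s\n",
               end > (UINT64_C(1) << 32) ? "yes" : "no");
        return 0;
    }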
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From
Serge Negodyuck
Date:
2014-06-10 16:48 GMT+03:00 Alvaro Herrera <alvherre@2ndquadrant.com>:
> Serge Negodyuck wrote:
>> 2014-06-09 22:49 GMT+03:00 Alvaro Herrera <alvherre@2ndquadrant.com>:
>
>> > > An ugly hack "cp pg_multixact/members/14077 pg_multixact/members/14078"
>> > > helped me to start master server in replica.
>> > >
>> > > Then, did pg_basebackup to slave database. It does not help
>> > > 2014-06-02 09:58:49 EEST 172.18.10.17 db2 DETAIL: Could not open file
>> > > "pg_multixact/members/1112D": No such file or directory.
>> > > 2014-06-02 09:58:49 EEST 172.18.10.18 db2 DETAIL: Could not open file
>> > > "pg_multixact/members/11130": No such file or directory.
>> > > 2014-06-02 09:58:51 EEST 172.18.10.34 db2 DETAIL: Could not open file
>> > > "pg_multixact/members/11145": No such file or directory.
>> > > 2014-06-02 09:58:51 EEST 172.18.10.38 db2 DETAIL: Could not open file
>> > > "pg_multixact/members/13F76": No such file or directory
>> >
>> > This is strange also; if the files are present on the master, how come
>> > they weren't copied to the replica?  I think we need more info about
>> > this problem.
>>
>> I've thoroughly looked through the logs once again and I have not
>> found anything interesting.
>> I just know there were very few pg_multixact/members files starting
>> from 0000.  It was on both slave servers.  So I've observed this issue
>> two times.
>
> But it was OK on the master, right?  It had the full set of files and no
> file was removed improperly?  (minus 14078, which wasn't created on time,
> of course)

Yes, all files were present on the master (except for the corrupted file
14078, which was copied from 14077; that was the fastest way to get the
master server started).