Thread: BUG #11264: Auto vacuum wraparound job blocking everything
The following bug has been logged on the website:

Bug reference:      11264
Logged by:          Dinesh Bhandary
Email address:      dbhandary@switchfly.com
PostgreSQL version: 9.3.5
Operating system:   CentOS 6.5

Description:

We recently upgraded to PostgreSQL 9.3.5 from 9.1 using pg_upgrade. We are
running streaming replication. The database was functioning without any
issues until today, when we discovered an autovacuum wraparound job on one
of our tables that was blocking everything. There were no waiting jobs when
we queried pg_stat_activity, but incoming requests would just hang, so the
database was unusable. We tried to disable autovacuum for the table in
question and tried to kill the autovacuum job, but that did not work. We
also had pg_dump running for several hours against that particular table
while the autovacuum was running; we tried killing the pg_dump job, but it
did not succeed. We eventually had to force-shut down the database, disable
autovacuum, and increase the vacuum freeze parameter so that the autovacuum
wraparound job won't trigger until we figure out what the issue is. I am
including debug traces here. It seems like autovacuum is running in an
infinite loop. Will be happy to provide more info if necessary.

AUTOVACUUM:

(gdb) bt
#0  0x00007fb9740eb5b3 in __select_nocancel () from /lib64/libc.so.6
#1  0x000000000075ad7a in pg_usleep ()
#2  0x00000000004ae2d4 in GetMultiXactIdMembers ()
#3  0x000000000047f6cc in heap_prepare_freeze_tuple ()
#4  0x000000000057d036 in lazy_vacuum_rel ()
#5  0x000000000057b346 in ?? ()
#6  0x000000000057b67c in vacuum ()
#7  0x000000000060cf8a in ?? ()
#8  0x000000000060d426 in ?? ()
#9  0x000000000060d4f6 in StartAutoVacWorker ()
#10 0x0000000000617d42 in ?? ()
#11 <signal handler called>
#12 0x00007fb9740eb5b3 in __select_nocancel () from /lib64/libc.so.6
#13 0x00000000006191fd in PostmasterMain ()
#14 0x00000000005b5e60 in main ()
(gdb)

pg_stat_activity row for the autovacuum worker:

 16423 | jetblue | 21972 | 10 | postgres | | | | | 2014-08-24 18:39:38.134514+00 | 2014-08-24 18:39:38.161612+00 | 2014-08-24 18:39:38.161612+00 | 2014-08-24 18:39:38.161614+00 | f | active | autovacuum: VACUUM settings.room_contract_service_code (to prevent wraparound)

COPY:

jetblue=# select * from pg_stat_activity where query ilike '%copy%';

 datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | query_start | state_change | waiting | state | query
-------+---------+-----+----------+---------+------------------+-------------+-----------------+-------------+---------------+------------+-------------+--------------+---------+-------+-------
 16423 | jetblue | 30246 | 20350 | m84 | pg_dump | | | -1 | 2014-08-25 03:48:59.807689+00 | 2014-08-25 03:48:59.810871+00 | 2014-08-25 04:26:14.95753+00 | 2014-08-25 04:26:14.957534+00 | f | active | COPY settings.room_contract_service_code (room_contract_service_code, room_contract_service_description, room_contract_service_code_id, supplierid, inactive) TO stdout;
 16423 | jetblue | 31999 | 20350 | m84 | | 10.33.11.11 | | 40735 | 2014-08-25 10:38:02.944083+00 | 2014-08-25 10:38:02.964562+00 | 2014-08-25 10:38:22.805859+00 | 2014-08-25 10:38:22.805861+00 | f | active | COPY settings.room_contract_service_code (room_contract_service_code, room_contract_service_description, room_contract_service_code_id, supplierid, inactive) TO stdout;

(gdb) bt
#0  0x00007fb9740f5187 in semop () from /lib64/libc.so.6
#1  0x0000000000609397 in PGSemaphoreLock ()
#2  0x000000000064e871 in LWLockAcquire ()
#3  0x000000000047e1a9 in ?? ()
#4  0x000000000047e66a in ?? ()
#5  0x000000000047f0a6 in heap_getnext ()
#6  0x00000000005343e6 in ?? ()
#7  0x000000000053521a in DoCopy ()
#8  0x000000000066112d in standard_ProcessUtility ()
#9  0x00007fb96c4df261 in ?? () from /usr/pgsql-9.3/lib/pg_stat_statements.so
#10 0x000000000065db97 in ?? ()
#11 0x000000000065eb2d in ?? ()
#12 0x000000000065f223 in PortalRun ()
#13 0x000000000065b91e in ?? ()
#14 0x000000000065d048 in PostgresMain ()
#15 0x0000000000619c36 in PostmasterMain ()
#16 0x00000000005b5e60 in main ()

SELECTS:

datid            | 16423
datname          | jetblue
pid              | 15599
usesysid         | 20354
usename          | developers
application_name |
client_addr      | 10.33.11.11
client_hostname  |
client_port      | 60741
backend_start    | 2014-08-25 20:20:38.985486+00
xact_start       | 2014-08-25 20:39:13.426937+00
query_start      | 2014-08-25 20:39:13.497362+00
state_change     | 2014-08-25 20:39:13.497365+00
waiting          | f
state            | active
query            | SELECT room_contract_service_code, room_contract_service_description, room_contract_service_code_id
                 | FROM room_contract_service_code
                 | WHERE room_contract_service_code.supplierid = $1
                 | AND NOT inactive
                 | ORDER by room_contract_service_description

(gdb) bt
#0  0x00007fb9740f5187 in semop () from /lib64/libc.so.6
#1  0x0000000000609397 in PGSemaphoreLock ()
#2  0x000000000064e871 in LWLockAcquire ()
#3  0x000000000048d05d in index_fetch_heap ()
#4  0x000000000048d22e in index_getnext ()
#5  0x0000000000595e96 in ?? ()
#6  0x000000000058aabe in ExecScan ()
#7  0x0000000000583c08 in ExecProcNode ()
#8  0x000000000059c4e9 in ExecSort ()
#9  0x0000000000583b28 in ExecProcNode ()
#10 0x0000000000582a42 in standard_ExecutorRun ()
#11 0x00007fb96c4df4db in ?? () from /usr/pgsql-9.3/lib/pg_stat_statements.so
#12 0x000000000065df67 in ?? ()
#13 0x000000000065f191 in PortalRun ()
#14 0x000000000065d3ed in PostgresMain ()
#15 0x0000000000619c36 in PostmasterMain ()
#16 0x00000000005b5e60 in main ()
(gdb)
dbhandary@switchfly.com wrote:

> I am including debug trace here. It seems like auto vacuum is running in an
> infinite loop. Will be happy to provide more info if necessary.
>
> (gdb) bt
> #0  0x00007fb9740eb5b3 in __select_nocancel () from /lib64/libc.so.6
> #1  0x000000000075ad7a in pg_usleep ()
> #2  0x00000000004ae2d4 in GetMultiXactIdMembers ()
> #3  0x000000000047f6cc in heap_prepare_freeze_tuple ()
> #4  0x000000000057d036 in lazy_vacuum_rel ()

Can you please provide pg_controldata output?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Hi Alvaro,

Here is the output of pg_controldata. We also looked at the latest 9.3.5
fix for pg_upgrade and ran the query to check if we have to remove
$PGDATA/pg_multixact/offsets/0000, but the query returned false.

pg_control version number:            937
Catalog version number:               201306121
Database system identifier:           6037642065780369403
Database cluster state:               in production
pg_control last modified:             Tue 26 Aug 2014 02:40:19 PM UTC
Latest checkpoint location:           1054/17E0BE30
Prior checkpoint location:            1054/15B2DEB8
Latest checkpoint's REDO location:    1054/16EC6028
Latest checkpoint's REDO WAL file:    000000010000105400000016
Latest checkpoint's TimeLineID:       1
Latest checkpoint's PrevTimeLineID:   1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0/1004399358
Latest checkpoint's NextOID:          1303270531
Latest checkpoint's NextMultiXactId:  23431
Latest checkpoint's NextMultiOffset:  5678
Latest checkpoint's oldestXID:        801997459
Latest checkpoint's oldestXID's DB:   16428
Latest checkpoint's oldestActiveXID:  1004399358
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 16423
Time of latest checkpoint:            Tue 26 Aug 2014 02:37:48 PM UTC
Fake LSN counter for unlogged rels:   0/1
Minimum recovery ending location:     0/0
Min recovery ending loc's timeline:   0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
Current wal_level setting:            hot_standby
Current max_connections setting:      1024
Current max_prepared_xacts setting:   0
Current max_locks_per_xact setting:   64
Maximum data alignment:               8
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        1996
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
Data page checksum version:           0

Thanks
Dinesh

-----Original Message-----
From: Alvaro Herrera [mailto:alvherre@2ndquadrant.com]
Sent: Monday, August 25, 2014 9:03 PM
To: Dinesh Bhandary
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #11264: Auto vacuum wraparound job blocking everything

dbhandary@switchfly.com wrote:

> I am including debug trace here. It seems like auto vacuum is running
> in an infinite loop. Will be happy to provide more info if necessary.
>
> (gdb) bt
> #0  0x00007fb9740eb5b3 in __select_nocancel () from /lib64/libc.so.6
> #1  0x000000000075ad7a in pg_usleep ()
> #2  0x00000000004ae2d4 in GetMultiXactIdMembers ()
> #3  0x000000000047f6cc in heap_prepare_freeze_tuple ()
> #4  0x000000000057d036 in lazy_vacuum_rel ()

Can you please provide pg_controldata output?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
dbhandary@switchfly.com wrote:

> We recently upgraded to postgressql 9.3.5 from postgres 9.1 using
> pg_upgrade. We are running streaming replication. DB was functioning without
> any issues till today where we discovered an auto vacuum wraparound job on
> one of our tables which was blocking everything. There were no waiting jobs
> when we queried pg_stat_activity, but incoming request would just hang, so
> database was unusable.

Can you please paste the output of

select oid::regclass, relfrozenxid, relminmxid from pg_class
  where relname = 'room_contract_service_code';
select datname, datfrozenxid, datminmxid from pg_database
  where datname = 'jetblue';

> (gdb) bt
> #0  0x00007fb9740eb5b3 in __select_nocancel () from /lib64/libc.so.6
> #1  0x000000000075ad7a in pg_usleep ()
> #2  0x00000000004ae2d4 in GetMultiXactIdMembers ()
> #3  0x000000000047f6cc in heap_prepare_freeze_tuple ()
> #4  0x000000000057d036 in lazy_vacuum_rel ()

This stall in GetMultiXactIdMembers should only occur when somebody is
writing the multixact immediately following. It certainly should not
sleep for long. I'm not sure what's happening here, but this is probably
where the problem is. While waiting, it holds the buffer content lock
in exclusive mode.

Can you install debug symbols and pageinspect? I'm curious why this is
stalling. If the tuple has a FOR SHARE or FOR UPDATE marking from before
the upgrade, it shouldn't be inquiring about the multixact members at all.

I'm curious about

Latest checkpoint's NextMultiXactId:  23431
Latest checkpoint's NextMultiOffset:  5678
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 16423

It says the oldest multi is 1, so the database should not have any values
between 1 and 23431 that correspond to pg_upgrade'd multixact values ...
so what is going on here? Unless the recent mucking with pg_upgrade to
handle multixacts got something wrong.
> 10.33.11.11 | | 40735 | 2014-08-25 10:38:02.944083+00
> | 2014-08-25 10:38:02.964562+00 | 2014-08-25 10:38:22.805859+00 | 2014-08-25
> 10:38:22.805861+00 | f | active | COPY
> settings.room_contract_service_code (room_contract_service_code,
> room_contract_service_description,
> room_contract_service_code_id, supplierid, inactive) TO stdout;
>
> (gdb) bt
> #0  0x00007fb9740f5187 in semop () from /lib64/libc.so.6
> #1  0x0000000000609397 in PGSemaphoreLock ()
> #2  0x000000000064e871 in LWLockAcquire ()
> #3  0x000000000047e1a9 in ?? ()
> #4  0x000000000047e66a in ?? ()
> #5  0x000000000047f0a6 in heap_getnext ()
> #6  0x00000000005343e6 in ?? ()
> #7  0x000000000053521a in DoCopy ()
> #8  0x000000000066112d in standard_ProcessUtility ()
> #9  0x00007fb96c4df261 in ?? () from /usr/pgsql-9.3/lib/pg_stat_statements.so

This one is stalling on the buffer content lock while trying to read the
table to dump it.

> SELECTS:
>
> query | SELECT room_contract_service_code,
>       |   room_contract_service_description, room_contract_service_code_id
>       | FROM room_contract_service_code
>       | WHERE room_contract_service_code.supplierid = $1
>       |   AND NOT inactive
>       | ORDER by room_contract_service_description
>
> (gdb) bt
> #0  0x00007fb9740f5187 in semop () from /lib64/libc.so.6
> #1  0x0000000000609397 in PGSemaphoreLock ()
> #2  0x000000000064e871 in LWLockAcquire ()
> #3  0x000000000048d05d in index_fetch_heap ()
> #4  0x000000000048d22e in index_getnext ()

And this one is most likely stalling on the same buffer lock.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera wrote:

> > (gdb) bt
> > #0  0x00007fb9740eb5b3 in __select_nocancel () from /lib64/libc.so.6
> > #1  0x000000000075ad7a in pg_usleep ()
> > #2  0x00000000004ae2d4 in GetMultiXactIdMembers ()
> > #3  0x000000000047f6cc in heap_prepare_freeze_tuple ()
> > #4  0x000000000057d036 in lazy_vacuum_rel ()
>
> This stall in GetMultiXactIdMembers should only occur when somebody is
> writing the multixact immediately following. It certainly should not
> sleep for long. I'm not sure what's happening here but this is probably
> where the problem is. While waiting, it holds the buffer content lock
> in exclusive mode.
>
> Can you install debug symbols and pageinspect?

If you have debug symbols, please attach to this process, then do in GDB:

bt full

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Dinesh Bhandary wrote:

> We used psotgres 9.3.4 to pg_upgrade from 9.1. We just upgraded
> binaries to 9.3.5 just last week ? Would this cause any issues with
> MultiXid issue.

So you ran 200 million write transactions in one week?

Do you still have the 9.1 data from before the upgrade? If so, would you
paste pg_controldata output from that?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Hi Alvaro,

We did a manual vacuum and reindex of room_contract_service_code and it
finished quickly. We still have autovacuum disabled.

1) jetblue=# select oid::regclass, relfrozenxid, relminmxid from pg_class
   where relname = 'room_contract_service_code';

            oid             | relfrozenxid | relminmxid
----------------------------+--------------+------------
 room_contract_service_code |    953567787 |      20784
(1 row)

jetblue=# select datname, datfrozenxid, datminmxid from pg_database
   where datname = 'jetblue';

 datname | datfrozenxid | datminmxid
---------+--------------+------------
 jetblue |    870700471 |          1
(1 row)

2) > Latest checkpoint's NextMultiXactId:  23431
   > Latest checkpoint's NextMultiOffset:  5678
   > Latest checkpoint's oldestMultiXid:   1
   > Latest checkpoint's oldestMulti's DB: 16423
   >
   > It says the oldest multi is 1, so the database should not have any
   > values between 1 and 23431 that correspond to pg_upgrade'd multixact
   > values ... so what is going on here? Unless the recent mucking with
   > pg_upgrade to handle multixacts got something wrong.

We used postgres 9.3.4 to pg_upgrade from 9.1; we upgraded the binaries
to 9.3.5 just last week. Would this cause any MultiXactId issues?

Thanks
Dinesh
Dinesh Bhandary wrote:

> Here is the pg_controldata output from 9.1. Thanks. Dinesh
>
> $ /usr/pgsql-9.1/bin/pg_controldata data
> Time of latest checkpoint:            Sat 19 Jul 2014 07:02:38 AM GMT

Ah, so you upgraded a month ago, not one week ago, right?

> Latest checkpoint's NextMultiXactId:  20783

I'm betting this is the problem. The 9.3 cluster should have
oldestMulti=20783, but instead it has oldestMulti=1. Either pg_upgrade
is misreading the nextMulti value from the old cluster, or pg_resetxlog
is failing to set oldestMulti appropriately ... or something funny
happened to pg_control in the meantime. I just ran a quick test, and I
see that pg_upgrade is setting oldestMulti correctly (to the old
cluster's NextMultiXactId value) in the new cluster when upgrading from
9.1.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Here is the pg_controldata output from 9.1. Thanks.

Dinesh

$ /usr/pgsql-9.1/bin/pg_controldata data
pg_control version number:            903
Catalog version number:               201105231
Database system identifier:           5720710153987476471
Database cluster state:               in production
pg_control last modified:             Sat 19 Jul 2014 07:05:08 AM GMT
Latest checkpoint location:           FC0/25643788
Prior checkpoint location:            FC0/169F2238
Latest checkpoint's REDO location:    FC0/1D919830
Latest checkpoint's TimeLineID:       1
Latest checkpoint's NextXID:          0/950483638
Latest checkpoint's NextOID:          1146307754
Latest checkpoint's NextMultiXactId:  20783
Latest checkpoint's NextMultiOffset:  42669
Latest checkpoint's oldestXID:        750608965
Latest checkpoint's oldestXID's DB:   16411
Latest checkpoint's oldestActiveXID:  0
Time of latest checkpoint:            Sat 19 Jul 2014 07:02:38 AM GMT
Minimum recovery ending location:     0/0
Backup start location:                0/0
Current wal_level setting:            minimal
Current max_connections setting:      1024
Current max_prepared_xacts setting:   0
Current max_locks_per_xact setting:   256
Maximum data alignment:               8
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        1996
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value

-----Original Message-----
From: Alvaro Herrera [mailto:alvherre@2ndquadrant.com]
Sent: Wednesday, August 27, 2014 8:48 AM
To: Dinesh Bhandary
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #11264: Auto vacuum wraparound job blocking everything

Dinesh Bhandary wrote:

> We used psotgres 9.3.4 to pg_upgrade from 9.1. We just upgraded
> binaries to 9.3.5 just last week ? Would this cause any issues with
> MultiXid issue.

So you ran 200 million write transactions in one week?

Do you still have the 9.1 data from before the upgrade? If so, would you
paste pg_controldata output from that?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Dinesh Bhandary wrote:

> Yes, it was upgraded a month ago using postgresql 9.3.4 binaries. It was
> just patched up last week with 9.3.5.

Ok, that makes a lot more sense.

> I checked all the other servers which we upgraded using postgres 9.3.5
> binaries, oldestMultiXid is other than 1 and it seems to be have picked
> the right number. But the server in question was upgraded using postgres
> 9.3.4 binaries. This sounds suspicious. Could it be pg_upgrade in 9.3.4 is
> buggy.

Well, yes, 9.3.4 had a bug fixed by this commit:

Author: Bruce Momjian <bruce@momjian.us>
Branch: master [a61daa14d] 2014-07-02 15:29:38 -0400
Branch: REL9_4_STABLE [b446a384b] 2014-07-02 15:29:38 -0400
Branch: REL9_3_STABLE Release: REL9_3_5 [3d2e18510] 2014-07-02 15:29:38 -0400

    pg_upgrade: preserve database and relation minmxid values

    Also set these values for pre-9.3 old clusters that don't have values to
    preserve.

    Analysis by Alvaro

    Backpatch through 9.3

> How do we fix the current issue with this one server? Is there an easy fix?
> Thanks.

As far as I am aware, you should

UPDATE pg_database SET datminmxid=20783

and that should fix it. The oldestMulti value in pg_control would get
updated by itself some time later. If you experience stalls before
oldestMulti fixes itself, you could stop the server (cleanly!) and then
pg_resetxlog -m x,y where x is the correct nextMulti value from
pg_controldata and y is 20783.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
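[For readers following along: a sketch of that first step, run as superuser. The value 20783 is specific to this thread's cluster (the 9.1 cluster's NextMultiXactId at upgrade time); on any other cluster the correct number must be determined first, so treat it as an assumption.]

```sql
-- Hedged sketch, not a general recipe: 20783 applies only to this cluster.
-- Run as superuser in the affected cluster.
UPDATE pg_database SET datminmxid = '20783';

-- Verify the change took:
SELECT datname, datminmxid FROM pg_database ORDER BY datname;
```

This requires a running server; the pg_resetxlog alternative mentioned above is for a cleanly stopped one.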
Yes, it was upgraded a month ago using postgresql 9.3.4 binaries. It was
just patched up last week with 9.3.5.

I checked all the other servers which we upgraded using postgres 9.3.5
binaries; oldestMultiXid is other than 1 and seems to have picked up the
right number. But the server in question was upgraded using postgres
9.3.4 binaries. This sounds suspicious. Could it be that pg_upgrade in
9.3.4 is buggy?

How do we fix the current issue with this one server? Is there an easy fix?

Thanks.
Dinesh

On 8/27/14, 9:45 AM, "Alvaro Herrera" <alvherre@2ndquadrant.com> wrote:

> Dinesh Bhandary wrote:
>> Here is the pg_controldata output from 9.1. Thanks. Dinesh
>>
>> $ /usr/pgsql-9.1/bin/pg_controldata data
>> Time of latest checkpoint:            Sat 19 Jul 2014 07:02:38 AM GMT
>
> Ah, so you upgraded a month ago, not one week ago, right?
>
>> Latest checkpoint's NextMultiXactId:  20783
>
> I'm betting this is the problem. The 9.3 cluster should have
> oldestMulti=20783, but instead it has oldestMulti=1. Either pg_upgrade
> is misreading the nextMulti value from the old cluster, or pg_resetxlog
> is failing to set oldestMulti appropriately ... or something funny
> happened to pg_control in the meantime. I just ran a quick test, and I
> see that pg_upgrade is setting oldestMulti correctly (to the old
> cluster's NextMultiXactId value) in the new cluster when upgrading from
> 9.1.
>
> -- 
> Álvaro Herrera                http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
Hi Alvaro,

Thank you so much for jumping onto this issue. I greatly appreciate it.
We will test the fix you mentioned in a test environment first; I will
let you know when we are done with testing. Then we can update the
production db.

Dinesh

-----Original Message-----
From: Alvaro Herrera [mailto:alvherre@2ndquadrant.com]
Sent: Wednesday, August 27, 2014 10:51 AM
To: Dinesh Bhandary
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #11264: Auto vacuum wraparound job blocking everything

Dinesh Bhandary wrote:

> Yes, it was upgraded a month ago using postgresql 9.3.4 binaries. It
> was just patched up last week with 9.3.5.

Ok, that makes a lot more sense.

> I checked all the other servers which we upgraded using postgres
> 9.3.5 binaries, oldestMultiXid is other than 1 and it seems to be have
> picked the right number. But the server in question was upgraded using
> postgres 9.3.4 binaries. This sounds suspicious. Could it be pg_upgrade
> in 9.3.4 is buggy.

Well, yes, 9.3.4 had a bug fixed by this commit:

Author: Bruce Momjian <bruce@momjian.us>
Branch: master [a61daa14d] 2014-07-02 15:29:38 -0400
Branch: REL9_4_STABLE [b446a384b] 2014-07-02 15:29:38 -0400
Branch: REL9_3_STABLE Release: REL9_3_5 [3d2e18510] 2014-07-02 15:29:38 -0400

    pg_upgrade: preserve database and relation minmxid values

    Also set these values for pre-9.3 old clusters that don't have values to
    preserve.

    Analysis by Alvaro

    Backpatch through 9.3

> How do we fix the current issue with this one server? Is there an easy fix?
> Thanks.

As far as I am aware, you should

UPDATE pg_database SET datminmxid=20783

and that should fix it. The oldestMulti value in pg_control would get
updated by itself some time later. If you experience stalls before
oldestMulti fixes itself, you could stop the server (cleanly!) and then
pg_resetxlog -m x,y where x is the correct nextMulti value from
pg_controldata and y is 20783.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Dinesh Bhandary wrote:

> > and that should fix it. The oldestMulti value in pg_control would get
> > updated by itself some time later. If you experience stalls before
> > oldestMulti fixes itself, you could stop the server (cleanly!) and then
> > pg_resetxlog -m x,y where x is the correct nextMulti value from
> > pg_controldata and y is 20783.
>
> Are you referreing to NextMultiXactId of current pg_controldata (from
> postgres 9.3.5) for x?
> There is also NextMultiOffset, not sure if you meant to reset this.
> Please confirm.

No, leave the offset alone. I meant that the "x" value (i.e. the part
before the comma) should stay unchanged; you only need to change the
"oldest" value, which is "y" (the part after the comma).

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
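[Spelled out as a shell sketch against a cleanly stopped 9.3 cluster. The data-directory path is an assumption, and 20783 is this thread's value; both must be adjusted for any other installation.]

```shell
# Hedged sketch, not a recipe: requires the server to be stopped cleanly first,
#   e.g.  pg_ctl -D "$PGDATA" stop -m fast
PGDATA=/var/lib/pgsql/9.3/data          # assumed data directory -- adjust

# x = the cluster's current NextMultiXactId (left unchanged),
# y = the correct oldestMulti (20783 in this thread).
NEXT_MULTI=$(/usr/pgsql-9.3/bin/pg_controldata "$PGDATA" |
             awk -F': *' '/NextMultiXactId/ {print $2}')
/usr/pgsql-9.3/bin/pg_resetxlog -m "${NEXT_MULTI},20783" "$PGDATA"
```

The awk line simply extracts the number after "NextMultiXactId:" so that x is passed through unchanged, as Álvaro describes.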
Dinesh Bhandary wrote:

> I think the fix is working. Now I am able to kill copy job while auto
> vacuum job is going on that table.

I hope this killing of autovacuum jobs is not continued operational
practice. You need whole-table vacuums of all your tables once in a
while. (Note you can increase the "freeze_table_age" parameters if you
want to be forced to do that less frequently. See the docs.)

> Since I am setting pg_resetxlog on the primary, do I need to do that on
> the slave too? Please let me know

No, the slave should fix itself once the checkpoint record is replayed.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
> and that should fix it. The oldestMulti value in pg_control would get
> updated by itself some time later. If you experience stalls before
> oldestMulti fixes itself, you could stop the server (cleanly!) and then
> pg_resetxlog -m x,y where x is the correct nextMulti value from
> pg_controldata and y is 20783.

Are you referring to the NextMultiXactId of the current pg_controldata
(from postgres 9.3.5) for x?
There is also NextMultiOffset; not sure if you meant to reset this.
Please confirm.

Thanks,
Dinesh
I think the fix is working. Now I am able to kill the copy job while an
autovacuum job is running on that table.

Since I am running pg_resetxlog on the primary, do I need to do that on
the slave too? Please let me know.

Thanks.
Dinesh

On 8/29/14, 12:17 PM, "Alvaro Herrera" <alvherre@2ndquadrant.com> wrote:

> Dinesh Bhandary wrote:
>> and that should fix it. The oldestMulti value in pg_control would get
>> updated by itself some time later. If you experience stalls before
>> oldestMulti fixes itself, you could stop the server (cleanly!) and then
>> pg_resetxlog -m x,y where x is the correct nextMulti value from
>> pg_controldata and y is 20783.
>>
>> Are you referreing to NextMultiXactId of current pg_controldata (from
>> postgres 9.3.5) for x?
>> There is also NextMultiOffset, not sure If you meant to reset this.
>> Please confirm.
>
> No, leave the offset alone. I meant that the "x" value (i.e. the part
> before the comma) should stay unchanged; you only need to change the
> "oldest" value, which is "y" (part after the comma).
>
> -- 
> Álvaro Herrera                http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
Dinesh Bhandary wrote:

> Hi Alvaro,
>
> I was killing to reproduce the bug. Once it is fixed we have no need to
> kill these jobs.
> Now we still have this production server not running in auto vacuum till
> we apply the fix during next maintenance window. Is running vacuum
> manually safe, or we may run into the same issue as auto vacuum tx
> wraparound job? I am thinking max_freeze set to 500 million, manual
> vacuum won't freeze xids, and it may be ok to run. But I am not completely
> sure and worried about the same issue happening in production even with
> manual vacuum. Please let me know.

Yes, that seems correct. I think you're only running into
for-wraparound vacuums because the relfrozenxid for that table is above
the 200 million that IIRC is the default for max_freeze_age. You can
just set it to the maximum of 2000 million; the only downside is disk
space used by pg_clog and pg_multixact.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
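[For reference, the settings being discussed look like this as a postgresql.conf fragment. The 500 million figure mirrors what Dinesh set as a stopgap, not a recommendation; note that changing autovacuum_freeze_max_age takes effect only at server start.]

```
# postgresql.conf fragment -- values from this thread, not recommendations
autovacuum_freeze_max_age = 500000000   # default 200000000, max 2000000000; restart required
vacuum_freeze_table_age   = 150000000   # default; governs when vacuum scans the whole table
```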
Dinesh Bhandary wrote:

> What you are saying is vacuum without tx wraparound won't cause the
> blocking issue we had experienced.

Yep, that's correct. You're only seeing an issue because vacuum (manual
or auto) is trying to freeze the tuple. If the tuple is not considered
old enough to need freezing, it will not cause the system to stall.
Once you get the oldestMulti value fixed, the freezing routine will be
able to look at the tuple, notice that the multixact in it is older than
the upgrade, and decide it can just rip it out without trying to
"resolve" it.

> If that's the case I can just change the auto vacuum on (which can be
> changed on the fly), and I had already bumped up max_freeze_settings to
> 500 mill for now. We will change that to lower number in the next
> maintenance window.

Right -- AFAICS that should work.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
What you are saying is vacuum without tx wraparound won't cause the
blocking issue we had experienced.

If that's the case I can just change the auto vacuum on (which can be
changed on the fly), and I had already bumped up max_freeze_settings to
500 mill for now. We will change that to lower number in the next
maintenance window.

Thanks
Dinesh

On 8/29/14, 2:25 PM, "Alvaro Herrera" <alvherre@2ndquadrant.com> wrote:

> Dinesh Bhandary wrote:
>> Hi Alvaro,
>>
>> I was killing to reproduce the bug. Once it is fixed we have no need to
>> kill these jobs.
>> Now we still have this production server not running in auto vacuum till
>> we apply the fix during next maintenance window. Is running vacuum
>> manually safe, or we may run into the same issue as auto vacuum tx
>> wraparound job? I am thinking max_freeze set to 500 million, manual
>> vacuum won't freeze xids, and it may be ok to run. But I am not completely
>> sure and worried about the same issue happening in production even with
>> manual vacuum. Please let me know.
>
> Yes, that seems correct. I think you're only running into
> for-wraparound vacuums because the relfrozenxid for that table is above
> the 200 millions that IIRC are the default for max_freeze_age. You can
> just set it to the maximum of 2000 million; the only downside is disk
> space used by pg_clog and pg_multixact.
>
> -- 
> Álvaro Herrera                http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
Hi Alvaro,

I was killing to reproduce the bug. Once it is fixed we have no need to
kill these jobs.
We still have this production server running without autovacuum until we
apply the fix during the next maintenance window. Is running vacuum
manually safe, or may we run into the same issue as the autovacuum tx
wraparound job? I am thinking that with max_freeze set to 500 million, a
manual vacuum won't freeze xids, and it may be ok to run. But I am not
completely sure and worried about the same issue happening in production
even with manual vacuum. Please let me know.

Thanks a lot for your help. Much appreciated.

Dinesh

On 8/29/14, 2:07 PM, "Alvaro Herrera" <alvherre@2ndquadrant.com> wrote:

> Dinesh Bhandary wrote:
>> I think the fix is working. Now I am able to kill copy job while auto
>> vacuum job is going on that table.
>
> I hope this killing of autovacuum jobs is not continued operational
> practice. You need whole-table vacuums of all your tables once in a
> while. (Note you can increase the "freeze_table_age" parameters if you
> want to be forced to do that less frequently. See the docs.)
>
>> Since I am setting pg_resetxlog on the primary, do I need to do that on
>> the slave too? Please let me know
>
> No, the slave should fix itself once the checkpoint record is replayed.
>
> -- 
> Álvaro Herrera                http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera <alvherre <at> 2ndquadrant.com> writes:

> Well, yes, 9.3.4 had a bug fixed by this commit:
>
> Author: Bruce Momjian <bruce <at> momjian.us>
> Branch: master [a61daa14d] 2014-07-02 15:29:38 -0400
> Branch: REL9_4_STABLE [b446a384b] 2014-07-02 15:29:38 -0400
> Branch: REL9_3_STABLE Release: REL9_3_5 [3d2e18510] 2014-07-02 15:29:38 -0400
>
>     pg_upgrade: preserve database and relation minmxid values
>
>     Also set these values for pre-9.3 old clusters that don't have values to
>     preserve.
>
>     Analysis by Alvaro
>
>     Backpatch through 9.3
>
> > How do we fix the current issue with this one server? Is there an easy fix?
> > Thanks.
>
> As far as I am aware, you should
>
> UPDATE pg_database SET datminmxid=20783
>
> and that should fix it. The oldestMulti value in pg_control would get
> updated by itself some time later. If you experience stalls before
> oldestMulti fixes itself, you could stop the server (cleanly!) and then
> pg_resetxlog -m x,y where x is the correct nextMulti value from
> pg_controldata and y is 20783.

We ran into this on two instances last night and I'm starting to worry
that many others may hit it as well. As far as I can tell, any postgres
instance that was upgraded to 9.3 by a pre-9.3.5 pg_upgrade is at risk,
as they all will have:

Latest checkpoint's oldestMultiXid:   1

The workaround above is good if you still have the old cluster around
from before the upgrade. We did not; that was all cleaned up months ago.
Which raises the question: how do you fix a 9.3 instance that has
oldestMultiXid = 1 without access to the pre-upgrade instance? That is,
where do you get the correct value of oldestMultiXid to set into
pg_database.datminmxid?

I took a guess that the oldest pg_class.relminmxid for all the tables in
all the databases would be ok and updated pg_database.datminmxid with
that.
That is, in each database I ran: select relname, relminmxid, age(relminmxid) from pg_class where relkind = 'r' and age(relminmxid) < 2147483647 order by 3 desc limit 1 ; And then used the oldest one to update pg_database.datminmxid. After a checkpoint and shutdown/restart I see that it was written to pg_controldata too. Afterwards I was able to run a vacuum freeze on the problem table. Questions: 1) Is the procedure above safe and effective for this, or did I just hose my databases? 2) If the procedure above is not safe, what do we do instead? 3) Is this likely to affect all 9.3.x<5 upgraded databases? If so, how do we fix it before too much damage happens in the wider world? -dg -- David Gould daveg <at> sonic.net If simplicity worked, the world would be overrun with insects.
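The at-risk condition David describes shows up as "Latest checkpoint's oldestMultiXid: 1" in pg_controldata output. A minimal sketch of checking for it by parsing that output (the field labels follow the pg_controldata lines quoted in this thread; the sample text here is invented for illustration, and in practice you would capture the real output of `pg_controldata $PGDATA`):

```python
import re

def parse_controldata(text):
    """Parse the 'Name:  value' lines of pg_controldata output into a dict."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"(.+?):\s+(.*)$", line)
        if m:
            fields[m.group(1).strip()] = m.group(2).strip()
    return fields

# Invented sample resembling pg_controldata output on an affected cluster.
sample = """\
Latest checkpoint's NextMultiXactId:  20783
Latest checkpoint's oldestMultiXid:   1
"""

fields = parse_controldata(sample)
# oldestMultiXid stuck at 1 is the signature of the pre-9.3.5 pg_upgrade bug.
at_risk = fields["Latest checkpoint's oldestMultiXid"] == "1"
```

This only flags a suspect cluster; it does not by itself prove damage, since a cluster with essentially no multixact history could legitimately still be at 1.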
Re: BUG #11264: Auto vacuum wraparound job blocking everything - trying again
From
David Gould
Date:
I sent the following a couple of days ago but it seems to have been overlooked. I think this issue may impact a lot of 9.3 sites that have upgraded from earlier versions. Please take a look at this.

-dg

On Tue, 30 Sep 2014 00:44:11 +0000 (UTC) David Gould <daveg@sonic.net> wrote:

> Alvaro Herrera <alvherre <at> 2ndquadrant.com> writes:
>
> > Well, yes, 9.3.4 had a bug fixed by this commit:
> >
> > Author: Bruce Momjian <bruce <at> momjian.us>
> > Branch: master [a61daa14d] 2014-07-02 15:29:38 -0400
> > Branch: REL9_4_STABLE [b446a384b] 2014-07-02 15:29:38 -0400
> > Branch: REL9_3_STABLE Release: REL9_3_5 [3d2e18510] 2014-07-02 15:29:38 -0400
> >
> > pg_upgrade: preserve database and relation minmxid values
> >
> > Also set these values for pre-9.3 old clusters that don't have values to
> > preserve.
> >
> > Analysis by Alvaro
> >
> > Backpatch through 9.3
> >
> > > How do we fix the current issue with this one server? Is there an easy fix?
> > > Thanks.
> >
> > As far as I am aware, you should
> >
> > UPDATE pg_database SET datminmxid=20783
> >
> > and that should fix it. The oldestMulti value in pg_control would get
> > updated by itself some time later. If you experience stalls before
> > oldestMulti fixes itself, you could stop the server (cleanly!) and then
> > pg_resetxlog -m x,y where x is the correct nextMulti value from
> > pg_controldata and y is 20783.
>
> We ran into this on two instances last night and I'm starting to worry that
> many others may hit it as well. As far as I can tell any Postgres instance
> that was upgraded to 9.3 by a pre-9.3.5 pg_upgrade is at risk, as they all
> will have:
>
> Latest checkpoint's oldestMultiXid: 1
>
> The workaround above is good if you still have the old cluster around from
> before the upgrade. We did not; that was all cleaned up months ago. Which
> raises the question: how do you fix a 9.3 instance that has oldestMultiXid =
> 1 without access to the pre-upgrade instance? That is, where do you get the
> correct value of oldestMultiXid to set into pg_database.datminmxid?
>
> I took a guess that the oldest pg_class.relminmxid for all the tables in all
> the databases would be ok and updated pg_database.datminmxid with that. That
> is, in each database I ran:
>
> select relname, relminmxid, age(relminmxid)
> from pg_class
> where relkind = 'r'
> and age(relminmxid) < 2147483647
> order by 3 desc limit 1;
>
> And then used the oldest one to update pg_database.datminmxid. After a
> checkpoint and shutdown/restart I see that it was written to pg_controldata
> too. Afterwards I was able to run a vacuum freeze on the problem table.
>
> Questions:
>
> 1) Is the procedure above safe and effective for this, or did I just hose my
> databases?
>
> 2) If the procedure above is not safe, what do we do instead?
>
> 3) Is this likely to affect all 9.3.x<5 upgraded databases? If so, how do we
> fix it before too much damage happens in the wider world?

--
David Gould daveg <at> sonic.net
If simplicity worked, the world would be overrun with insects.
David Gould wrote:

> We ran into this on two instances last night and I'm starting to worry that
> many others may hit it as well. As far as I can tell any postgres instance
> that was upgraded to 9.3 by a pre-9.3.5 pg_upgrade is at risk as they all
> will have:
>
> Latest checkpoint's oldestMultiXid: 1
>
> The workaround above is good if you still have the old cluster around from
> before the upgrade. We did not, that was all cleaned up months ago. Which
> raises the question: how do you fix a 9.3 instance that has oldestMultiXid =
> 1 without access to the pre-upgrade instance. That is, where do you get the
> correct value of oldestMultiXid to set into pg_database.datminmxid?
>
> I took a guess that the oldest pg_class.relminmxid for all the tables in all
> the databases would be ok and updated pg_database.datminmxid with that. That
> is, in each database I ran:
>
> select relname, relminmxid, age(relminmxid)
> from pg_class
> where relkind = 'r'
> and age(relminmxid) < 2147483647
> order by 3 desc limit 1;

Unfortunately, the age() function is only valid for transaction IDs, not multixacts. In certain cases the values returned might be sensible, but they don't have to be, and if the counters are not relatively close to one another, most likely they won't be sensible. A new function was added to 9.4, but that doesn't help those on 9.3:
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=36ad1a87a368d837d5c2260b2fd2ff83e976425a
(I guess you could have an extension containing that function.) My bet is that this is the reason you had to add the condition that the value is less than 2^31 - 1.

Offhand, other than the way to obtain the multixact age, the procedure seems sensible.

A better way, not involving mxid_age(), would be to use pg_controldata to extract the current value of the mxid counter, then subtract the current relminmxid from that value.

> 3) Is this likely to affect all 9.3.x<5 upgraded databases?

Yes.
-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
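The subtraction Alvaro suggests has to be done with modulo-2^32 arithmetic, because MultiXactIds are unsigned 32-bit counters that wrap around. A minimal sketch of that calculation (my own illustration, not code from the thread; the server additionally skips special low values at wraparound, so treat this as an approximation of the true age):

```python
MXID_MASK = 0xFFFFFFFF  # MultiXactIds are unsigned 32-bit counters

def mxid_age(next_mxid: int, relminmxid: int) -> int:
    """Distance from relminmxid up to the current NextMultiXactId
    (from pg_controldata), computed modulo 2^32 so the result stays
    meaningful even after the counter has wrapped."""
    return (next_mxid - relminmxid) & MXID_MASK

# Ordinary case: the counter is well ahead of the relation's minimum.
print(mxid_age(20783, 20000))        # 783

# Wraparound case: the counter has passed 2^32 and started over.
print(mxid_age(5, 0xFFFFFFFF - 2))   # 8
```

A bogus relminmxid (one "in the future" relative to NextMultiXactId) shows up as an implausibly huge age close to 2^32, which matches David's filter excluding ages of 2^31 - 1 and above.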
On 01.10.2014 23:32, Alvaro Herrera wrote:

> David Gould wrote:
>
>> We ran into this on two instances last night and I'm starting to worry that
>> many others may hit it as well. As far as I can tell any postgres instance
>> that was upgraded to 9.3 by a pre-9.3.5 pg_upgrade is at risk as they all
>> will have:
>>
>> Latest checkpoint's oldestMultiXid: 1
>>
>> The workaround above is good if you still have the old cluster around from
>> before the upgrade. We did not, that was all cleaned up months ago. Which
>> raises the question: how do you fix a 9.3 instance that has oldestMultiXid =
>> 1 without access to the pre-upgrade instance. That is, where do you get the
>> correct value of oldestMultiXid to set into pg_database.datminmxid?
>>
>> I took a guess that the oldest pg_class.relminmxid for all the tables in all
>> the databases would be ok and updated pg_database.datminmxid with that. That
>> is, in each database I ran:
>>
>> select relname, relminmxid, age(relminmxid)
>> from pg_class
>> where relkind = 'r'
>> and age(relminmxid) < 2147483647
>> order by 3 desc limit 1;
>
> Unfortunately, the age() function is only valid for transaction IDs, not
> multixacts. In certain cases the values returned might be sensible, but
> they don't have to be, and if the counters are not relatively close to
> one another, most likely they won't be sensible. A new function was
> added to 9.4, but that doesn't help those on 9.3:
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=36ad1a87a368d837d5c2260b2fd2ff83e976425a
> (I guess you could have an extension containing that function.)
> My bet is that this is the reason you had to add the condition that the
> value is less than 2^31 - 1.
>
> Offhand, other than the way to obtain the multixact age, the procedure
> seems sensible.
>
> A better way, not involving mxid_age(), would be to use pg_controldata to
> extract the current value of the mxid counter, then subtract the current
> relminmxid from that value.
>> 3) Is this likely to affect all 9.3.x<5 upgraded databases?
>
> Yes.

Hi,

I have 700 databases which could be affected, and I don't want to dump and restore all of them, so is there a proper way to detect whether a given one is affected or not? E.g. a vacuum of all tables, or something else? And is a dump, init, restore cycle a way to make 100% sure this bug is fixed?

regards
Bernhard Schrader
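Along the lines of the detection question: since the pre-9.3.5 pg_upgrade left minmxid values unpreserved at 1, one rough heuristic is to scan pg_database for datminmxid = 1. A hypothetical sketch, assuming you have already fetched the (datname, datminmxid) pairs, e.g. with `psql -Atc "select datname, datminmxid from pg_database"` (this flags candidates only; datminmxid = 1 can also be legitimate on a cluster with essentially no multixact activity, so it is a prompt to investigate, not proof of damage):

```python
def suspect_databases(rows):
    """Given (datname, datminmxid) pairs from pg_database, return the
    names whose datminmxid is 1 -- the value an unpreserving pg_upgrade
    left behind, and a candidate for the UPDATE fix discussed above."""
    return [name for name, minmxid in rows if int(minmxid) == 1]

# Invented sample rows for illustration.
rows = [("template1", "1"), ("jetblue", "1"), ("healthy", "20783")]
print(suspect_databases(rows))  # ['template1', 'jetblue']
```

For any database this flags, the thread's remedy is to set pg_database.datminmxid to a defensible value (oldest valid relminmxid, or the pg_resetxlog -m route) rather than to dump and restore everything.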