Re: Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated) - Mailing list pgsql-bugs

From Amit Kapila
Subject Re: Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated)
Date
Msg-id CAA4eK1+_pXTZZ837TdAnFmhdXLQTM3oxPyCmUo1VQcWOygn3CQ@mail.gmail.com
In response to Re: Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated)  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-bugs
On Tue, Apr 28, 2015 at 11:24 PM, Alvaro Herrera <alvherre@2ndquadrant.com>
wrote:
>
> Alvaro Herrera wrote:
>
>
> Pushed.  I chose find_multixact_start() as a name for this function.
>

I have tested the latest change to confirm that it fixes the
reported problem.  The results are below; to me it looks like the
reported problem is fixed.

I used the test program (explode_mxact_members) developed by Thomas
to reproduce the problem.  I started one transaction in a session.
After running the test for 3 to 4 hours with the parameters
explode_mxact_members 500 35000, I saw warning messages like the
ones below.  (Before the fix there were no such messages; the test
ran to completion but corrupted the database.)

WARNING:  database with OID 1 must be vacuumed before 358 more multixact
members are used
HINT:  Execute a database-wide VACUUM in that database, with reduced
vacuum_multixact_freeze_min_age and
vacuum_multixact_freeze_table_age settings.
WARNING:  database with OID 1 must be vacuumed before 310 more multixact
members are used
HINT:  Execute a database-wide VACUUM in that database, with reduced
vacuum_multixact_freeze_min_age and
vacuum_multixact_freeze_table_age settings.
WARNING:  database with OID 1 must be vacuumed before 261 more multixact
members are used
HINT:  Execute a database-wide VACUUM in that database, with reduced
vacuum_multixact_freeze_min_age and
vacuum_multixact_freeze_table_age settings.
WARNING:  database with OID 1 must be vacuumed before 211 more multixact
members are used
HINT:  Execute a database-wide VACUUM in that database, with reduced
vacuum_multixact_freeze_min_age and
vacuum_multixact_freeze_table_age settings.
WARNING:  database with OID 1 must be vacuumed before 160 more multixact
members are used
HINT:  Execute a database-wide VACUUM in that database, with reduced
vacuum_multixact_freeze_min_age and
vacuum_multixact_freeze_table_age settings.
explode_mxact_members: explode_mxact_members.c:38: main: Assertion
`PQresultStatus(res) == PGRES_TUPLES_OK'
failed.

After this I set vacuum_multixact_freeze_min_age and
vacuum_multixact_freeze_table_age to zero, ran VACUUM FREEZE on
template1 and postgres, and then issued a manual CHECKPOINT.
I then saw the following values in pg_database:

postgres=# select oid,datname,datminmxid from pg_database;
  oid  |  datname  | datminmxid
-------+-----------+------------
     1 | template1 |   17111262
 13369 | template0 |   17111262
 13374 | postgres  |   17111262
(3 rows)
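
For anyone repeating this, the freeze-and-checkpoint step described
above can be sketched roughly as follows.  This is only an
illustration: it assumes the two settings are changed per session
with SET (they could equally have been changed in postgresql.conf),
and it has to be run once while connected to each of template1 and
postgres, as a superuser.

```sql
-- Assumption: session-level SET; postgresql.conf plus a reload would also work.
SET vacuum_multixact_freeze_min_age = 0;
SET vacuum_multixact_freeze_table_age = 0;

-- Freeze every table in the database this session is connected to.
-- Repeat while connected to template1 and to postgres.
VACUUM FREEZE;

-- Force a checkpoint by hand.
CHECKPOINT;

-- Check how far each database's oldest multixact has advanced.
SELECT oid, datname, datminmxid FROM pg_database;
```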

I then started the test again as ./explode_mxact_members 500 35000,
but it failed immediately:
500 sessions connected...
Loop 0...
WARNING:  database with OID 13369 must be vacuumed before 12 more multixact
members are used
HINT:  Execute a database-wide VACUUM in that database, with reduced
vacuum_multixact_freeze_min_age and
vacuum_multixact_freeze_table_age settings.
WARNING:  database with OID 13369 must be vacuumed before 11 more multixact
members are used
HINT:  Execute a database-wide VACUUM in that database, with reduced
vacuum_multixact_freeze_min_age and
vacuum_multixact_freeze_table_age settings.
WARNING:  database with OID 13369 must be vacuumed before 9 more multixact
members are used
HINT:  Execute a database-wide VACUUM in that database, with reduced
vacuum_multixact_freeze_min_age and
vacuum_multixact_freeze_table_age settings.
explode_mxact_members: explode_mxact_members.c:38: main: Assertion
`PQresultStatus(res) == PGRES_TUPLES_OK'
failed.

At this point I was confused about why it failed the second time
even though I had run VACUUM FREEZE and CHECKPOINT.  I then waited
for a minute or two and ran VACUUM FREEZE again with the command below:
./vacuumdb -a -F
vacuumdb: vacuuming database "postgres"
vacuumdb: vacuuming database "template1"

At this point I verified that all files in the members directory
except one had been deleted.

After that, when I restarted the test, it ran fine and never
produced any warning messages, probably because the values of
vacuum_multixact_freeze_min_age and vacuum_multixact_freeze_table_age
were zero.

I am still not sure why it took some time for the members directory
to be cleaned, and for the test to be able to resume, after running
VACUUM FREEZE and CHECKPOINT.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
