Thread: ERROR: MultiXactId XXXXX has not been created yet -- apparent wraparound

ERROR: MultiXactId XXXXX has not been created yet -- apparent wraparound

From
Ioana Danes
Date:
Hello All,

I am running PostgreSQL 9.4.8 on CentOS 7, and a few days ago I started to get this error in two instances: "MultiXactId 32766 has not been created yet -- apparent wraparound".

1. On one database server, during the nightly task that runs "vacuum analyze". I upgraded to PostgreSQL 9.4.9 because it references a fixed bug that mentions this error, but I still get the same error when I run vacuum analyze. This database was created using pg_dump and pg_restore (not pg_upgrade) from the previous installation, which was still PostgreSQL 9.4.6 but on SLES instead of CentOS. I have only one PostgreSQL version installed on the server.

Autovacuum is turned on and I also have a task that runs vacuum analyze every night.

postgresql94-9.4.9-1PGDG.rhel7.x86_64
postgresql94-plperl-9.4.9-1PGDG.rhel7.x86_64
postgresql94-libs-9.4.9-1PGDG.rhel7.x86_64
postgresql94-server-9.4.9-1PGDG.rhel7.x86_64
postgresql94-contrib-9.4.9-1PGDG.rhel7.x86_64

psql (9.4.9)
=# vacuum analyze;
ERROR:  MultiXactId 32766 has not been created yet -- apparent wraparound

This server has only been running since June 2016.
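
For readers unfamiliar with the message: as far as I understand it, the server reports this error when a row refers to a MultiXactId that is not older than the next MultiXactId the cluster would assign, with the comparison done modulo 2^32. The Python sketch below only illustrates that comparison; it is not server code. The function names and the next-ID value are hypothetical (the real value is reported by pg_controldata as NextMultiXactId and was not posted in this thread).

```python
# Sketch of a wraparound-aware comparison for 32-bit MultiXactIds.
# Assumption: IDs are unsigned 32-bit counters compared modulo 2**32,
# so "a precedes b" means the signed 32-bit difference (a - b) is negative.


def multixact_precedes(a: int, b: int) -> bool:
    """Return True if ID a is logically older than ID b (modulo 2**32)."""
    diff = (a - b) & 0xFFFFFFFF      # unsigned 32-bit difference
    if diff >= 0x80000000:           # reinterpret as a signed 32-bit value
        diff -= 0x100000000
    return diff < 0


def looks_not_yet_created(multi: int, next_multi: int) -> bool:
    """Return True if multi is not older than the next ID to be assigned."""
    return not multixact_precedes(multi, next_multi)


if __name__ == "__main__":
    # Hypothetical next-ID value for a young cluster; not taken from the thread.
    next_multi = 100
    for multi in (32766, 1667854355):
        print(multi, looks_not_yet_created(multi, next_multi))
```

Under this assumption, an ID far ahead of the cluster's counter on a server that is only a few months old would point to a damaged row header rather than a genuine wraparound.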

2. The second instance of this error is on a new database server, during pg_restore, on a few CREATE INDEX, PRIMARY KEY and FOREIGN KEY statements (PostgreSQL 9.4.8 on CentOS 7). In this case I dropped and recreated the database, and the second try succeeded without errors.

postgresql-14.csv:2016-08-14 05:33:59.753 CST,"postgres","test",19555,"[local]",57b055d4.4c63,11,"CREATE INDEX",2016-08-14 05:28:20 CST,1/27230,5381,ERROR,XX000,"MultiXactId 1667854355 has not been created yet -- apparent wraparound",,,,,,"CREATE INDEX...;


Please let me know if I should provide more info.

Thank you in advance,
Ioana






Re: ERROR: MultiXactId XXXXX has not been created yet -- apparent wraparound

From
Adrian Klaver
Date:
On 08/15/2016 06:48 AM, Ioana Danes wrote:
> Please let me know if I should provide more info.

Are these the same VMs as in your previous error (Corrupted Data) post?

Is the first error you mention on the db3 server from the previous
error (Corrupted Data) post?


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: ERROR: MultiXactId XXXXX has not been created yet -- apparent wraparound

From
Ioana Danes
Date:
Hello Adrian,

On Mon, Aug 15, 2016 at 10:00 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> Are these the same VMs as in your previous error (Corrupted Data) post?
>
> Is the first error you mention on the db3 server from the previous error (Corrupted Data) post?

They are not the same servers, but they have a similar setup. Fortunately, they are QA and development servers.

On the same servers (QA) I also started to get some corruption messages from pg_dump.

2016-08-15 00:04:17.238 CST,"postgres","abrazo",7600,"[local]",57b15a62.1db0,6,"COPY",2016-08-15 00:00:02 CST,13/6895726,0,ERROR,XX000,"compressed data is corrupt",,,,,,"COPY ... TO stdout;",,,"pg_dump"
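
Log lines like the one above can be filtered mechanically. The following is a minimal Python sketch, assuming the 23-column csvlog layout of PostgreSQL 9.4 (error severity, SQLSTATE and message at positions 11, 12 and 13, counting from 0). The function name is hypothetical; the sample is the line quoted above.

```python
import csv
import io

# Field positions assumed from the 23-column csvlog layout of PostgreSQL 9.4.
ERROR_SEVERITY, SQL_STATE, MESSAGE, APPLICATION = 11, 12, 13, 22

# Log line copied from the message above.
SAMPLE = (
    '2016-08-15 00:04:17.238 CST,"postgres","abrazo",7600,"[local]",'
    '57b15a62.1db0,6,"COPY",2016-08-15 00:00:02 CST,13/6895726,0,ERROR,XX000,'
    '"compressed data is corrupt",,,,,,"COPY ... TO stdout;",,,"pg_dump"\n'
)


def internal_errors(stream):
    """Yield (log_time, database, message, application) for ERROR rows
    whose SQLSTATE is in class XX (internal error / corruption)."""
    for row in csv.reader(stream):
        if len(row) < 23:
            continue  # skip truncated or malformed rows
        if row[ERROR_SEVERITY] == "ERROR" and row[SQL_STATE].startswith("XX"):
            yield row[0], row[2], row[MESSAGE], row[APPLICATION]


if __name__ == "__main__":
    for hit in internal_errors(io.StringIO(SAMPLE)):
        print(hit)
```

Running the same filter over the csvlog files of each affected server would show whether the XX000 errors cluster on particular databases or start at a particular date.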

Also, yesterday I dropped the table that had the problem in the previous post and restored it from db1 (the good database), and all seemed good. But a few hours after I fixed that table, db4, which is the PITR slave of db3, started in production with a checksum error...

All the problems started to appear when we switched to CentOS with KVM. I have had instances on PostgreSQL 9.4 in production and QA on SLES with Xen for a few years and never had any kind of issue, and that is around 70 database servers. We have also been running Postgres for 12 years without major problems...


Thank you very much for your help,
ioana
 


Re: ERROR: MultiXactId XXXXX has not been created yet -- apparent wraparound

From
Adrian Klaver
Date:
On 08/15/2016 07:40 AM, Ioana Danes wrote:
> Hello Adrian,
>
> On Mon, Aug 15, 2016 at 10:00 AM, Adrian Klaver
> <adrian.klaver@aklaver.com> wrote:
>
>     Are these the same VM's as in your previous error(Corrupted Data) post?
>
>     Is the first error you mention on the db3 server from the previous
>     error(Corrupted Data)?
>
>
> They are not the same servers but they have similar setup. Fortunately
> they are qa and development.

I should have asked previously: are they on the same host machine?

Basically, something is corrupting data, and I am just trying to narrow it
down to hardware or software. I am looking to figure out whether the
problem(s) follow the physical machine or the software Postgres is running on.

>
> On the same servers (qa) I also started to get some corruption messages
> on pg_dump.
>
> 2016-08-15 00:04:17.238
> CST,"postgres","abrazo",7600,"[local]",57b15a62.1db0,6,"COPY",2016-08-15
> 00:00:02 CST,13/6895726,0,ERROR,XX000,"compressed data is
> corrupt",,,,,,"COPY ... TO stdout;",,,"pg_dump"
>
> Also yesterday I dropped the table that had the problem in the previous
> post and restored it from db1 (the good database) and all seemed good
> but few hours after I fixed that table, db4 that is a the PITR slave of
> db3 started in production with a checksum error...
>
> All the problems started to appear when we switched to CentOS with kvm.

On the same machines or new/different machines?

> I have instances on postgres 9.4 in production and qa on SLES with XEN
> since few years ago and never had any kind of issues, and that would be
> around 70 database servers. We also run on postgres since 12 years ago
> without major problems...

--
Adrian Klaver
adrian.klaver@aklaver.com


Re: ERROR: MultiXactId XXXXX has not been created yet -- apparent wraparound

From
Ioana Danes
Date:


On Mon, Aug 15, 2016 at 10:52 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 08/15/2016 07:40 AM, Ioana Danes wrote:
> > They are not the same servers but they have similar setup. Fortunately
> > they are qa and development.
>
> Should have asked previously, are they on the same host machine?

No, they are on different physical machines.

> Basically, something is corrupting data, just trying to narrow it down to hardware or software. Looking to figure out if the problem(s) follow the physical machine or the software the Postgres is running on.

I would say software, because on the same physical machines we had SLES + XEN running for years without any kind of corruption.



Re: ERROR: MultiXactId XXXXX has not been created yet -- apparent wraparound

From
Adrian Klaver
Date:
On 08/15/2016 08:05 AM, Ioana Danes wrote:
>
>
> On Mon, Aug 15, 2016 at 10:52 AM, Adrian Klaver
> <adrian.klaver@aklaver.com> wrote:
>
>
>
>
>         They are not the same servers but they have similar setup.
>         Fortunately
>         they are qa and development.
>
>
>     Should have asked previously, are they on the same host machine?
>
> No they are on different physical machines.
>
>
>     Basically, something is corrupting data, just trying to narrow it
>     down to hardware or software. Looking to figure out if the
>     problem(s) follow the physical machine or the software the Postgres
>     is running on.
>
> I would say software because on the same physical machines we had SLES +
> XEN running for years without any kind of corruption.
>

I would tend to agree, but one cannot discount the effects of age. It could
be that the OS switch happened just prior to some age-induced failure in
the hardware.

Still, do you have a running Postgres instance or have you tried running
a Postgres instance on CentOS that is not in a VM?


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: ERROR: MultiXactId XXXXX has not been created yet -- apparent wraparound

From
Ioana Danes
Date:


On Mon, Aug 15, 2016 at 11:40 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> I would tend to agree, but one cannot discount the effects of age. It could be that the OS switch happened just prior to some age-induced failure in the hardware.
>
> Still, do you have a running Postgres instance or have you tried running a Postgres instance on CentOS that is not in a VM?

Thank you for all your insights; they have been very helpful. I think the best strategy is to start testing each component, changing one component at a time...
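
The one-component-at-a-time approach can be written down as a small test plan. This is only a sketch in Python: the component names and values are taken from the setups mentioned in this thread (SLES or CentOS 7, Xen or KVM, or no VM at all), while the function name and the choice of SLES + Xen as baseline are assumptions made for illustration.

```python
from typing import Dict, List


def one_at_a_time(baseline: Dict[str, str],
                  alternatives: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Return test setups that each differ from the baseline in exactly one component."""
    plans = []
    for component, values in alternatives.items():
        for value in values:
            if value == baseline[component]:
                continue  # identical to the baseline, nothing to test
            variant = dict(baseline)   # copy, so the baseline stays untouched
            variant[component] = value
            plans.append(variant)
    return plans


if __name__ == "__main__":
    # Baseline: the combination reported as trouble-free in this thread.
    baseline = {"os": "SLES", "hypervisor": "Xen"}
    alternatives = {
        "os": ["CentOS 7"],
        "hypervisor": ["KVM", "none (bare metal)"],
    }
    for plan in one_at_a_time(baseline, alternatives):
        print(plan)
```

If corruption shows up in only one of the variants, the component changed in that variant is the first suspect.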

I will be back with the results of our tests.

Thanks again for your time and help,
ioana 

