Thread: Error XX000 After pg11 upgrade

Error XX000 After pg11 upgrade

From
Simon Windsor
Date:
Hi

I have just upgraded a Db from pg 10.5 to pg11.4 and almost immediately we are seeing errors like

ERROR,XX000,"cache lookup failed for type 22079"

When inserting into a a partitioned table. About 30% of inserts are reporting this error.

How do I determine what type 22079 refers to? I have checked pg_attributes and pg_classes without success? What would be causing this error?

The basic table layout is

objects
    objects_2010
    objects_2011
-----
    objects_2018
    objects_2019

And an insert trigger uses the current date to dertmine where the object is stored.

Any ideas or suggestions would be appreciated.

Simon
--
Simon

Simon Windsor
Eml: simon.windsor@cornfield.org.uk
Tel: 01454 617689
Mob: 07960 321599

Re: Error XX000 After pg11 upgrade

From
Tom Lane
Date:
Simon Windsor <simon.windsor@cornfield.me.uk> writes:
> I have just upgraded a Db from pg 10.5 to pg11.4 and almost immediately we
> are seeing errors like
> ERROR,XX000,"cache lookup failed for type 22079"
> When inserting into a a partitioned table. About 30% of inserts are
> reporting this error.

Hmm ... can you show the full schema (eg, psql \d+ output) for the
partitioned table and its children?  Is there any pattern to the
failing inserts, eg do they all resolve as inserts to the same
partition(s)?

> ... And an insert trigger uses the current date to dertmine where the
> object is stored.

In other words, this isn't actual partitioning as introduced in v10,
but a hand-rolled equivalent?

> How do I determine what type 22079 refers to?

Presumably, the problem is that that type OID *doesn't* refer to anything
any more.  You should be asking "where is this dangling reference coming
from?".  It's possibly hiding in the partitioning expression(s) of
this partitioned table, but there's no way to tell with this amount
of info.

How did you do the upgrade exactly?

            regards, tom lane



Re: Error XX000 After pg11 upgrade

From
Simon Windsor
Date:
Hi

The Full system used to in an Oracle DB and was ported to Postgres 9.5 
about 2+ years ago, and partitioned using inheritance tables.

Since then pg_upgrade has been used to upgrade to pg10 (with apt upgrade 
to take to 10.5 occasionally).

Last week, pg_upgrade was againn used to upgrade to pg11.4.

Since then, large bulk inserts of configuration changes are failing with 
this Error, but adhoc and small changes are working ok.

The actual error is reported by a Java process

Caused by: org.postgresql.util.PSQLException: ERROR: cache lookup failed 
for type 22079
   Where: SQL statement "insert into configObjectsFull_2019 values (new.*)"
PL/pgSQL function configobjectsfull_insert_trigger() line 28 at SQL 
statement
     at 
org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2310)
     at

The DB Tableand Trigger are attached.

As you can see, the id and timestamp are set on insert, and are used to 
determine the partition used.

This issue started after the upgrade to pg11, pg10 and pg9 had no problems.

Any ideas would be appreciated.

Simon

On 15/08/2019 16:31, Tom Lane wrote:
> Simon Windsor <simon.windsor@cornfield.me.uk> writes:
>> I have just upgraded a Db from pg 10.5 to pg11.4 and almost immediately we
>> are seeing errors like
>> ERROR,XX000,"cache lookup failed for type 22079"
>> When inserting into a a partitioned table. About 30% of inserts are
>> reporting this error.
> Hmm ... can you show the full schema (eg, psql \d+ output) for the
> partitioned table and its children?  Is there any pattern to the
> failing inserts, eg do they all resolve as inserts to the same
> partition(s)?
>
>> ... And an insert trigger uses the current date to dertmine where the
>> object is stored.
> In other words, this isn't actual partitioning as introduced in v10,
> but a hand-rolled equivalent?
>
>> How do I determine what type 22079 refers to?
> Presumably, the problem is that that type OID *doesn't* refer to anything
> any more.  You should be asking "where is this dangling reference coming
> from?".  It's possibly hiding in the partitioning expression(s) of
> this partitioned table, but there's no way to tell with this amount
> of info.
>
> How did you do the upgrade exactly?
>
>             regards, tom lane

-- 
Simon Windsor

Eml: simon.windsor@cornfield.me.uk
Tel: 01454 617689
Mob: 0755 197 9733


“There is nothing in the world that some man cannot make a little worse and sell a little cheaper, and he who considers
priceonly is that man's lawful prey.”
 


Attachment

Re: Error XX000 After pg11 upgrade

From
Gavin Flower
Date:
On 16/08/2019 04:35, Simon Windsor wrote:
> Hi
>
> The Full system used to in an Oracle DB and was ported to Postgres 9.5 
> about 2+ years ago, and partitioned using inheritance tables.
>
> Since then pg_upgrade has been used to upgrade to pg10 (with apt 
> upgrade to take to 10.5 occasionally).
>
> Last week, pg_upgrade was againn used to upgrade to pg11.4.
>
> Since then, large bulk inserts of configuration changes are failing 
> with this Error, but adhoc and small changes are working ok.
>
> The actual error is reported by a Java process
>
> Caused by: org.postgresql.util.PSQLException: ERROR: cache lookup 
> failed for type 22079
>   Where: SQL statement "insert into configObjectsFull_2019 values 
> (new.*)"
> PL/pgSQL function configobjectsfull_insert_trigger() line 28 at SQL 
> statement
>     at 
> org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2310)
>     at
>
> The DB Tableand Trigger are attached.
>
> As you can see, the id and timestamp are set on insert, and are used 
> to determine the partition used.
>
> This issue started after the upgrade to pg11, pg10 and pg9 had no 
> problems.
>
> Any ideas would be appreciated.


[...]

Probably will make no difference, but have you considered testing using 
pg 11.5?  It would at least rule out a lot of Red Herrings!


Cheers,
Gavin





Re: Error XX000 After pg11 upgrade

From
Tom Lane
Date:
Simon Windsor <simon.windsor@cornfield.me.uk> writes:
> Since then, large bulk inserts of configuration changes are failing with 
> this Error, but adhoc and small changes are working ok.

Might it be that things work as long as the trigger is only tasked with
redirecting to the same child table (or limited set of child tables)
within a particular insertion command?

I'm wondering if this could be related to bug #15913 --- which I just
fixed today, so maybe I just have it on the brain too much.  The
manifestation doesn't look quite the same, but given the way your
trigger is written, something about NEW.* changing type from one
call to the next might have something to do with it.

I also wonder how often you create/delete child tables.

            regards, tom lane



Re: Error XX000 After pg11 upgrade

From
Simon Windsor
Date:
Hi

Thanks for all the help, and a couple of offlist suggestions.

We have fixed the problem by copying all of the data (160GB) to a partitioned table, replacing the trigger with table column defaults for timestamp and sequence values.

As a result, all is working ok.

Thank you, once again

Simon

On Fri, 16 Aug 2019 at 01:28, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Simon Windsor <simon.windsor@cornfield.me.uk> writes:
> Since then, large bulk inserts of configuration changes are failing with
> this Error, but adhoc and small changes are working ok.

Might it be that things work as long as the trigger is only tasked with
redirecting to the same child table (or limited set of child tables)
within a particular insertion command?

I'm wondering if this could be related to bug #15913 --- which I just
fixed today, so maybe I just have it on the brain too much.  The
manifestation doesn't look quite the same, but given the way your
trigger is written, something about NEW.* changing type from one
call to the next might have something to do with it.

I also wonder how often you create/delete child tables.

                        regards, tom lane


--
Simon

Simon Windsor
Eml: simon.windsor@cornfield.org.uk
Tel: 01454 617689
Mob: 07960 321599