Thread: Could not open relation with OID (table partitioning issue?)

Could not open relation with OID (table partitioning issue?)

From
"mark"
Date:

Hi all

 

Running Postgres 8.3.7

 

Are there any known issues with table partitioning and transactions having a child partition getting removed out from under running queries?

 

 

I got an error in my log about not being able to open a relation with OID XXXXX from a SELECT statement that ran about the same time that a cron job may have removed some of the older table partitions. (that may or may not have been visible to select query)

 

Right now I have been checking but I can’t find anything wrong with the database so it doesn’t look like I have any db corruption issues or the like currently.  there is some hate in the logs about it for a while and then the database was restarted.

 

My best guess is that a the clean up of old partitions yanked a table out from view…. Kind of like when you run a \d at the same time a table is dropped.

 

Thoughts? Comments? Ideas ?

 

 

 

-Mark

 

 

Re: Could not open relation with OID (table partitioning issue?)

From
peter@vfemail.net
Date:
At 09:22 AM 11/15/2010, mark wrote:
>Hi all
>
>Running Postgres 8.3.7
>
>Are there any known issues with table partitioning and transactions having a child partition getting removed out from
underrunning queries? 
>
>
>I got an error in my log about not being able to open a relation with OID XXXXX from a SELECT statement that ran about
thesame time that a cron job may have removed some of the older table partitions. (that may or may not have been
visibleto select query) 
>
>Right now I have been checking but I can’t find anything wrong with the database so it doesn’t look like I have any db
corruptionissues or the like currently.  there is some hate in the logs about it for a while and then the database was
restarted. 
>
>My best guess is that a the clean up of old partitions yanked a table out from view…. Kind of like when you run a \d
atthe same time a table is dropped.  
>
>Thoughts? Comments? Ideas ?

I started seeing these frightening messages a couple of months ago:

   pg_dump: ERROR:  could not open relation with OID 2196359751
   pg_dump: SQL command to dump the contents of table "xyz" failed: PQendcopy() failed.
   pg_dump: Error message from server: ERROR:  could not open relation with OID 2196359751
   pg_dump: The command was: COPY public.xyz ({various_field_names}) TO stdout;

and the error was causing a daily database backup routine to fail.  Because the problem was too far beyond my ability
tosolve, I hired Frank Heikens at http://nl.linkedin.com/pub/frank-heikens/0/190/517 and he got everything back to
normalin a day or two.  Mr. Heikens isolated the one corrupted data record, created a new table, and replaced the
flawedtable with the new table.  I have nothing but compliments for Mr. Heikens' knowledge, professionalism, speed,
accuracy,caution, communication, and wizardry.   











-------------------------------------------------
This message sent via VFEmail.net
http://www.vfemail.net
$14.95 Lifetime accounts - 1GB disk, No bandwidth quotas!


Re: Could not open relation with OID (table partitioning issue?)

From
"mark"
Date:
Our backups are fine and don't have a problem creating or restoring to/from
them. It's just that it appears the create/drop function we have is dropping
child partitions out from under running queries, and I thought that MVCC
would handle this...

Right now I am working up to trying to create a self contained case that
people could use to reproduce the issue for the bugs list.


-Mark




-----Original Message-----
From: pgsql-novice-owner@postgresql.org
[mailto:pgsql-novice-owner@postgresql.org] On Behalf Of peter@vfemail.net
Sent: Monday, November 15, 2010 7:47 AM
To: pgsql-novice@postgresql.org
Subject: Re: [NOVICE] Could not open relation with OID (table partitioning
issue?)


At 09:22 AM 11/15/2010, mark wrote:
>Hi all
>
>Running Postgres 8.3.7
>
>Are there any known issues with table partitioning and transactions having
a child partition getting removed out from under running queries?
>
>
>I got an error in my log about not being able to open a relation with OID
XXXXX from a SELECT statement that ran about the same time that a cron job
may have removed some of the older table partitions. (that may or may not
have been visible to select query)
>
>Right now I have been checking but I can't find anything wrong with the
database so it doesn't look like I have any db corruption issues or the like
currently.  there is some hate in the logs about it for a while and then the
database was restarted.
>
>My best guess is that a the clean up of old partitions yanked a table out
from view.. Kind of like when you run a \d at the same time a table is
dropped.
>
>Thoughts? Comments? Ideas ?

I started seeing these frightening messages a couple of months ago:

   pg_dump: ERROR:  could not open relation with OID 2196359751
   pg_dump: SQL command to dump the contents of table "xyz" failed:
PQendcopy() failed.
   pg_dump: Error message from server: ERROR:  could not open relation with
OID 2196359751
   pg_dump: The command was: COPY public.xyz ({various_field_names}) TO
stdout;

and the error was causing a daily database backup routine to fail.  Because
the problem was too far beyond my ability to solve, I hired Frank Heikens at
http://nl.linkedin.com/pub/frank-heikens/0/190/517 and he got everything
back to normal in a day or two.  Mr. Heikens isolated the one corrupted data
record, created a new table, and replaced the flawed table with the new
table.  I have nothing but compliments for Mr. Heikens' knowledge,
professionalism, speed, accuracy, caution, communication, and wizardry.











-------------------------------------------------
This message sent via VFEmail.net
http://www.vfemail.net
$14.95 Lifetime accounts - 1GB disk, No bandwidth quotas!


--
Sent via pgsql-novice mailing list (pgsql-novice@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-novice


Re: Could not open relation with OID (table partitioning issue?)

From
bakkiya
Date:
Hi.
We are also facing similar problem with postgresql partition tables. We are
querying 2 days old partition table and at the same time we are dropping 2
months old partition.Sometimes our queries are getting failed with "ERROR:
could not open relation with OID 1761740". We have set the constraint
exclusion to on. Can you please help us on resolving it?

Regards,
Bakki
--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Could-not-open-relation-with-OID-table-partitioning-issue-tp3265708p3272358.html
Sent from the PostgreSQL - novice mailing list archive at Nabble.com.

Re: Could not open relation with OID (table partitioning issue?)

From
"mark"
Date:
What version of PG are you using? (just wondering so I can know if this
another 8.3 or something newer)

Are you using cursors ? (again just wondering)


I still haven't been able to reproduce in our dev machines only seeing this
issue in our production servers. (that handle several orders of magnitude
more traffic and data then I can generate in our dev servers)


Right now I am working on getting a patch to our functions at various db
servers that drops tables to first get an LOCK TABLE <parent table> IN
ACCESS EXCLUSIVE MODE before dropping child tables.

It's kind of a heavy hammer to have to use on a high transaction
environment, but I am hoping this will eliminate the problem in our
environment.

I probably won't know for a month if this will fix the issue or not, as that
is the pace that we move at. If try it I would like to hear back.


- Mark


-----Original Message-----
From: pgsql-novice-owner@postgresql.org
[mailto:pgsql-novice-owner@postgresql.org] On Behalf Of bakkiya
Sent: Friday, November 19, 2010 7:17 AM
To: pgsql-novice@postgresql.org
Subject: Re: [NOVICE] Could not open relation with OID (table partitioning
issue?)


Hi.
We are also facing similar problem with postgresql partition tables. We are
querying 2 days old partition table and at the same time we are dropping 2
months old partition.Sometimes our queries are getting failed with "ERROR:
could not open relation with OID 1761740". We have set the constraint
exclusion to on. Can you please help us on resolving it?

Regards,
Bakki
--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Could-not-open-relation-with-OID-tab
le-partitioning-issue-tp3265708p3272358.html
Sent from the PostgreSQL - novice mailing list archive at Nabble.com.

--
Sent via pgsql-novice mailing list (pgsql-novice@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-novice


Re: Could not open relation with OID (table partitioning issue?)

From
bakkiya
Date:
We are facing this issue with 8.3.8 PG version and we don't use cursors. Is
it a bug with PG, if so can you please provide the bug details.

Thanks for your assistance.

Regards,
Bakki
--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Could-not-open-relation-with-OID-table-partitioning-issue-tp3265708p3273642.html
Sent from the PostgreSQL - novice mailing list archive at Nabble.com.

Re: Could not open relation with OID (table partitioning issue?)

From
"mark"
Date:
I don't know if this is a bug and if so, if it fixed in the most current
patch. (I don't see anything in the release notes that make think so though
however).

I have yet to be able to create a small self contained case to reproduce for
the bugs mailing list so someone can look into it.



-----Original Message-----
From: pgsql-novice-owner@postgresql.org
[mailto:pgsql-novice-owner@postgresql.org] On Behalf Of bakkiya
Sent: Saturday, November 20, 2010 7:22 AM
To: pgsql-novice@postgresql.org
Subject: Re: [NOVICE] Could not open relation with OID (table partitioning
issue?)


We are facing this issue with 8.3.8 PG version and we don't use cursors. Is
it a bug with PG, if so can you please provide the bug details.

Thanks for your assistance.

Regards,
Bakki
--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Could-not-open-relation-with-OID-tab
le-partitioning-issue-tp3265708p3273642.html
Sent from the PostgreSQL - novice mailing list archive at Nabble.com.

--
Sent via pgsql-novice mailing list (pgsql-novice@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-novice


Re: Could not open relation with OID (table partitioning issue?)

From
bakkiya raj
Date:
I think this is same as http://postgresql.1045698.n5.nabble.com/Is-this-the-expected-behaviour-for-DDL-query-execution-td2123287.html.

Regards,
Bakki

On Mon, Nov 22, 2010 at 1:03 PM, mark <dvlhntr@gmail.com> wrote:
I don't know if this is a bug and if so, if it fixed in the most current
patch. (I don't see anything in the release notes that make think so though
however).

I have yet to be able to create a small self contained case to reproduce for
the bugs mailing list so someone can look into it.



-----Original Message-----
From: pgsql-novice-owner@postgresql.org
[mailto:pgsql-novice-owner@postgresql.org] On Behalf Of bakkiya
Sent: Saturday, November 20, 2010 7:22 AM
To: pgsql-novice@postgresql.org
Subject: Re: [NOVICE] Could not open relation with OID (table partitioning
issue?)


We are facing this issue with 8.3.8 PG version and we don't use cursors. Is
it a bug with PG, if so can you please provide the bug details.

Thanks for your assistance.
le-partitioning-issue-tp3265708p3273642.html
Sent from the PostgreSQL - novice mailing list archive at Nabble.com.

--
Sent via pgsql-novice mailing list (pgsql-novice@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-novice


Re: Could not open relation with OID (table partitioning issue?)

From
bakkiya
Date:
Here is the detailed description of the problem:
select version()
"PostgreSQL 8.3.8 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC)
4.1.2 20070115 (prerelease) (SUSE Linux)"

Partition Creation Command:
 CREATE TABLE events_p_20100813100000
(
-- Inherited from table events_p_20100813100000:  evt_id uuid NOT NULL,
-- Inherited from table events_p_20100813100000:  evt_time timestamp with
time zone NOT NULL,
  CONSTRAINT events_p_20100813100000_dc CHECK (evt_time > '2010-08-12
10:43:51.901978+05:30'::timestamp with time zone AND evt_time <= '2010-08-13
10:43:51.901978+05:30'::timestamp with time zone)

)
INHERITS (events)
WITH (
  OIDS=FALSE
);

Drop command:
DROP TABLE events_p_20100813100000;

Select query:
SELECT
    *
FROM
    events
WHERE

    AND (events.evt_time >= '2010-10-11 00:00:00'
    AND events.evt_time <= '2010-10-11 23:59:00')

This table has 330 columns, though I have provided only 2 columns
here.Please let us know, if you need any more details:
--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Could-not-open-relation-with-OID-table-partitioning-issue-tp3265708p3274944.html
Sent from the PostgreSQL - novice mailing list archive at Nabble.com.

Re: Could not open relation with OID (table partitioning issue?)

From
Tom Lane
Date:
"mark" <dvlhntr@gmail.com> writes:
> I don't know if this is a bug and if so, if it fixed in the most current
> patch. (I don't see anything in the release notes that make think so though
> however).

At least one reason for this type of problem was fixed in 8.4.

commit d4a363cdf2b426bbf6c401543b8286ad86ca9bd5
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Tue May 12 03:11:02 2009 +0000

    Modify find_inheritance_children() and find_all_inheritors() to add the
    ability to lock relations as they scan pg_inherits, and to ignore any
    relations that have disappeared by the time we get lock on them.  This
    makes uses of these functions safe against concurrent DROP operations
    on child tables: we will effectively ignore any just-dropped child,
    rather than possibly throwing an error as in recent bug report from
    Thomas Johansson (and similar past complaints).  The behavior should
    not change otherwise, since the code was acquiring those same locks
    anyway, just a little bit later.

    An exception is LockTableCommand(), which is still behaving unsafely;
    but that seems to require some more discussion before we change it.

            regards, tom lane

Re: Could not open relation with OID (table partitioning issue?)

From
"mark"
Date:


-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Monday, November 22, 2010 8:10 AM
To: mark
Cc: 'bakkiya'; pgsql-novice@postgresql.org
Subject: Re: [NOVICE] Could not open relation with OID (table partitioning
issue?)

>"mark" <dvlhntr@gmail.com> writes:
>> I don't know if this is a bug and if so, if it fixed in the most current
>> patch. (I don't see anything in the release notes that make think so
though
>> however).

>At least one reason for this type of problem was fixed in 8.4.

>commit d4a363cdf2b426bbf6c401543b8286ad86ca9bd5
>Author: Tom Lane <tgl@sss.pgh.pa.us>
>Date:   Tue May 12 03:11:02 2009 +0000

 >   Modify find_inheritance_children() and find_all_inheritors() to add the
 >   ability to lock relations as they scan pg_inherits, and to ignore any
 >   relations that have disappeared by the time we get lock on them.  This
 >   makes uses of these functions safe against concurrent DROP operations
 >   on child tables: we will effectively ignore any just-dropped child,
 >   rather than possibly throwing an error as in recent bug report from
 >   Thomas Johansson (and similar past complaints).  The behavior should
 >   not change otherwise, since the code was acquiring those same locks
 >   anyway, just a little bit later.

 >   An exception is LockTableCommand(), which is still behaving unsafely;
 >   but that seems to require some more discussion before we change it.

>            regards, tom lane




So is my adding a access exclusive lock to the parent table when dropping
children a bad thing (tm), in terms of introducing an error, since it uses
the locktablecommand()?

Still riding 8.3.7 here but wanted to keep this in mind for 9.0.1 and higher
when I migrate.

Thanks

-Mark



Re: Could not open relation with OID (table partitioning issue?)

From
Tom Lane
Date:
"mark" <dvlhntr@gmail.com> writes:
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
>>> An exception is LockTableCommand(), which is still behaving unsafely;
>>> but that seems to require some more discussion before we change it.

> So is my adding a access exclusive lock to the parent table when dropping
> children a bad thing (tm), in terms of introducing an error, since it uses
> the locktablecommand()?

No; in 8.3 it's all right.  That comment was about the then
development-tip behavior of LOCK TABLE in 8.4, which would try to
recurse to child tables.  8.3 doesn't do that, it just locks exactly the
table you specify, so it won't fail unless you're dropping and
recreating the parent too...

            regards, tom lane

Re: Could not open relation with OID (table partitioning issue?)

From
bakkiya
Date:
This is different from the actual error, just wanted to post about this also.
When queries are running on partition table, EXPLAIN statements are going to
waiting state.
For ex:
I am running
SELECT
    events.dp AS target_port,
    count(events.dp) as hits
FROM
    events
WHERE
    events.rid02 = 23243
    AND events.sip = 455545
    AND events.dp IS NOT NULL
    AND (events.evt_time >= '2010-09-01 16:00:00')
    AND events.evt_time <= '2010-09-01 16:05:00')
GROUP BY
    events.dp
ORDER BY
    2 DESC
LIMIT 10

And in a different session I am running
EXPLAIN <same sql statement mentioned above>
THis EXPLAIN statement is in waiting state. Is this expected behaviour?

--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Could-not-open-relation-with-OID-table-partitioning-issue-tp3265708p3277881.html
Sent from the PostgreSQL - novice mailing list archive at Nabble.com.