Re: error: could not find pg_class tuple for index 2662 - Mailing list pgsql-hackers

From daveg
Subject Re: error: could not find pg_class tuple for index 2662
Msg-id 20110729063130.GD15578@sonic.net
In response to Re: error: could not find pg_class tuple for index 2662  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Thu, Jul 28, 2011 at 07:45:01PM -0400, Robert Haas wrote:
> On Thu, Jul 28, 2011 at 5:46 PM, daveg <daveg@sonic.net> wrote:
> > On Thu, Jul 28, 2011 at 09:46:41AM -0400, Robert Haas wrote:
> >> On Wed, Jul 27, 2011 at 8:28 PM, daveg <daveg@sonic.net> wrote:
> >> > My client has been seeing regular instances of the following sort of problem:
> >> On what version of PostgreSQL?
> >
> > 9.0.4.
> >
> > I previously said:
> >> > This occurs on postgresql 9.0.4. on 32 core 512GB Dell boxes. We have
> >> > identical systems still running 8.4.8 that do not have this issue, so I'm
> >> > assuming it is related to the vacuum full work done for 9.0. Oddly, we don't
> >> > see this on the smaller hosts (8 core, 64GB, slower cpus) running 9.0.4,
> >> > so it may be timing related.
> 
> Ah, OK, sorry.  Well, in 9.0, VACUUM FULL is basically CLUSTER, which
> means that a REINDEX is happening as part of the same operation.  In
> 9.0, there's no point in doing VACUUM FULL immediately followed by
> REINDEX.  My guess is that this is happening either right around the
> time the VACUUM FULL commits or right around the time the REINDEX
> commits.  It'd be helpful to know which, if you can figure it out.

I'll update my vacuum script to skip the reindex after vacuum full on 9.0
servers and see whether that makes the problem go away. Thanks for the
reminder that it is not needed there. However, I suspect it is the vacuum,
not the reindex, that is causing the problem. I'll report back when I know.
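
A rough sketch of that script change, for illustration only. The actual
vacuum script is not shown in this thread, so the function names and the
way statements are built here are assumptions. The sketch relies on
PostgreSQL's server_version_num setting (90004 for 9.0.4, 80408 for 8.4.8)
and assumes a simple, unqualified table name.

```python
def quote_ident(name):
    """Double-quote an identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def maintenance_statements(server_version_num, table):
    """Return the SQL statements to run for one table (hypothetical helper).

    server_version_num is the integer reported by SHOW server_version_num.
    On 9.0 and later, VACUUM FULL rebuilds the table's indexes as part of
    the same operation, so the follow-up REINDEX is skipped.
    """
    target = quote_ident(table)
    statements = ["VACUUM FULL " + target]
    if server_version_num < 90000:
        # Pre-9.0 VACUUM FULL does not rebuild indexes; reindex afterwards.
        statements.append("REINDEX TABLE " + target)
    return statements


if __name__ == "__main__":
    for version in (80408, 90004):
        print(version, maintenance_statements(version, "orders"))
```

The statements are returned rather than executed so the version check can
be verified without touching a live server.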

> If there's not a hardware problem causing those read errors, maybe a
> backend is somehow ending up with a stale or invalid relcache entry.
> I'm not sure exactly how that could be happening, though...

It does not appear to be a hardware problem. I also suspect a stale
relcache entry.

-dg
-- 
David Gould       daveg@sonic.net      510 536 1443    510 282 0869
If simplicity worked, the world would be overrun with insects.

