Re: Backends stalled in 'startup' state: index corruption - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Backends stalled in 'startup' state: index corruption
Date
Msg-id 2561.1337898305@sss.pgh.pa.us
In response to Re: Backends stalled in 'startup' state: index corruption  (Greg Sabino Mullane <greg@endpoint.com>)
Responses Re: Backends stalled in 'startup' state: index corruption
List pgsql-hackers
Greg Sabino Mullane <greg@endpoint.com> writes:
> Oh, almost forgot: reading your reply to the old thread reminded me of 
> something I saw in one of the straces right as it "woke up" and left 
> the startup state to do some work. Here's a summary:

> 12:18:39 semop(4390981, 0x7fff66c4ec10, 1) = 0
> 12:18:39 semop(4390981, 0x7fff66c4ec10, 1) = 0
> 12:18:39 semop(4390981, 0x7fff66c4ec10, 1) = 0
> (x a gazillion)
> ...
> 12:18:40 brk(0x1c0af000)                = 0x1c0af000
> ...(some more semops)...
> 12:18:40 mmap(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ac062c98000
> ...(handful of semops)...
> 12:18:40 unlink("base/1554846571/pg_internal.init.11803") = -1 ENOENT (No such file or directory)
> 12:18:40 open("base/1554846571/pg_internal.init.11803", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 13
> 12:18:40 fstat(13, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
> 12:18:40 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ac062cd9000
> 12:18:40 write(13, ...
> ...(normalish looking strace output after this)...

Yeah, this is proof that what it was doing is the same as what we saw in
Jeff's backtrace, i.e., loading up the system catalog relcache entries the
hard way via seqscans on the core catalogs.  So the question to be
answered is why that's suddenly a big performance bottleneck.  It's not
a cheap operation of course (that's why we cache the results ;-)) but
it shouldn't take minutes either.  And, because they are seqscans, it
doesn't seem like messed-up indexes should matter.
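
To put numbers on "how long" from a trace like the one quoted above, a
rough sketch along these lines could tally syscalls per second and find
the point where the backend starts rewriting pg_internal.init.  It
assumes timestamped lines in the HH:MM:SS form shown in the excerpt;
the function names (summarize_strace, first_init_open) are made up for
illustration and are not part of any PostgreSQL or strace tooling.

```python
import re
from collections import Counter

# Matches lines such as "12:18:39 semop(4390981, 0x7fff66c4ec10, 1) = 0",
# optionally prefixed by "> " as in a quoted email.
LINE_RE = re.compile(r"^(?:>\s*)?(\d{2}:\d{2}:\d{2})\s+([a-z_0-9]+)\(")


def summarize_strace(lines):
    """Count syscalls per (timestamp, syscall name) pair."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            counts[(m.group(1), m.group(2))] += 1
    return counts


def first_init_open(lines):
    """Return the timestamp of the first open() of a pg_internal.init
    file, or None if the trace contains no such call."""
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group(2) == "open" and "pg_internal.init" in line:
            return m.group(1)
    return None


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python summarize.py backend.strace
    with open(sys.argv[1]) as f:
        trace = f.readlines()
    for (ts, name), n in sorted(summarize_strace(trace).items()):
        print(ts, name, n)
    print("first pg_internal.init open:", first_init_open(trace))
```

Comparing the timestamp of the first semop burst against the
first open() of pg_internal.init gives a lower bound on the time spent
rebuilding the relcache entries, at whatever resolution the trace has.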

The theory I have in mind about Jeff's case is that it was basically an
I/O storm, but it's not clear whether the same explanation works for
your case.  There may be some other contributing factor that we haven't
identified yet.
        regards, tom lane

