On Mon, Jun 25, 2012 at 10:03 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Mon, Jun 25, 2012 at 9:57 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Merlin Moncure <mmoncure@gmail.com> writes:
>>> 2012-06-25 09:08:08 CDT [postgres@ysanalysis_hes]: LOG:  could not
>>> send data to client: Broken pipe
>>> 2012-06-25 09:08:10 CDT [postgres@ysanalysis_hes]: LOG:  unexpected
>>> EOF on client connection
>>> 2012-06-25 09:08:10 CDT [postgres@ysanalysis_hes]: LOG:  process 22821
>>> releasing ProcSignal slot 32, but it contains 0
>>> 2012-06-25 09:08:10 CDT [postgres@ysanalysis_hes]: LOG:  failed to
>>> find proc 0x7f48617e2ab0 in ProcArray
>>> [and a bit later]
>>> 2012-06-25 09:08:24 CDT [postgres@ysanalysis_hes]: FATAL:  latch
>>> already owned
>>
>> I think what we're looking at here is a screw-up in the process shutdown
>> sequence.  Perhaps caused by bad recovery from an attempt to send an
>> error message to the already-disconnected client; but that's just
>> speculation, and it's hard to see how to get more info without a core
>> dump.
>>
>> I wonder whether we shouldn't promote some or all of these three error
>> cases to PANIC, as they certainly suggest shared-memory corruption.
>> And if it did panic, we could hope to get a core dump for debugging
>> purposes.
>
> Ok, I'll look into reproducing the crash conditions.  Unfortunately
> this is a critical server and it crashed during a time sensitive
> process. I can schedule a maintenance window though but it will have
> to wait a bit.
>
> merlin
I have some good news: this was reproduced, and I believe it to be
operator-invoked:
2012-06-26 09:12:19 CDT [postgres@ysanalysis_hes]: ERROR: index
"idx_lease_expiremonth2" does not exist
2012-06-26 09:12:19 CDT [postgres@ysanalysis_hes]: STATEMENT: DROP
INDEX idx_Lease_ExpireMonth2;
2012-06-26 09:15:10 CDT [rms@ysanalysis]: LOG: unexpected EOF on
client connection
2012-06-26 09:15:10 CDT [rms@ysanalysis]: LOG: process 10340
releasing ProcSignal slot 5, but it contains 0
2012-06-26 09:15:10 CDT [rms@ysanalysis]: LOG: failed to find proc
0x7f48617e6310 in ProcArray
2012-06-26 09:16:48 CDT [rms@ysanalysis]: FATAL: latch already owned
2012-06-26 09:16:48 CDT [@]: LOG: server process (PID 10928) exited
with exit code 1
2012-06-26 09:16:48 CDT [@]: LOG: terminating any other active server
processes
2012-06-26 09:16:48 CDT [postgres@postgres]: WARNING: terminating
connection because of crash of another server process
...investigating...
merlin