Re: syslog enabled causes random hangs? - Mailing list pgsql-admin

From Arthur Ward
Subject Re: syslog enabled causes random hangs?
Msg-id 65117.68.62.130.132.1059521610.squirrel@www.dominionsciences.com
In response to Re: syslog enabled causes random hangs?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: syslog enabled causes random hangs?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-admin
> "Arthur Ward" <award@dominionsciences.com> writes:
>> I'm encountering strange hangs in postgresql backends at random
>> moments. They seem to be associated with attempts to issue log entries
>> via syslog. I have run backtraces on the hung backends a few times,
>> and they routinely trace into system libraries where it looks like a
>> stuck syslog call. So far, I haven't had this problem with any other
>> apps, so I'm thinking it's a condition being aggravated by Postgres.
>
> How verbose are your Postgres logging settings?

Defaults. (It's a single-user test machine without much activity.) In what
I was doing today, it seemed to happen most often when a checkpoint
occurred, which would normally log a message about recycling the
transaction log files.

> On most platforms I've dealt with, syslog drops messages on the floor if
> it starts to get saturated.  It may be that the Linux implementation has
> worse behavior than that under heavy load :-(.  In any case I'd suggest
> filing a bug against syslog.  There's very little we can do about it if
> the syslog library call hangs up.

After rebuilding to get a non-stripped copy of PostgreSQL, I did a little
searching on my backtraces. Some of the first hits point at a deadlock
condition in glibc, although I'm not sure how or why this is affecting
PostgreSQL, which runs independently of my app and its signal handler.
Maybe some more digging will enlighten me...

> Personally I find it more reliable to pipe the postmaster's stderr to
> some sort of log-rotation program than to depend on syslog.  It seems
> that the Apache folks have found the same, since they include a suitable
> log-rotation filter in their distribution ...

Since deadlocks in the system libraries sound a bit scary to me, I'm
definitely convinced to change my development machine now. As I
originally wrote, I'd already changed the production machine.
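
For anyone who wants to see what such a stderr filter amounts to, here is a
minimal sketch in Python. It is only an illustration of the idea, not the
filter Apache ships (theirs is rotatelogs) and not anything from the
PostgreSQL distribution. The names RotatingSink and pump, the size-based
rotation policy, and keeping a single old generation ("<file>.1") are all
assumptions made for the example.

```python
import os
import sys


class RotatingSink:
    """Append lines to base_path; once it would exceed max_bytes,
    rename it to base_path + ".1" (overwriting any older copy) and start over."""

    def __init__(self, base_path, max_bytes):
        self.base_path = base_path
        self.max_bytes = max_bytes
        self._open()

    def _open(self):
        # Binary append mode: log lines are passed through untouched.
        self.fh = open(self.base_path, "ab")
        self.size = os.fstat(self.fh.fileno()).st_size

    def write(self, line):
        # Rotate before writing if this line would push the file past the
        # limit. An empty file is never rotated, so an oversized line still lands.
        if self.size > 0 and self.size + len(line) > self.max_bytes:
            self.rotate()
        self.fh.write(line)
        self.fh.flush()  # keep the file current if the filter is killed
        self.size += len(line)

    def rotate(self):
        self.fh.close()
        os.replace(self.base_path, self.base_path + ".1")
        self._open()

    def close(self):
        self.fh.close()


def pump(stream, sink):
    """Copy every line from a binary stream into the sink until EOF."""
    for line in stream:
        sink.write(line)


# Hypothetical command-line use: <script> LOGFILE MAX_BYTES, reading stdin.
if __name__ == "__main__" and len(sys.argv) == 3:
    sink = RotatingSink(sys.argv[1], int(sys.argv[2]))
    try:
        pump(sys.stdin.buffer, sink)
    finally:
        sink.close()
```

Saved under a hypothetical name such as rotate_filter.py, it would be used
along the lines of

    postmaster -D /path/to/data 2>&1 | python3 rotate_filter.py /path/to/postgres.log 10000000

with both paths being placeholders. The point of the design is that the
backend only ever does a plain write to a pipe, so no syslog call and no
library mutex sits in the logging path.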


FWIW, since I already went to the trouble (and for the sake of people
searching the archives in the future), here's what I was seeing in the
backtraces after rebuilding this afternoon:

This process:
19763 pts/2    S      0:00 postgres: checkpoint subprocess
Was stuck here:
(gdb) bt
#0  0x402cf077 in semop () from /lib/libc.so.6
(gdb)

And the other:
19720 pts/2    S      0:04 postgres: award Trucking [local] UPDATE
(gdb) bt
#0  0x4021cec6 in sigsuspend () from /lib/libc.so.6
#1  0x424b6218 in __pthread_wait_for_restart_signal ()
   from /lib/libpthread.so.0
#2  0x424b79a0 in __pthread_alt_lock () from /lib/libpthread.so.0
#3  0x424b4c17 in pthread_mutex_lock () from /lib/libpthread.so.0
#4  0x402ca21c in vsyslog () from /lib/libc.so.6
#5  0x402c9d8f in syslog () from /lib/libc.so.6
#6  0x08150a57 in write_syslog ()
(gdb)



