Thread: vacuum slowed by syslogd

vacuum slowed by syslogd

From

Michael Adler

Date:

13 January 2004, 23:11:01

On many occasions, I've noticed that some PostgreSQL activity takes far longer than it previously did and that
disablingsyslogd addresses the symptoms.

Most recently, it took 20-60 seconds to VACUUM a small, heavily updated table, while it used to take a fraction of a
second.I found syslog entries like these:

13:19:53 --Relation sometable--
13:20:03 Removed 2 tuples in 1 pages.
13:20:23 ^ICPU 0.00s/0.00u sec elapsed 0.00 sec.
13:20:54 Pages 1: Changed 1, Empty 0; Tup 4: Vac 2, Keep 0, UnUsed 13.
13:20:54 ^ITotal CPU 0.00s/0.00u sec elapsed 60.12 sec.

It took almost-exactly 60 seconds, but virtually no CPU time was used (and no disk IO). Many similar examples have
"real"times that are near perfect multiples of 10 seconds (e.g. 50.09, 40.07). This is not every single VACUUM, but it
isfrequent.

The problem disappears when syslogd is stopped or when PostgreSQL disables syslog usage. This is very consistent and I
canreproduce the problem in some installations by toggling these factors on and off.

The other way to toggle the problem on/off is to disable syslogd's udp socket (though this is a feature we'd still like
touse).

This evidence normally indicates a name resolution issue, but I'm not sure how to test for that beyond using "hostname
-v-i".

This has happened at several installations. We use both 7.2.1 (Debian stable) and 7.3.2 (Debian testing).

I would guess that this is not the result of a PostgreSQL deficiency, but it is most sharply affected by it. Perhaps
someonecan offer insight?

Thanks,
Mike Adler

Re: vacuum slowed by syslogd

From

Tom Lane

Date:

14 January 2004, 00:36:55

Michael Adler <adler@pobox.com> writes:
> On many occasions, I've noticed that some PostgreSQL activity takes
> far longer than it previously did and that disabling syslogd addresses
> the symptoms.
> ...
> This evidence normally indicates a name resolution issue, but I'm not
> sure how to test for that beyond using "hostname -v -i".

I looked in Red Hat's bugzilla for similar issues, and found a mention
that syslogd tries to reverse-lookup the address it gets each message
from.  A delay there could act as you describe.  At least in recent RH
releases, you can start syslogd with the "-x" switch to disable this
lookup --- does that exist in Debian, and if so does it help?

I would expect any comparable resolution delays on the client side to
occur only once when a backend is started, not once per message, since
we only openlog() once per backend.  So a delay on the syslogd side
seems a likely theory.

Something else you could do to investigate is to test whether syslogd
can log messages from other clients during these 10-second delays.
If it's hung trying to resolve a name, presumably messages from other
clients would get delayed too.

            regards, tom lane

Re: vacuum slowed by syslogd

From

Michael Adler

Date:

14 January 2004, 17:46:54

On Tue, Jan 13, 2004 at 11:36:19PM -0500, Tom Lane wrote:
> Michael Adler <adler@pobox.com> writes:
> > On many occasions, I've noticed that some PostgreSQL activity takes
> > far longer than it previously did and that disabling syslogd addresses
> > the symptoms.
> > ...
> > This evidence normally indicates a name resolution issue, but I'm not
> > sure how to test for that beyond using "hostname -v -i".
>
> I looked in Red Hat's bugzilla for similar issues, and found a mention
> that syslogd tries to reverse-lookup the address it gets each message
> from.  A delay there could act as you describe.  At least in recent RH
> releases, you can start syslogd with the "-x" switch to disable this
> lookup --- does that exist in Debian, and if so does it help?

There is no such option within Debian or the syslogd from freshmeat. Sounds nice, though.

I was able to track down a name resolution issue. The "nameserver" in /etc/resolv.conf pointed to a nonexistent host.
Thisadded ten second delays to telnetd logins and netstat -a. This seemed to be the heart of the problem, but I'm
gettingsome mixed info from the "field" and I'm still following up.  

Thanks,
Mike

Re: vacuum slowed by syslogd

From

Gaetano Mendola

Date:

14 January 2004, 22:48:38

Michael Adler wrote:

> On many occasions, I've noticed that some PostgreSQL activity takes far longer than it previously did and that
disablingsyslogd addresses the symptoms.  
>
> Most recently, it took 20-60 seconds to VACUUM a small, heavily updated table, while it used to take a fraction of a
second.I found syslog entries like these: 
>
> 13:19:53 --Relation sometable--
> 13:20:03 Removed 2 tuples in 1 pages.
> 13:20:23 ^ICPU 0.00s/0.00u sec elapsed 0.00 sec.
> 13:20:54 Pages 1: Changed 1, Empty 0; Tup 4: Vac 2, Keep 0, UnUsed 13.
> 13:20:54 ^ITotal CPU 0.00s/0.00u sec elapsed 60.12 sec.
>
> It took almost-exactly 60 seconds, but virtually no CPU time was used (and no disk IO). Many similar examples have
"real"times that are near perfect multiples of 10 seconds (e.g. 50.09, 40.07). This is not every single VACUUM, but it
isfrequent.  
>
> The problem disappears when syslogd is stopped or when PostgreSQL disables syslog usage. This is very consistent and
Ican reproduce the problem in some installations by toggling these factors on and off.  
>

Did you try to put a "-" before the file log name ?
Example:

LOCAL0.*                   -/var/log/postgresql.log


Regards
Gaetano Mendola