Re: syslog performance when logging big statements - Mailing list pgsql-performance

From Tatsuo Ishii
Subject Re: syslog performance when logging big statements
Msg-id 20080709.103134.88014194.t-ishii@sraoss.co.jp
In response to Re: syslog performance when logging big statements  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: syslog performance when logging big statements
List pgsql-performance
> Jeff <threshar@threshar.is-a-geek.com> writes:
> > On Jul 8, 2008, at 8:24 AM, Achilleas Mantzios wrote:
> >> File sizes of about 3M result in actual logging output of ~ 10Mb.
> >> In this case, the INSERT *needs* 20 minutes to return. This is
> >> because the logging through syslog seems to severely slow the system.
> >> If instead, i use stderr, even with logging_collector=on, the same
> >> statement needs 15 seconds to return.
>
> > In syslog.conf is the destination for PG marked with a "-"? (ie
> > -/var/log/pg.log) which tells syslog to not sync after each line logged.
> > That could explain a large chunk of the difference in time.
>
> I experimented with this a bit here.  There definitely is an O(N^2)
> penalty from the repeated strchr() calls, but it doesn't really start
> to hurt till 1MB or so statement length.  Even with that patched,
> syslog logging pretty much sucks performance-wise.  Here are the numbers
> I got on a Fedora 8 workstation, testing the time to log a statement of
> the form SELECT length('123456...lots of data, no newlines...7890');
>
> statement length                       1MB        10MB
>
> CVS HEAD                            2523ms    215588ms
> + patch to fix repeated strchr       529ms     36734ms
> after turning off syslogd's fsync    569ms      5881ms
> PG_SYSLOG_LIMIT 1024, fsync on       216ms      2532ms
> PG_SYSLOG_LIMIT 1024, no fsync       242ms      2692ms
>
> For comparison purposes:
> logging statements to stderr         155ms      2042ms
> execute statement without logging     42ms       520ms
>
> This machine is running a cheap IDE drive that caches writes, so
> the lack of difference between fsync off and fsync on is not too
> surprising --- on a machine with server-grade drives there'd be
> a lot more difference.  (The fact that there is a difference in
> the 10MB case probably reflects filling the drive's write cache.)
>
> On my old HPUX machine, where fsync really works (and the syslogd
> doesn't seem to allow turning it off), the 1MB case takes
> 195957ms with the strchr patch, 22922ms at PG_SYSLOG_LIMIT=1024.
>
> So there's a fairly clear case to be made for fixing the repeated
> strchr, but I also think that there's a case for jacking up
> PG_SYSLOG_LIMIT.  As far as I can tell the current value of 128
> was chosen *very* conservatively without thought for performance:
> http://archives.postgresql.org/pgsql-hackers/2000-05/msg01242.php
>
> At the time we were looking at evidence that the then-current
> Linux syslogd got tummyache with messages over about 1KB:
> http://archives.postgresql.org/pgsql-hackers/2000-05/msg00880.php
>
> Some experimentation with the machines I have handy now says that
>
> Fedora 8:               truncates messages at 2KB (including syslog's header)
> HPUX 10.20 (ancient):   ditto
> Mac OS X 10.5.3:        drops messages if longer than about 1900 bytes
>
> So it appears to me that setting PG_SYSLOG_LIMIT = 1024 would be
> perfectly safe on current systems (and probably old ones too),
> and would give at least a factor of two speedup for logging long
> strings --- more like a factor of 8 if syslogd is fsync'ing.
>
> Comments?  Anyone know of systems where this is too high?
> Perhaps we should make that change only in HEAD, not in the
> back branches, or crank it up only to 512 in the back branches?
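
The O(N^2) penalty described above can be illustrated with a small sketch. This is not the PostgreSQL source: `scan_cost_naive` and `scan_cost_cached` are hypothetical helpers that only count how many bytes strchr() has to walk over while a message is cut into chunks of at most `limit` bytes. It assumes the slow path searches for the next newline again at the start of every chunk, and that the fix remembers the last search result until the chunking moves past it.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model: before each chunk, search for '\n' from the current
 * position. With no newline in the message, every search runs to the end
 * of the string, so the total work grows quadratically.
 */
static unsigned long long scan_cost_naive(const char *msg, size_t limit)
{
    unsigned long long cost = 0;
    const char *p = msg;
    size_t len = strlen(msg);

    if (limit == 0)
        return 0;               /* a zero limit would never make progress */

    while (len > 0) {
        const char *nl = strchr(p, '\n');
        size_t to_nl = nl ? (size_t)(nl - p) : len;
        size_t chunk;

        cost += to_nl;          /* bytes strchr walked over */
        if (to_nl == 0) {       /* sitting on a newline: skip it */
            p++;
            len--;
            continue;
        }
        chunk = to_nl < limit ? to_nl : limit;
        p += chunk;
        len -= chunk;
    }
    return cost;
}

/*
 * Same chunking, but the newline position is remembered and only searched
 * for again after the previous newline has been consumed.
 */
static unsigned long long scan_cost_cached(const char *msg, size_t limit)
{
    unsigned long long cost = 0;
    const char *p = msg;
    const char *nl = NULL;
    size_t len = strlen(msg);
    int have_nl = 0;            /* is nl still valid for the current p? */

    if (limit == 0)
        return 0;

    while (len > 0) {
        size_t to_nl, chunk;

        if (!have_nl) {
            nl = strchr(p, '\n');
            cost += nl ? (size_t)(nl - p) : len;
            have_nl = 1;
        }
        to_nl = nl ? (size_t)(nl - p) : len;
        if (to_nl == 0) {       /* consumed up to the newline: search again */
            p++;
            len--;
            have_nl = 0;
            continue;
        }
        chunk = to_nl < limit ? to_nl : limit;
        p += chunk;
        len -= chunk;
    }
    return cost;
}
```

In this model, a message of N bytes with no newlines costs roughly N^2 / (2 * limit) scanned bytes in the naive version and N bytes in the cached one, which is consistent with the penalty only becoming noticeable at large statement lengths.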

I'm a little bit worried about cranking up PG_SYSLOG_LIMIT in the back
branches. Cranking it up will definitely change the text style of syslog
messages and might confuse syslog handling scripts (I have no evidence
that such scripts exist, though). So I suggest changing PG_SYSLOG_LIMIT
only in CVS HEAD.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
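
To make the trade-off discussed in this message concrete, here is a minimal sketch of splitting one long log message into pieces of at most `limit` bytes. It is an illustration only, not the PostgreSQL implementation: `split_for_syslog` and the `chunk_sink` callback are hypothetical names, and the sketch ignores the newline and word-boundary handling a real logger would need.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback that receives each piece, e.g. to pass it to syslog(). */
typedef void (*chunk_sink)(int chunk_nr, const char *chunk, size_t n, void *arg);

/*
 * Cut msg into pieces of at most `limit` bytes, hand each piece to `sink`
 * (if one is given), and return the number of pieces produced.
 */
static int split_for_syslog(const char *msg, size_t limit,
                            chunk_sink sink, void *arg)
{
    size_t len = strlen(msg);
    int chunk_nr = 0;

    if (limit == 0)
        return 0;               /* a zero limit would never make progress */

    while (len > 0) {
        size_t n = len < limit ? len : limit;

        chunk_nr++;
        if (sink != NULL)
            sink(chunk_nr, msg, n, arg);
        msg += n;
        len -= n;
    }
    return chunk_nr;
}
```

Under these assumptions, a statement of 10485760 bytes becomes 81920 pieces at a limit of 128 and 10240 pieces at a limit of 1024. That is 8 times fewer syslog messages, which is in line with the factor of 8 estimated for an fsync'ing syslogd. It also shows the concern about existing scripts: the same statement ends up spread over a different number of log lines, each with different contents.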
