Re: syslog performance when logging big statements - Mailing list pgsql-performance
From | Tatsuo Ishii |
---|---|
Subject | Re: syslog performance when logging big statements |
Date | |
Msg-id | 20080709.103134.88014194.t-ishii@sraoss.co.jp Whole thread Raw |
In response to | Re: syslog performance when logging big statements (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: syslog performance when logging big statements
|
List | pgsql-performance |
> Jeff <threshar@threshar.is-a-geek.com> writes: > > On Jul 8, 2008, at 8:24 AM, Achilleas Mantzios wrote: > >> File sizes of about 3M result in actual logging output of ~ 10Mb. > >> In this case, the INSERT *needs* 20 minutes to return. This is > >> because the logging through syslog seems to severely slow the system. > >> If instead, i use stderr, even with logging_collector=on, the same > >> statement needs 15 seconds to return. > > > In syslog.conf is the destination for PG marked with a "-"? (ie -/var/ > > log/pg.log) which tells syslog to not sync after each line logged. > > That could explain a large chunk of the difference in time. > > I experimented with this a bit here. There definitely is an O(N^2) > penalty from the repeated strchr() calls, but it doesn't really start > to hurt till 1MB or so statement length. Even with that patched, > syslog logging pretty much sucks performance-wise. Here are the numbers > I got on a Fedora 8 workstation, testing the time to log a statement of > the form SELECT length('123456...lots of data, no newlines...7890'); > > statement length 1MB 10MB > > CVS HEAD 2523ms 215588ms > + patch to fix repeated strchr 529ms 36734ms > after turning off syslogd's fsync 569ms 5881ms > PG_SYSLOG_LIMIT 1024, fsync on 216ms 2532ms > PG_SYSLOG_LIMIT 1024, no fsync 242ms 2692ms > For comparison purposes: > logging statements to stderr 155ms 2042ms > execute statement without logging 42ms 520ms > > This machine is running a cheap IDE drive that caches writes, so > the lack of difference between fsync off and fsync on is not too > surprising --- on a machine with server-grade drives there'd be > a lot more difference. (The fact that there is a difference in > the 10MB case probably reflects filling the drive's write cache.) > > On my old HPUX machine, where fsync really works (and the syslogd > doesn't seem to allow turning it off), the 1MB case takes > 195957ms with the strchr patch, 22922ms at PG_SYSLOG_LIMIT=1024. > > So there's a fairly clear case to be made for fixing the repeated > strchr, but I also think that there's a case for jacking up > PG_SYSLOG_LIMIT. As far as I can tell the current value of 128 > was chosen *very* conservatively without thought for performance: > http://archives.postgresql.org/pgsql-hackers/2000-05/msg01242.php > > At the time we were looking at evidence that the then-current > Linux syslogd got tummyache with messages over about 1KB: > http://archives.postgresql.org/pgsql-hackers/2000-05/msg00880.php > > Some experimentation with the machines I have handy now says that > > Fedora 8: truncates messages at 2KB (including syslog's header) > HPUX 10.20 (ancient): ditto > Mac OS X 10.5.3: drops messages if longer than about 1900 bytes > > So it appears to me that setting PG_SYSLOG_LIMIT = 1024 would be > perfectly safe on current systems (and probably old ones too), > and would give at least a factor of two speedup for logging long > strings --- more like a factor of 8 if syslogd is fsync'ing. > > Comments? Anyone know of systems where this is too high? > Perhaps we should make that change only in HEAD, not in the > back branches, or crank it up only to 512 in the back branches? I'm a little bit worried about cranking up PG_SYSLOG_LIMIT in the back branches. Cranking it up will definitely change syslog messages text style and might confuse syslog handling scripts(I have no evince that such scripts exist though). So I suggest to change PG_SYSLOG_LIMIT only in CVS HEAD. -- Tatsuo Ishii SRA OSS, Inc. Japan
pgsql-performance by date: