Thread: syslog performance when logging big statements
Hi, I have experienced really bad performance on both FreeBSD and Linux with syslog, when logging statements involving bytea of size ~10 MB. Consider this scenario:

postgres@dynacom=# \d marinerpapers_atts
                              Table "public.marinerpapers_atts"
   Column    |           Type           |                 Modifiers
-------------+--------------------------+---------------------------------------------------------------------------------
 id          | integer                  | not null default nextval(('public.marinerpapers_atts_id_seq'::text)::regclass)
 marinerid   | integer                  | not null
 filename    | text                     | not null
 mimetype    | character varying(50)    | not null
 datecreated | timestamp with time zone | not null default now()
 docsrc      | bytea                    | not null
Indexes:
    "marinerpapers_atts_pkey" PRIMARY KEY, btree (id)
    "marinerpapers_atts_ukey" UNIQUE, btree (marinerid, filename)
    "marinerpapers_atts_marinerid" btree (marinerid)
Foreign-key constraints:
    "$1" FOREIGN KEY (marinerid) REFERENCES mariner(id) ON DELETE CASCADE

The insert is done like:

INSERT INTO marinerpapers_atts(marinerid,filename,mimetype,docsrc) VALUES(1,'foo.pdf','aplication/pdf','%PDF-1.3\\0124 0o....%%EOF\\012');

When someone tries to insert a row into the above table and it results in an error (because e.g. it violates the "marinerpapers_atts_ukey" constraint), the whole statement is logged to the logging system.

File sizes of about 3 MB result in actual logging output of ~10 MB. In this case, the INSERT *needs* 20 minutes to return. This is because logging through syslog seems to severely slow the system. If instead I use stderr, even with logging_collector=on, the same statement needs 15 seconds to return.

I have been using syslog since, like, the stone age, and I would like to stick with it; however, this morning I was caught by this bad performance and I am planning to move to stderr + logging_collector.

P.S. Is there a way to better tune pgsql/syslog to work more efficiently in cases like this? I know it is a corner case, but I thought I should post my experiences.
Thanx -- Achilleas Mantzios
Achilleas Mantzios <achill@matrix.gatewaynet.com> writes:
> In this case, the INSERT *needs* 20 minutes to return. This is because
> logging through syslog seems to severely slow the system.
> If instead, i use stderr, even with logging_collector=on, the same
> statement needs 15 seconds to return.

Hmm.  There's a function in elog.c that breaks log messages into chunks
for syslog.  I don't think anyone's ever looked hard at its performance
--- maybe there's an O(N^2) behavior?

			regards, tom lane
On Tuesday 08 July 2008 17:35:16, Tom Lane wrote:

> Achilleas Mantzios <achill@matrix.gatewaynet.com> writes:
>> In this case, the INSERT *needs* 20 minutes to return. This is because
>> logging through syslog seems to severely slow the system.
>> If instead, i use stderr, even with logging_collector=on, the same
>> statement needs 15 seconds to return.
>
> Hmm.  There's a function in elog.c that breaks log messages into chunks
> for syslog.  I don't think anyone's ever looked hard at its performance
> --- maybe there's an O(N^2) behavior?
>
> 			regards, tom lane

Thanx, I changed PG_SYSLOG_LIMIT in elog.c:1269 from 128 to 1048576

#ifndef PG_SYSLOG_LIMIT
/* #define PG_SYSLOG_LIMIT 128 */
#define PG_SYSLOG_LIMIT 1048576
#endif

and I got super fast stderr performance. :)

However, I noticed a certain amount of data in the log is lost. Didn't dig much into the details, though.
--
Achilleas Mantzios
Achilleas Mantzios <achill@matrix.gatewaynet.com> writes:
> On Tuesday 08 July 2008 17:35:16, Tom Lane wrote:
>> Hmm.  There's a function in elog.c that breaks log messages into chunks
>> for syslog.  I don't think anyone's ever looked hard at its performance
>> --- maybe there's an O(N^2) behavior?

> Thanx,
> i changed PG_SYSLOG_LIMIT in elog.c:1269 from 128 to 1048576
> and i got super fast stderr performance. :)

Doesn't seem like a very good solution given its impact on the stack
depth right there.

Looking at the code, the only bit that looks like repeated work are the
repeated calls to strchr(), which would not be an issue in the "typical"
case where the very long message contains reasonably frequent newlines.
Am I right in guessing that your problematic statement contained
megabytes worth of text with nary a newline?

If so, we can certainly fix it by arranging to remember the last
strchr() result across loop iterations, but I'd like to confirm the
theory before doing that work.

			regards, tom lane
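[For illustration, the fix Tom describes can be sketched as a chunking loop that measures the message once and remembers where the next newline is, instead of re-scanning the whole remainder with strchr() on every iteration. This is only a sketch of the idea, not the actual elog.c code: write_chunk() and CHUNK_LIMIT are hypothetical stand-ins for the syslog(3) call and PG_SYSLOG_LIMIT, and the real code also prefers to split at word boundaries.]

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK_LIMIT 1024        /* stand-in for PG_SYSLOG_LIMIT */

/* Hypothetical sink standing in for syslog(3); here it just counts chunks. */
static int chunks_written;

static void
write_chunk(const char *buf, size_t len)
{
    (void) buf;
    (void) len;
    chunks_written++;
}

/*
 * Split "line" into chunks of at most CHUNK_LIMIT bytes, preferring to
 * break at newlines.  The key point: the length is measured once, and
 * the position of the next newline ("nlpos") is remembered across
 * iterations, so a message with no newlines is scanned only once
 * overall instead of re-scanned from the current position on every
 * chunk -- which is what makes the naive version O(N^2).
 */
static void
split_message(const char *line)
{
    size_t      remaining = strlen(line);   /* measured once */
    const char *nlpos = strchr(line, '\n'); /* scanned once, then cached */

    while (remaining > 0)
    {
        size_t len;

        if (nlpos != NULL && (size_t) (nlpos - line) < CHUNK_LIMIT)
        {
            /* break at the newline, skip it, then look for the next one */
            len = (size_t) (nlpos - line);
            write_chunk(line, len);
            line = nlpos + 1;
            remaining -= len + 1;
            nlpos = strchr(line, '\n');
        }
        else
        {
            /* no nearby newline: emit a full-size chunk */
            len = remaining < CHUNK_LIMIT ? remaining : CHUNK_LIMIT;
            write_chunk(line, len);
            line += len;
            remaining -= len;
            /* nlpos stays valid: it points at or beyond the new "line" */
        }
    }
}
```

[With no newlines in the input, this makes a single pass over the message, so the cost grows linearly with message size rather than quadratically.]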
On Jul 8, 2008, at 8:24 AM, Achilleas Mantzios wrote:

> File sizes of about 3M result in actual logging output of ~ 10Mb.
> In this case, the INSERT *needs* 20 minutes to return. This is
> because the logging through syslog seems to severely slow the system.
> If instead, i use stderr, even with logging_collector=on, the same
> statement needs 15 seconds to return.

In syslog.conf is the destination for PG marked with a "-"? (ie
-/var/log/pg.log) which tells syslog to not sync after each line logged.
That could explain a large chunk of the difference in time.

--
Jeff Trout <jeff@jefftrout.com>
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/
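[For reference, the syslog.conf entry Jeff describes would look something like the fragment below. The facility and path are examples only; adjust them to match your postgresql.conf syslog_facility setting and your log location. The leading "-" on the file path is the traditional Linux sysklogd syntax for "don't fsync after each message".]

```conf
# /etc/syslog.conf (sysklogd-style) -- example entry for PostgreSQL
# The "-" prefix disables the fsync after every logged line.
local0.*    -/var/log/pg.log
```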
Jeff <threshar@threshar.is-a-geek.com> writes:
> On Jul 8, 2008, at 8:24 AM, Achilleas Mantzios wrote:
>> File sizes of about 3M result in actual logging output of ~ 10Mb.
>> In this case, the INSERT *needs* 20 minutes to return. This is
>> because the logging through syslog seems to severely slow the system.
>> If instead, i use stderr, even with logging_collector=on, the same
>> statement needs 15 seconds to return.

> In syslog.conf is the destination for PG marked with a "-"? (ie
> -/var/log/pg.log) which tells syslog to not sync after each line logged.
> That could explain a large chunk of the difference in time.

I experimented with this a bit here.  There definitely is an O(N^2)
penalty from the repeated strchr() calls, but it doesn't really start
to hurt till 1MB or so statement length.  Even with that patched,
syslog logging pretty much sucks performance-wise.  Here are the numbers
I got on a Fedora 8 workstation, testing the time to log a statement of
the form SELECT length('123456...lots of data, no newlines...7890');

	statement length                      1MB       10MB

	CVS HEAD                             2523ms    215588ms
	+ patch to fix repeated strchr        529ms     36734ms
	after turning off syslogd's fsync     569ms      5881ms
	PG_SYSLOG_LIMIT 1024, fsync on        216ms      2532ms
	PG_SYSLOG_LIMIT 1024, no fsync        242ms      2692ms

	For comparison purposes:
	logging statements to stderr          155ms      2042ms
	execute statement without logging      42ms       520ms

This machine is running a cheap IDE drive that caches writes, so
the lack of difference between fsync off and fsync on is not too
surprising --- on a machine with server-grade drives there'd be
a lot more difference.  (The fact that there is a difference in
the 10MB case probably reflects filling the drive's write cache.)

On my old HPUX machine, where fsync really works (and the syslogd
doesn't seem to allow turning it off), the 1MB case takes
195957ms with the strchr patch, 22922ms at PG_SYSLOG_LIMIT=1024.
So there's a fairly clear case to be made for fixing the repeated
strchr, but I also think that there's a case for jacking up
PG_SYSLOG_LIMIT.  As far as I can tell the current value of 128
was chosen *very* conservatively without thought for performance:
http://archives.postgresql.org/pgsql-hackers/2000-05/msg01242.php

At the time we were looking at evidence that the then-current
Linux syslogd got tummyache with messages over about 1KB:
http://archives.postgresql.org/pgsql-hackers/2000-05/msg00880.php

Some experimentation with the machines I have handy now says that

	Fedora 8:             truncates messages at 2KB (including syslog's header)
	HPUX 10.20 (ancient): ditto
	Mac OS X 10.5.3:      drops messages if longer than about 1900 bytes

So it appears to me that setting PG_SYSLOG_LIMIT = 1024 would be
perfectly safe on current systems (and probably old ones too),
and would give at least a factor of two speedup for logging long
strings --- more like a factor of 8 if syslogd is fsync'ing.

Comments?  Anyone know of systems where this is too high?
Perhaps we should make that change only in HEAD, not in the
back branches, or crank it up only to 512 in the back branches?

			regards, tom lane
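[Tom's factor-of-8 estimate for the fsync'ing case lines up with simple chunk arithmetic: each chunk is one syslog(3) write, and potentially one fsync, so the number of writes scales inversely with the chunk limit. A quick back-of-the-envelope check; the helper below is purely illustrative, not code from elog.c, and it ignores the real splitter's preference for newline and word boundaries.]

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of syslog(3) writes needed for a message of "len" bytes when
 * it is split into chunks of at most "limit" bytes.
 */
static size_t
chunks_needed(size_t len, size_t limit)
{
    return (len + limit - 1) / limit;   /* ceiling division */
}
```

[For a 10 MB message, a limit of 128 means 81920 writes while a limit of 1024 means 10240: an 8x reduction in writes, each of which may carry an fsync.]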
On Tue, 8 Jul 2008, Tom Lane wrote:

> I experimented with this a bit here.  There definitely is an O(N^2)
> penalty from the repeated strchr() calls, but it doesn't really start
> to hurt till 1MB or so statement length.  Even with that patched,
> syslog logging pretty much sucks performance-wise.
> [...benchmark numbers and PG_SYSLOG_LIMIT discussion snipped...]
> Comments?  Anyone know of systems where this is too high?

with linux ext2/ext3 filesystems I have seen similar problems when the
syslog starts getting large. there are several factors here

1. fsync after each write unless you have "-" in syslog.conf (only
available on linux AFAIK)

2. ext2/ext3 tend to be very inefficient when doing appends to large
files. each write requires that the syslog daemon seek to the end of the
file (because something else may have written to the file in the meantime),
and with the small block sizes and chaining of indirect blocks this can
start to be painful when logfiles get up into the MB range.

note that you see this same problem when you start to get lots of
files in one directory as well. even if you delete a lot of files, the
directory itself is still large and this can cause serious performance
problems. other filesystems are much less sensitive to file (and
directory) sizes.

my suggestion would be to first make sure you are doing async writes to
syslog, and then try putting the logfiles on different filesystems to see
how they differ. personally I use XFS most of the time where I expect lots
of files or large files.

David Lang
Tom Lane wrote:

> So there's a fairly clear case to be made for fixing the repeated
> strchr, but I also think that there's a case for jacking up
> PG_SYSLOG_LIMIT.
> [...]
> Comments?  Anyone know of systems where this is too high?
> Perhaps we should make that change only in HEAD, not in the
> back branches, or crank it up only to 512 in the back branches?

I'm a little bit worried about cranking up PG_SYSLOG_LIMIT in the back
branches. Cranking it up will definitely change the syslog message text
style and might confuse syslog-handling scripts (I have no evidence that
such scripts exist, though). So I suggest changing PG_SYSLOG_LIMIT only
in CVS HEAD.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
Tatsuo Ishii <ishii@postgresql.org> writes:
> I'm a little bit worried about cranking up PG_SYSLOG_LIMIT in the back
> branches. Cranking it up will definitely change the syslog message text
> style and might confuse syslog-handling scripts. So I suggest changing
> PG_SYSLOG_LIMIT only in CVS HEAD.

Hmm, good point.  It would be an externally visible behavior change,
not just a speedup.

			regards, tom lane
On Wednesday 09 July 2008 03:47:34, david@lang.hm wrote:

> with linux ext2/ext3 filesystems I have seen similar problems when the
> syslog starts getting large. there are several factors here
>
> 1. fsync after each write unless you have "-" in syslog.conf (only
> available on linux AFAIK)

In FreeBSD 7.0 by default it does not fsync (except for kernel messages),
unless the path is prefixed by "-" whereas it syncs.

> 2. ext2/ext3 tend to be very inefficient when doing appends to large
> files. each write requires that the syslog daemon seek to the end of the
> file (because something else may have written to the file in the meantime),
> and with the small block sizes and chaining of indirect blocks this can
> start to be painful when logfiles get up into the MB range.
>
> [...]
>
> my suggestion would be to first make sure you are doing async writes to
> syslog, and then try putting the logfiles on different filesystems to see
> how they differ. personally I use XFS most of the time where I expect lots
> of files or large files.
>
> David Lang
> In FreeBSD 7.0 by default it does not fsync (except for kernel messages),
> unless the path is prefixed by "-" whereas it syncs.

Sorry, scrap the above sentence. The correct statement is that FreeBSD 7.0
by default does not fsync(2) (except for kernel messages), and even in the
case of kernel messages, syncing can be bypassed by use of the "-" prefix.
So in our case of postgresql (local0 facility) it does not sync.
On Tuesday 08 July 2008 21:34:01, Tom Lane wrote:

> Achilleas Mantzios <achill@matrix.gatewaynet.com> writes:
>> On Tuesday 08 July 2008 17:35:16, Tom Lane wrote:
>>> Hmm.  There's a function in elog.c that breaks log messages into chunks
>>> for syslog.  I don't think anyone's ever looked hard at its performance
>>> --- maybe there's an O(N^2) behavior?
>
>> Thanx,
>> i changed PG_SYSLOG_LIMIT in elog.c:1269 from 128 to 1048576
>> and i got super fast stderr performance. :)
>
> Doesn't seem like a very good solution given its impact on the stack
> depth right there.
>
> Looking at the code, the only bit that looks like repeated work are the
> repeated calls to strchr(), which would not be an issue in the "typical"
> case where the very long message contains reasonably frequent newlines.
> Am I right in guessing that your problematic statement contained
> megabytes worth of text with nary a newline?

Yes, it was the input source of a bytea field containing tiff or pdf data,
of a size of some 1-3 megabytes, and it did not contain (many) newlines.

> If so, we can certainly fix it by arranging to remember the last
> strchr() result across loop iterations, but I'd like to confirm the
> theory before doing that work.
>
> 			regards, tom lane