Re: performance for high-volume log insertion - Mailing list pgsql-performance

From Kenneth Marshall
Subject Re: performance for high-volume log insertion
Date
Msg-id 20090421183422.GI18845@it.is.rice.edu
In response to Re: performance for high-volume log insertion  (david@lang.hm)
List pgsql-performance
On Tue, Apr 21, 2009 at 11:09:18AM -0700, david@lang.hm wrote:
> On Tue, 21 Apr 2009, Greg Smith wrote:
>
>> On Mon, 20 Apr 2009, david@lang.hm wrote:
>>
>>> while I fully understand the 'benchmark your situation' need, this isn't
>>> that simple.  in this case we are trying to decide what API/interface to
>>> use in an infrastructure tool that will be distributed in common distros
>>> (it's now the default syslog package of debian and fedora). there are so
>>> many variables in hardware, software, and load that trying to benchmark
>>> it becomes effectively impossible.
>>
>> From your later comments, you're wandering a bit outside of what you were
>> asking about here.  Benchmarking the *query* side of things can be
>> extremely complicated.  You have to worry about memory allocation, cold
>> vs. warm cache, scale of database relative to RAM, etc.
>>
>> You were asking specifically about *insert* performance, which isn't
>> nearly as complicated.  There are basically three setups:
>>
>> 1) Disk/controller has a proper write cache.  Writes and fsync will be
>> fast. You can insert a few thousand individual transactions per second.
>>
>> 2) Disk/controller has a "lying" write cache.  Writes and fsync will be
>> fast, but it's not safe for database use.  But since (1) is expensive and
>> you can get this one for free just by using a regular SATA drive with its
>> write cache enabled, you can use this case as a proxy for approximately
>> how (1) would act.  You'll still get a few thousand transactions per
>> second, though sustained writes may slow down relative to (1) if you
>> insert enough that you hit a checkpoint (triggering lots of random I/O).
>>
>> 3) All write caches have been disabled because they were not
>> battery-backed. This is the case if you have a regular SATA drive and you
>> disable its write cache because you care about write durability.  You'll
>> get a bit less than RPM/60 writes/second, so <120 inserts/second with a
>> typical 7200RPM drive. Here batching multiple INSERTs together is critical
>> to get any sort of reasonable performance.
>
> in case #1 would you expect to get significant gains from batching? doesn't
> it suffer from problems similar to #2 when checkpoints hit?
>
Even with a disk controller with a proper write cache, the latency for
single-insert-at-a-time will keep the number of updates to the low
thousands per second (on the controllers I have used). If you can batch
them, it would not be unreasonable to increase performance by an order
of magnitude or more. At the high end, other issues like CPU usage can
restrict performance.
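
To make the batching concrete, here is a minimal sketch (the "log" table
and its columns are hypothetical, just for illustration).  The point is
that the rows in the second and third variants share a single commit, so
the fsync cost is paid once per batch instead of once per row:

    -- one commit (and one fsync) per row, in autocommit mode
    INSERT INTO log (ts, msg) VALUES (now(), 'line 1');
    INSERT INTO log (ts, msg) VALUES (now(), 'line 2');

    -- batched: many rows, one commit
    BEGIN;
    INSERT INTO log (ts, msg) VALUES (now(), 'line 1');
    INSERT INTO log (ts, msg) VALUES (now(), 'line 2');
    -- ... more rows ...
    COMMIT;

    -- or a single multi-row INSERT (8.2+), which also saves
    -- per-statement overhead
    INSERT INTO log (ts, msg) VALUES
      (now(), 'line 1'),
      (now(), 'line 2');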

Ken
>> In (3), I'd expect that trivia like INSERT vs. COPY and COPY BINARY vs.
>> COPY TEXT would be overwhelmed by the overhead of the commit itself.
>> Therefore you probably want to test with case (2) instead, as it doesn't
>> require any additional hardware but has similar performance to a
>> production-worthy (1). All of the other things you're worried about really
>> don't matter here; you can get an approximate measure of the performance
>> of the various INSERT/COPY schemes that is somewhat platform dependent,
>> but the results should be good enough to give you some
>> rule of thumb suggestions for whether optimizations are significant enough
>> to justify the coding effort to implement them or not.
>
> I'll see about setting up a test in the next day or so. should I be able to
> script this through psql? or do I need to write a C program to test this?
>
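psql should be enough for a rough comparison.  Something along these lines
would do it (the "logtest" table, file names, and row counts here are
placeholders, not anything from this thread):

    -- setup, run once
    CREATE TABLE logtest (ts timestamptz, msg text);

    # variant 1: a file of N single INSERT statements, run in autocommit mode
    psql -f inserts.sql testdb

    # variant 2: the same file wrapped in one transaction
    psql -1 -f inserts.sql testdb

    # variant 3: the same rows loaded with COPY from a tab-separated file
    psql -c "\copy logtest from 'rows.txt'" testdb

Turn on \timing in psql (or just time the commands from the shell) and
compare; the relative differences are what matter, not the absolute
numbers.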
>> I'm not sure whether you're familiar with all the fsync trivia here.  In
>> normal syslog use, there's an fsync call after every write.  You can
>> disable that by placing a "-" before the file name in /etc/syslog.conf.  The
>> thing that is going to make database-based writes very different is that
>> syslog's fsync'd writes are unlikely to leave you in a bad state if the
>> drive lies about them, while database writes can.  So someone using syslog
>> on a standard SATA drive isn't getting the write guarantee they think they
>> are, but the downside on a crash is minimal.  If you've got a high-volume
>> syslog environment (>100 lines/second), you can't support those as
>> individual database writes unless you've got a battery-backed write
>> controller.  A regular disk just can't process genuine fsync calls any
>> faster than that.  A serious syslog deployment that turns fsync on and
>> expects it to really do its thing is already exposed to this issue though.
>> I think it may be the case that a lot of people think they have durable
>> writes in their configuration but really don't.
>
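For reference, the "-" convention in the classic sysklogd syntax looks like
this (the facility and path are just examples):

    # synced after every line written
    mail.*          /var/log/mail.log
    # "-" prefix: buffered, no fsync per line
    mail.*          -/var/log/mail.log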
> rsyslog is a little different: instead of just input -> disk, it does input
> -> queue -> output (where output can be many things, including disk or
> database)
>
> its default is to use memory-based queues (and no fsync), but it has config
> options to do disk-based queues with an fsync after each update
>
> David Lang
>
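For anyone reading along, a rough sketch of what David is describing, in
rsyslog's legacy directive syntax (directive names as I remember them from
the v3 docs; the connection parameters are placeholders, so check the
ompgsql and queue documentation for the exact syntax):

    $ModLoad ompgsql
    # per-action queue kept on disk, with queue files synced after each write
    $ActionQueueType Disk
    $ActionQueueFileName pgsqlq
    $ActionQueueSyncQueueFiles on
    *.*    :ompgsql:dbhost,syslogdb,dbuser,dbpassword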
