Thread: Configuring PostgreSQL to minimize impact of checkpoints

Configuring PostgreSQL to minimize impact of checkpoints

From
jao@geophile.com
Date:
My company is developing a PostgreSQL application. We're using 7.3.4
but will soon upgrade to 7.4.x. Our OS is RedHat 9.  Our production
machines have 512 MB RAM and IDE disks. So far we've been using
default configuration settings, but I have started to examine
performance and to modify these settings.

Our typical transaction involves 5-10 SELECTs, INSERTs, or UPDATEs
(usually about half SELECTs, the remainder a mixture of INSERTs and UPDATEs).
There are a few aggregation queries which need to scan an entire
table.  We observed highly uneven performance for the small
transactions. A transaction usually runs in under 100 msec, but we
would see spikes as high as 40,000 msec. These spikes occurred
regularly, every 4-5 minutes, and I speculated that checkpointing
might be the issue.

I created a test case, based on a single table:

        create table test(
            id int not null,
            count int not null,
            filler varchar(200),
            primary key(id));

I loaded a database with 1,000,000 rows, with the filler column always
filled with 200 characters.
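
(The loading method isn't shown in the thread; one minimal sketch,
assuming a client-side generated data file, would be:)

    -- hypothetical load: 1,000,000 rows, count starting at 0, filler
    -- padded to 200 characters, generated client-side, e.g. via
    --   perl -e 'for $i (1..1000000) { print qq($i\t0\t), "x" x 200, "\n" }' > /tmp/test.dat
    copy test from '/tmp/test.dat';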

I then ran a test in which a random row was selected, and the count
column incremented. Each transaction contained ten such updates. In
this test, I set

    shared_buffers = 2000
    checkpoint_segments = 40
    checkpoint_timeout = 600
    wal_debug = 1

I set checkpoint_segments high because I wanted to see whether the
spikes correlated with checkpoints.
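
Each transaction in the driver was equivalent to the following (a
sketch; the ids shown are stand-ins for the randomly chosen ones):

    begin;
    update test set count = count + 1 where id = 123456;
    update test set count = count + 1 where id = 987;
    -- ... eight more single-row updates against random ids ...
    commit;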

Most transactions completed in under 60 msec. Approximately every 10th
transaction, the time went up to 500-600 msec (which is puzzling, but
not my major concern). I did see a spike every 10 minutes, in which
transaction time went up to 5000-8000 msec. The spikes were correlated
with checkpoint activity, occurring slightly before a log entry that
looks like this:

    2004-05-09 16:34:19 LOG:  INSERT @ 2/C2A0F628: prev 2/C2A0F5EC;
    xprev 0/0; xid 0: XLOG - checkpoint: redo 2/C2984D4C; undo 0/0;
    sui 36; xid 1369741; oid 6321782; online

Questions:

1. Can someone provide an overview of checkpoint processing, to help
me understand the performance issues?

2. Is the spike due to the checkpoint process keeping the disk busy?
Or is there some locking involved that blocks my application until the
checkpoint completes?

3. The spikes are quite problematic for us. What can I do to minimize
the impact of checkpointing on my application? I understand how
checkpoint_segments and checkpoint_timeout determine when a checkpoint
occurs; what can I do to lessen the impact of a checkpoint?

4. I understand that a "background writer" is being contemplated for
7.5.  Will that replace or augment the checkpoint process?  Any
comments on how that work will apply to my problem would be
appreciated. I wouldn't mind seeing the average transaction time
(without the spikes) go up by, let's say, 10% in exchange for
more uniform performance. These spikes are a real problem.

Jack Orenstein


Re: Configuring PostgreSQL to minimize impact of checkpoints

From
Bruce Momjian
Date:
jao@geophile.com wrote:
> 4. I understand that a "background writer" is being contemplated for
> 7.5.  Will that replace or augment the checkpoint process?  Any
> comments on how that work will apply to my problem would be
> appreciated. I wouldn't mind seeing the average transaction time
> (without the spikes) go up by, let's say, 10% in exchange for
> more uniform performance. These spikes are a real problem.

The background writer is designed to address your specific problem.  We
will still checkpoint, but the spikes should be greatly reduced.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: Configuring PostgreSQL to minimize impact of checkpoints

From
Jack Orenstein
Date:
Bruce Momjian wrote:
> jao@geophile.com wrote:
>
>>4. I understand that a "background writer" is being contemplated for
>>7.5.  Will that replace or augment the checkpoint process?  Any
>>comments on how that work will apply to my problem would be
>>appreciated. I wouldn't mind seeing the average transaction time
>>(without the spikes) go up by, let's say, 10% in exchange for
>>more uniform performance. These spikes are a real problem.
>
>
> The background writer is designed to address your specific problem.  We
> will still checkpoint, but the spikes should be greatly reduced.
>

Thanks. Do you know when 7.5 is expected to be released?

Until then, is a workaround known? Also, are the delays I'm seeing out of the ordinary?
I'm looking at one case in which two successive transactions, each updating a handful of
records, take 26 and 18 *seconds* (not msec) to complete. These transactions normally complete
in under 30 msec.

Jack Orenstein


Re: Configuring PostgreSQL to minimize impact of checkpoints

From
Bruce Momjian
Date:
Jack Orenstein wrote:
> Bruce Momjian wrote:
> > jao@geophile.com wrote:
> >
> >>4. I understand that a "background writer" is being contemplated for
> >>7.5.  Will that replace or augment the checkpoint process?  Any
> >>comments on how that work will apply to my problem would be
> >>appreciated. I wouldn't mind seeing the average transaction time
> >>(without the spikes) go up by, let's say, 10% in exchange for
> >>more uniform performance. These spikes are a real problem.
> >
> >
> > The background writer is designed to address your specific problem.  We
> > will still checkpoint, but the spikes should be greatly reduced.
> >
>
> Thanks. Do you know when 7.5 is expected to be released?

3-6 months.

> Until then, is a workaround known? Also, are the delays I'm seeing out of the ordinary?
> I'm looking at one case in which two successive transactions, each updating a handful of
> records, take 26 and 18 *seconds* (not msec) to complete. These transactions normally complete
> in under 30 msec.

Wow.  Others might know the answer to that.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: Configuring PostgreSQL to minimize impact of checkpoints

From
Tom Lane
Date:
Jack Orenstein <jao@geophile.com> writes:
> I'm looking at one case in which two successive transactions, each
> updating a handful of records, take 26 and 18 *seconds* (not msec) to
> complete. These transactions normally complete in under 30 msec.

I've seen installations in which it seemed that the "normal" query load
was close to saturating the available disk bandwidth, and the extra load
imposed by a background VACUUM just pushed the entire system's response
time over a cliff.  In an installation that has I/O capacity to spare,
a VACUUM doesn't really hurt foreground query response at all.

I suspect that the same observations hold true for checkpoints, though
I haven't specifically seen an installation suffering from that effect.

Already-committed changes for 7.5 include a background writer, which
basically will "trickle" out dirty pages between checkpoints, thereby
hopefully reducing the volume of I/O forced at a checkpoint.  We have
also got code in place that throttles the rate of I/O requests during
VACUUM.  It seems like it might be useful to similarly throttle the I/O
request rate during a CHECKPOINT, though I'm not sure if there'd be any
bad side effects from lengthening the elapsed time for a checkpoint.
(Jan, any thoughts here?)
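
(For the record, the knobs for these two mechanisms as they eventually
shipped in 8.0 look like the following; the names postdate this thread
and did not exist in 7.3/7.4, so treat this as a forward-looking sketch
with example values, not defaults:)

    # postgresql.conf, 8.0-era settings
    bgwriter_delay = 200       # ms between background writer rounds
    bgwriter_percent = 1       # % of dirty buffers written per round
    bgwriter_maxpages = 100    # cap on pages written per round
    vacuum_cost_delay = 10     # ms to sleep once the cost limit is hit
    vacuum_cost_limit = 200    # accumulated I/O cost that triggers a sleep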

None of this is necessarily going to fix matters for an installation
that has no spare I/O capacity, though.  And from the numbers you're
quoting I fear you may be in that category.  "Buy faster disks" may
be the only answer ...

            regards, tom lane

Re: Configuring PostgreSQL to minimize impact of checkpoints

From
"Matthew Nuzum"
Date:
>
> Jack Orenstein <jao@geophile.com> writes:
> > I'm looking at one case in which two successive transactions, each
> > updating a handful of records, take 26 and 18 *seconds* (not msec) to
> > complete. These transactions normally complete in under 30 msec.
...
> None of this is necessarily going to fix matters for an installation
> that has no spare I/O capacity, though.  And from the numbers you're
> quoting I fear you may be in that category.  "Buy faster disks" may
> be the only answer ...
>

I had a computer once that had an out-of-the-box hard drive configuration
that provided horrible disk performance.  I found a tutorial at O'Reilly
that explained how to use hdparm to dramatically speed up disk performance
on Linux.  I've noticed on other computers I've set up recently that hdparm
seems to be used by default out of the box to give good performance.

Maybe your computer is using all of its I/O capacity because it's using PIO
mode or some other non-optimal method of accessing the disk.
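
A quick way to check (a sketch; substitute your actual IDE device for
/dev/hda):

    # show current settings and benchmark raw read throughput
    hdparm /dev/hda
    hdparm -Tt /dev/hda
    # if DMA or 32-bit I/O is off, try enabling them (test carefully)
    hdparm -d1 -c1 /dev/hda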

Just a suggestion, I hope it helps,

Matthew Nuzum        | ISPs: Make $200 - $5,000 per referral by
www.followers.net        | recommending Elite CMS to your customers!
matt@followers.net    | http://www.followers.net/isp


Re: Configuring PostgreSQL to minimize impact of checkpoints

From
Tom Lane
Date:
Jan Wieck <JanWieck@Yahoo.com> writes:
> If we would combine the background writer and the checkpointer,

... which in fact is on my agenda of things to do ...

> then a
> "checkpoint flush" could actually be implemented as a temporary change
> in that activity: basically, stop reevaluating the list of
> to-be-flushed blocks and switch to a constant number of
> blocks flushed per cycle. When that list gets empty, the checkpoint
> flush is done, the checkpoint can complete, and the background writer
> resumes normal business.

Sounds like a plan.  I'll do it that way.  However, we might want to
have different configuration settings controlling the write rate during
checkpoint and the rate during normal background writing --- what do you
think?

Also, presumably a shutdown checkpoint should just whomp out the data
without any delays.  We can't afford to wait around and risk having
init decide we took too long.

>> None of this is necessarily going to fix matters for an installation
>> that has no spare I/O capacity, though.

> As a matter of fact, the background writer increases the overall IO. It
> writes buffers that may get modified again before a checkpoint or
> their replacement would have required them to be written at all. So if
> there is no spare IO bandwidth, it makes things worse.

Right, the trickle writes could be wasted effort.

            regards, tom lane

Re: Configuring PostgreSQL to minimize impact of checkpoints

From
Rob Fielding
Date:
Matthew Nuzum wrote:
>>Jack Orenstein <jao@geophile.com> writes:
>>
>>>I'm looking at one case in which two successive transactions, each
>>>updating a handful of records, take 26 and 18 *seconds* (not msec) to
>>>complete. These transactions normally complete in under 30 msec.

>>None of this is necessarily going to fix matters for an installation
>>that has no spare I/O capacity, though.  And from the numbers you're
>>quoting I fear you may be in that category.  "Buy faster disks" may
>>be the only answer ...
>>

> I had a computer once that had an out-of-the-box hard drive configuration
> that provided horrible disk performance.  I found a tutorial at O'Reilly
> that explained how to use hdparm to dramatically speed up disk performance
> on Linux.  I've noticed on other computers I've set up recently that hdparm
> seems to be used by default out of the box to give good performance.
>
> Maybe your computer is using all of its I/O capacity because it's using PIO
> mode or some other non-optimal method of accessing the disk.

There's certainly some scope there. I have an SGI Octane whose SCSI 2
disks were set up by default with no write buffer and a CTQ depth of
zero :/ The IDE drivers in Linux may not be detecting your IDE chipset
correctly and stepping down; however unlikely, there may be something
odd going on, so you could check hdparm out. Ensure the cables are
correct too, and that they aren't crushed or twisted too badly... I
digress...

Assuming you're running with optimal schema and index design (ie you're
not doing extra work unnecessarily), and your backend has
better-than-default config options set up (plenty of tips around here),
then disk arrangement is critical to smoothing the ride.

Taking things to a relative extreme, we implemented a set-up with
issues similar-sounding to yours. It was resolved by first optimising
everything but hardware, then finally optimising hardware. This served
us because it meant we squeezed as much as possible out of the available
hardware before finally throwing more at it, getting the best possible
returns (plus further post-optimisation on the new hardware).

First tip would be to take your pg_xlog and put it on another disk (and
channel); a sketch of the move is below. Next, if you're running a
journalled fs, get that journal off onto another disk (and channel).
Finally, get as many disks as you can for the data store and spread the
load across spindles. You're aiming here to distribute the contention
and disk I/O more evenly to remove the congestion. sar and iostat help
out as part of the analysis.
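
The move itself is a stop/move/symlink job, roughly (paths here are
hypothetical; adjust to your installation):

    # stop the postmaster before touching pg_xlog
    pg_ctl stop -D /var/lib/pgsql/data
    mv /var/lib/pgsql/data/pg_xlog /disk2/pg_xlog
    ln -s /disk2/pg_xlog /var/lib/pgsql/data/pg_xlog
    pg_ctl start -D /var/lib/pgsql/data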

You say you're using IDE, for which I'd highly recommend switching to
SCSI and multiple controllers, because IDE isn't great for lots of other
reasons. Obviously budgets count, and playing with SCSI certainly
stretches them. We took a total of 8 disks across 2 SCSI 160 channels
and split the drives into a number of software RAID arrays: RAID1
mirrors for the os, pg_xlog, data disk journal, and swap, and the rest
became a RAID5 array for the data. You could instead implement your DATA
disk as RAID1+0 if you wanted more perf at the cost of free space.
Anyway, it's certainly not the fastest config out there, but it made all
the difference to this particular application. In fact, we had so much
free I/O we recently installed another app on there (based on mysql,
sorry) which runs concurrently, and itself runs 4 times faster than it
originally did...

YMMV, just my 2p.

--

Rob Fielding
rob@dsvr.net

www.dsvr.co.uk              Development             Designer Servers Ltd


Re: Configuring PostgreSQL to minimize impact of checkpoints

From
jao@geophile.com
Date:
Quoting Rob Fielding <rob@dsvr.net>:

> Assuming you're running with optimal schema and index design (ie you're
> not doing extra work unnecessarily), and your backend has
> better-than-default config options set up (plenty of tips around here),
> then disk arrangement is critical to smoothing the ride.

The schema and queries are extremely simple. I've been experimenting
with config options. One possibility I'm looking into is whether
shared_buffers is too high, at 12000. We have some preliminary evidence
that setting it lower (1000) reduces the demand for IO bandwidth to
a point where the spikes become almost tolerable.

> First tip would be to take your pg_xlog and put it on another disk (and
> channel).

That's on my list of things to try.

> Next if you're running a journalled fs, get that journal off
> onto another disk (and channel). Finally, get as many disks for the data
> store and spread the load across spindles.

Dumb question: how do I spread the data across spindles? Do you have
a pointer to something I could read?

Jack Orenstein


Re: Configuring PostgreSQL to minimize impact of checkpoints

From
"scott.marlowe"
Date:
On Tue, 11 May 2004 jao@geophile.com wrote:

> Quoting Rob Fielding <rob@dsvr.net>:
>
> > Assuming you're running with optimal schema and index design (ie you're
> > not doing extra work unnecessarily), and your backend has
> > better-than-default config options set up (plenty of tips around here),
> > then disk arrangement is critical to smoothing the ride.
>
> The schema and queries are extremely simple. I've been experimenting
> with config options. One possibility I'm looking into is whether
> shared_buffers is too high, at 12000. We have some preliminary evidence
> that setting it lower (1000) reduces the demand for IO bandwidth to
> a point where the spikes become almost tolerable.

If the shared_buffers setting is large, postgresql seems to have a
performance issue handling the buffers. Plus, large settings can cause
the kernel to dump cache on things that would otherwise be right there,
and therefore force the database to hit the drives. You might want to
try settings between 1000 and 10000 and see where your sweet spot is.
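
Concretely (a sketch; in 7.x shared_buffers is counted in 8 kB pages, so
1000 is roughly 8 MB and 10000 roughly 80 MB):

    # postgresql.conf -- try each value, restart, and benchmark latency
    shared_buffers = 1000      # ~8 MB
    #shared_buffers = 2000     # ~16 MB
    #shared_buffers = 5000     # ~40 MB
    #shared_buffers = 10000    # ~80 MB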

> > First tip would be to take your pg_xlog and put it on another disk (and
> > channel).
>
> That's on my list of things to try.
>
> > Next if you're running a journalled fs, get that journal off
> > onto another disk (and channel). Finally, get as many disks for the data
> > store and spread the load across spindles.
>
> Dumb question: how do I spread the data across spindles? Do you have
> a pointer to something I could read?

Look into a high-quality hardware RAID controller with battery-backed
cache on board.  We use the AMI/LSI MegaRAID and I'm quite pleased with
its writing performance.

How you configure your drives is up to you.  For smaller numbers of
drives (6 or fewer) RAID 1+0 is usually a clear winner.  For medium
numbers of drives, say 8 to 20, RAID 5 works well.  For more drives than
that, many folks report RAID 5+0 or 0+5 to work well.

I've only played around with 12 or fewer drives, so when I say RAID 5+0
is a good choice it's not from my own experience; I'm just reflecting
back what I've heard here on the performance mailing list.

If you're not doing much writing, then a software RAID may be a good
intermediate solution; in particular, RAID1 with >2 disks under Linux
seems a good setup for a mostly-read database.
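
Under Linux, a three-disk mirror is a one-liner with mdadm (a sketch;
device names are hypothetical). Every member holds a full copy, so reads
can be spread across all the spindles while each write goes to all of
them:

    mdadm --create /dev/md0 --level=1 --raid-devices=3 \
          /dev/hde1 /dev/hdg1 /dev/hdi1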


Re: Configuring PostgreSQL to minimize impact of checkpoints

From
Paul Tuckfield
Date:
The king of statistics in these cases is probably vmstat.  One can
drill down on specific things from there, but first you should send
some vmstat output.

That reducing the cache reduced IO suggests to me the OS might be paging
out shared buffers.  This is indicated by activity in the "si" and "so"
columns of vmstat.  Intentional disk activity by the application
(postgres) shows up in the "bi" and "bo" columns.
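
Concretely (a sketch of the command; the columns are standard vmstat
output):

    # one sample every 10 seconds; si/so = swap-in/swap-out (paging),
    # bi/bo = blocks in/out (actual disk I/O)
    vmstat 10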

If you are having a "write storm" or bursty writes that are burying
performance, a SCSI RAID controller with write-back cache will greatly
improve the situation, but I do believe they run around $1-2k.  If it's
a write-specific problem, the cache matters more than the striping,
except to say that write-specific perf problems should avoid RAID5.

Please send the output of "vmstat 10" for about 10 minutes, spanning
good performance and bad performance.





On May 11, 2004, at 9:52 AM, jao@geophile.com wrote:

> Quoting Rob Fielding <rob@dsvr.net>:
>
>> Assuming you're running with optimal schema and index design (ie
>> you're
>> not doing extra work unnecessarily), and your backend has
>> better-than-default config options set up (plenty of tips around
>> here),
>> then disk arrangement is critical to smoothing the ride.
>
> The schema and queries are extremely simple. I've been experimenting
> with config options. One possibility I'm looking into is whether
> shared_buffers is too high, at 12000. We have some preliminary evidence
> that setting it lower (1000) reduces the demand for IO bandwidth to
> a point where the spikes become almost tolerable.
>
>> First tip would be to take your pg_xlog and put it on another disk (and
>> channel).
>
> That's on my list of things to try.
>
>> Next if you're running a journalled fs, get that journal off
>> onto another disk (and channel). Finally, get as many disks for the
>> data
>> store and spread the load across spindles.
>
> Dumb question: how do I spread the data across spindles? Do you have
> a pointer to something I could read?
>
> Jack Orenstein
>


Re: Configuring PostgreSQL to minimize impact of checkpoints

From
"scott.marlowe"
Date:
On Tue, 11 May 2004, Paul Tuckfield wrote:

> If you are having a "write storm" or bursty writes that are burying
> performance, a SCSI RAID controller with write-back cache will greatly
> improve the situation, but I do believe they run around $1-2k.  If it's
> a write-specific problem, the cache matters more than the striping,
> except to say that write-specific perf problems should avoid RAID5.

Actually, a single channel MegaRAID 320-1 (single channel Ultra 320) is
only $421 at http://www.siliconmechanics.com/c248/u320-scsi.php  It has
worked pretty well for me, having run a production server on one for 6
months with zero hiccups and very good performance.  They have a dual
channel Intel card for only $503, but I'm not at all familiar with that
card.

The top-of-the-line MegaRAID is the 320-4, which is only $1240, which
ain't bad for a four-channel RAID controller.

Battery-backed cache is an add-on, but I think it's only about $80 or so.


Re: Configuring PostgreSQL to minimize impact of checkpoints

From
"Rob Sell"
Date:

-----Original Message-----
From: pgsql-performance-owner@postgresql.org
[mailto:pgsql-performance-owner@postgresql.org] On Behalf Of scott.marlowe
Sent: Tuesday, May 11, 2004 2:23 PM
To: Paul Tuckfield
Cc: jao@geophile.com; Matthew Nuzum; pgsql-performance@postgresql.org; Rob
Fielding
Subject: Re: [PERFORM] Configuring PostgreSQL to minimize impact of
checkpoints

On Tue, 11 May 2004, Paul Tuckfield wrote:

> If you are having a "write storm" or bursty writes that are burying
> performance, a SCSI RAID controller with write-back cache will greatly
> improve the situation, but I do believe they run around $1-2k.  If it's
> a write-specific problem, the cache matters more than the striping,
> except to say that write-specific perf problems should avoid RAID5.

Actually, a single channel MegaRAID 320-1 (single channel Ultra 320) is
only $421 at http://www.siliconmechanics.com/c248/u320-scsi.php  It has
worked pretty well for me, having run a production server on one for 6
months with zero hiccups and very good performance.  They have a dual
channel Intel card for only $503, but I'm not at all familiar with that
card.

The top-of-the-line MegaRAID is the 320-4, which is only $1240, which
ain't bad for a four-channel RAID controller.

Battery-backed cache is an add-on, but I think it's only about $80 or so.



-----------------------------

If you don't mind slumming on eBay :-) keep an eye out for PERC III
cards; they are Dell-branded LSI cards (PERC = PowerEdge RAID
Controller). There are dual channel U320 models on there, and Dell
usually sells them with battery-backed cache.  That's how I have
acquired all my high-end RAID cards.

Rob


Re: Configuring PostgreSQL to minimize impact of checkpoints

From
Paul Tuckfield
Date:
Love that froogle.

It looks like a nice card.  One thing I didn't get straight is whether
the cache is write-through or write-back.

If the original poster's problem is truly a burst-write problem (and
not Linux caching or virtual memory overcommitment), then write-back is
key.




> On Tue, 11 May 2004, Paul Tuckfield wrote:
>
>> If you are having a "write storm" or bursty writes that are burying
>> performance, a SCSI RAID controller with write-back cache will greatly
>> improve the situation, but I do believe they run around $1-2k.  If it's
>> a write-specific problem, the cache matters more than the striping,
>> except to say that write-specific perf problems should avoid RAID5.
>
> Actually, a single channel MegaRAID 320-1 (single channel Ultra 320) is
> only $421 at http://www.siliconmechanics.com/c248/u320-scsi.php  It has
> worked pretty well for me, having run a production server on one for 6
> months with zero hiccups and very good performance.  They have a dual
> channel Intel card for only $503, but I'm not at all familiar with that
> card.
>
> The top-of-the-line MegaRAID is the 320-4, which is only $1240, which
> ain't bad for a four-channel RAID controller.
>
> Battery-backed cache is an add-on, but I think it's only about $80 or
> so.
>
>


Re: Configuring PostgreSQL to minimize impact of checkpoints

From
"scott.marlowe"
Date:
On Tue, 11 May 2004, Paul Tuckfield wrote:

> Love that froogle.
>
> It looks like a nice card.  One thing I didn't get straight is whether
> the cache is write-through or write-back.
>
> If the original poster's problem is truly a burst-write problem (and
> not Linux caching or virtual memory overcommitment), then write-back is
> key.

The MegaRAID can be configured either way.  It defaults to write-back
if the battery-backed cache is present, I believe.


Re: Configuring PostgreSQL to minimize impact of checkpoints

From
"scott.marlowe"
Date:
On Tue, 11 May 2004, Rob Sell wrote:


> If you don't mind slumming on eBay :-) keep an eye out for PERC III
> cards; they are Dell-branded LSI cards (PERC = PowerEdge RAID
> Controller). There are dual channel U320 models on there, and Dell
> usually sells them with battery-backed cache.  That's how I have
> acquired all my high-end RAID cards.

Not all PERC3s are LSI; many are Adaptec.  The PERC3/Di is Adaptec, the
PERC3/DC is LSI/MegaRAID.


Re: Configuring PostgreSQL to minimize impact of checkpoints

From
"J. Andrew Rogers"
Date:
On Tue, 2004-05-11 at 14:52, Paul Tuckfield wrote:
> Love that froogle.
>
> It looks like a nice card.  One thing I didn't get straight is whether
> the cache is write-through or write-back.


The LSI MegaRAID reading/writing/caching behavior is user configurable.
It will support both write-back and write-through, and IIRC, three
different algorithms for reading (none, read-ahead, adaptive).  Plenty
of configuration options.

It is a pretty mature and feature complete hardware RAID implementation.


j. andrew rogers


Re: Configuring PostgreSQL to minimize impact of checkpoints

From
Vivek Khera
Date:
>>>>> "TL" == Tom Lane <tgl@sss.pgh.pa.us> writes:

TL> Jack Orenstein <jao@geophile.com> writes:
>> I'm looking at one case in which two successive transactions, each
>> updating a handful of records, take 26 and 18 *seconds* (not msec) to
>> complete. These transactions normally complete in under 30 msec.

TL> I've seen installations in which it seemed that the "normal" query load
TL> was close to saturating the available disk bandwidth, and the extra load
TL> imposed by a background VACUUM just pushed the entire system's response
TL> time over a cliff.  In an installation that has I/O capacity to spare,

me stand up waving hand... ;-)  This is my only killer problem left.
I always peg my disk usage at 100% when vacuum runs, and other queries
are slow too.  When not running vacuum, my queries are incredibly
zippy fast, including joins and counts and group by's on upwards of
100k rows at a time.

TL> I suspect that the same observations hold true for checkpoints, though
TL> I haven't specifically seen an installation suffering from that effect.

I don't see that.  But I also set checkpoint segments to about 50 on
my big server.



--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.                Khera Communications, Inc.
Internet: khera@kciLink.com       Rockville, MD  +1-301-869-4449 x806
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/

Re: Configuring PostgreSQL to minimize impact of checkpoints

From
Vivek Khera
Date:
>>>>> "JAR" == J Andrew Rogers <jrogers@neopolitan.com> writes:


JAR> The LSI MegaRAID reading/writing/caching behavior is user configurable.
JAR> It will support both write-back and write-through, and IIRC, three
JAR> different algorithms for reading (none, read-ahead, adaptive).  Plenty
JAR> of configuration options.

For PG max performance, you want to set it for write-back and
read-ahead (adaptive has been claimed to be bad, but I got similar
performance from read-ahead and adaptive, so YMMV).


--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.                Khera Communications, Inc.
Internet: khera@kciLink.com       Rockville, MD  +1-301-869-4449 x806
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/

Re: Configuring PostgreSQL to minimize impact of checkpoints

From
jao@geophile.com
Date:
Quoting Vivek Khera <khera@kcilink.com>:

> >>>>> "TL" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>
> TL> Jack Orenstein <jao@geophile.com> writes:
> >> I'm looking at one case in which two successive transactions, each
> >> updating a handful of records, take 26 and 18 *seconds* (not msec) to
> >> complete. These transactions normally complete in under 30 msec.
>
> TL> I've seen installations in which it seemed that the "normal" query load
> TL> was close to saturating the available disk bandwidth, and the extra load
> TL> imposed by a background VACUUM just pushed the entire system's response
> TL> time over a cliff.  In an installation that has I/O capacity to spare,
> ...
> TL> I suspect that the same observations hold true for checkpoints, though
> TL> I haven't specifically seen an installation suffering from that effect.
>
> I don't see that.  But I also set checkpoint segments to about 50 on
> my big server.

But wouldn't that affect checkpoint frequency, not checkpoint cost?

Jack Orenstein



Re: Configuring PostgreSQL to minimize impact of checkpoints

From
Vivek Khera
Date:
On May 12, 2004, at 3:22 PM, jao@geophile.com wrote:

>>
>> I don't see that.  But I also set checkpoint segments to about 50 on
>> my big server.
>
> But wouldn't that affect checkpoint frequency, not checkpoint cost?

Seems reasonable.  I suppose checkpointing doesn't cost as much disk
I/O as vacuum does.  My checkpoints are also on a separate RAID volume
on a separate RAID channel, so perhaps that gives me extra bandwidth to
perform the checkpoints.