Thread: Trying to minimize the impact of checkpoints (resend)

Trying to minimize the impact of checkpoints (resend)

From
jao@geophile.com
Date:
[Sorry if this is a repeat. I think the first message may have
been rejected due to an attachment.]

I'm using PostgreSQL 7.3.4 on RH9. Data and logs are on separate
disks. (These are low-end IDE disks. That part of the problem
is out of my control.)

When a checkpoint occurs, all operations slow way, way down.
iostat of the data disk shows that, during a checkpoint, reads/sec
drops from 25-30 to under 0.5. Writes/sec go up, from 40-45
before the checkpoint, to 80-85 during. My test program does
a mixture of 1/2 reads and 1/2 inserts, so it basically comes
to a stop during checkpoints.

What can I do about this? The variability in read and insert times is
really hurting us. I know how to make checkpoints less frequent. I
know that the background writer will show up in 7.5. But what can I do
now?

Does anyone have any experience in modifying the priority of the
checkpoint process itself, (re-nicing it)?

- Would this be effective in slowing down checkpointing, allowing
concurrent work to get done more quickly?

- Is this a dangerous thing to do?

- How would it be done? (From outside postgresql if possible, but
we'll tweak the source if necessary.)

Jack Orenstein


Re: Trying to minimize the impact of checkpoints (resend)

From
Doug McNaught
Date:
jao@geophile.com writes:

> Does anyone have any experience in modifying the priority of the
> checkpoint process itself, (re-nicing it)?

Unfortunately for you, re-nicing doesn't generally affect a process's
I/O rate--it's meant for CPU-bound processes.

It might be possible to add code to "throttle" the checkpoint process
(similar to what was done for VACUUM) but I don't know for sure...

-Doug

Re: Trying to minimize the impact of checkpoints (resend)

From
Tom Lane
Date:
jao@geophile.com writes:
> I'm using PostgreSQL 7.3.4 on RH9. Data and logs are on separate
> disks. (These are low-end IDE disks. That part of the problem
> is out of my control.)

> When a checkpoint occurs, all operations slow way, way down.

Not too surprising; you haven't got enough I/O bandwidth.

> Does anyone have any experience in modifying the priority of the
> checkpoint process itself, (re-nicing it)?

That would be a waste of time, because your problem is with I/O usage
not CPU usage, and nice doesn't impact I/O scheduling AFAIK.

You might be able to get somewhere by inserting intrapage delays into
the checkpoint write loop, similar to what's been done to VACUUM since
7.4.  (I have a todo item to do this for CVS tip, in fact.)  You'd not
want this to happen during a shutdown checkpoint, but for ordinary
checkpoints I don't believe there's any problem with spacing out the
writes.
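
A minimal sketch of what spacing out the writes could look like. This is
not the PostgreSQL checkpoint code: DemoPage, demo_write_page,
throttled_checkpoint, pages_per_batch and delay_ms are all hypothetical
names chosen for illustration, and the pause uses POSIX nanosleep(). The
loop pauses after every batch of page writes, and skips the pauses
entirely for a shutdown checkpoint.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for one buffer page; not a PostgreSQL type. */
typedef struct {
    int  page_no;
    bool dirty;
} DemoPage;

/* Hypothetical callback that writes one page; returns true on success. */
typedef bool (*write_page_fn)(DemoPage *page);

/* Demo writer: only counts calls, where real code would issue a write. */
static int demo_writes = 0;

static bool demo_write_page(DemoPage *page)
{
    (void) page;
    demo_writes++;
    return true;
}

/* Sleep for the given number of milliseconds. */
static void sleep_ms(long ms)
{
    struct timespec ts;

    ts.tv_sec  = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}

/*
 * Write every dirty page, pausing delay_ms after each batch of
 * pages_per_batch successful writes.  A shutdown checkpoint never
 * pauses, so shutdown is not held up.  Returns the pages written.
 */
static size_t throttled_checkpoint(DemoPage *pages, size_t npages,
                                   write_page_fn write_page,
                                   size_t pages_per_batch, long delay_ms,
                                   bool is_shutdown)
{
    size_t written = 0;

    assert(pages_per_batch > 0);
    assert(delay_ms >= 0);

    for (size_t i = 0; i < npages; i++) {
        if (!pages[i].dirty)
            continue;           /* clean pages need no write */
        if (!write_page(&pages[i]))
            break;              /* stop at the first failed write */
        pages[i].dirty = false;
        written++;

        /* Give concurrent readers a turn at the disk between batches. */
        if (!is_shutdown && delay_ms > 0 && written % pages_per_batch == 0)
            sleep_ms(delay_ms);
    }
    return written;
}
```

The trade-off in this sketch is that a larger delay_ms or a smaller
pages_per_batch leaves more of the disk to concurrent reads, but the
checkpoint takes correspondingly longer to finish.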

            regards, tom lane

Re: Trying to minimize the impact of checkpoints (resend)

From
"Gregory S. Williamson"
Date:
There is something wonky on this mail list. I did not send this.


-----Original Message-----
From:    Gregory S. Williamson
Sent:    Fri 6/11/2004 2:10 PM
To:    jao@geophile.com
Cc:    pgsql-general@postgresql.org
Subject:    Re: [GENERAL] Trying to minimize the impact of checkpoints (resend)
In-reply-to: <1086983714.40ca0e22a9cb4@geophile.com>
References: <1086983714.40ca0e22a9cb4@geophile.com>
Comments: In-reply-to jao@geophile.com message dated "Fri, 11 Jun 2004
    15:55:14 -0400"
Date: Fri, 11 Jun 2004 16:42:19 -0400
Message-ID: <18295.1086986539@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>

