Re: Load Distributed Checkpoints, take 3 - Mailing list pgsql-patches

From Gregory Stark
Subject Re: Load Distributed Checkpoints, take 3
Msg-id 878xa6tk4j.fsf@oxford.xeocode.com
In response to Re: Load Distributed Checkpoints, take 3  (Greg Smith <gsmith@gregsmith.com>)
Responses Re: Load Distributed Checkpoints, take 3
List pgsql-patches
"Greg Smith" <gsmith@gregsmith.com> writes:

> If you write them twice, so what? You didn't even get to that point as an
> option until all the important stuff was taken care of and the system was
> near idle.

Well, even if it's near idle you're still occupying the i/o system for a few
milliseconds. If someone else comes in with a block request at that time you've
extended their response time by that much.

> The elimination of the all-scan background writer means that true hot and dirty
> spots in the buffer cache, like popular index blocks on a heavily updated table
> that never get a zero usage_count, are never going to be written out other than
> as part of the checkpoint process.

If they're really popular blocks on a heavily updated table then writing them
out unnecessarily really doesn't buy us anything.

The case where the extra writes help us is when the blocks weren't really
popular but we never got around to writing them out, and then when we do need
to write them out the system's busy. In that case we wasted a chance to write
them out when the system was more idle.

But don't forget that we're still going through the OS's buffers. That will
smooth out a lot of the i/o we generate anyway. Just because Postgres is idle
doesn't mean the OS isn't busy flushing the buffers we wrote out while Postgres
was busy.

> That's OK for now, but I'd like it to be the case that one day the
> database's I/O scheduling would eventually get to those, in order to
> optimize performance in the kind of bursty scenarios I've been mentioning
> lately.

I think the feeling is that the bursty scenarios are really corner cases.
Heikki described a case where you're dirtying tons of blocks without
triggering any WAL. That can happen but it's pretty peculiar.

I can imagine a scenario where you have a system that's very busy for 60s and
then idle for 60s repeatedly. And for some reason you configure a
checkpoint_timeout on the order of 20m or so (assuming you're travelling
precisely 60mph).

In that scenario bgwriter's lru scan has to fight to keep up for 60s while
it's mostly writing out dirty pages that it could have flushed out during the
idle time. Effectively you're only making use of half the i/o bandwidth since
bgwriter doesn't do any work for half the duty cycle.
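
To put rough numbers on that, here is a back-of-the-envelope sketch in Python.
It's a toy steady-state model, not anything taken from the bgwriter code: the
function names and the 6000-pages-per-cycle figure are assumptions made up for
illustration, and it ignores the OS buffering mentioned above.

```python
# Toy model of the 60s-busy / 60s-idle scenario. All names and numbers
# are illustrative assumptions, not PostgreSQL internals.

def duty_cycle(busy_s, idle_s):
    """Fraction of wall-clock time in which a busy-only writer does any work."""
    return busy_s / (busy_s + idle_s)


def required_write_rate(pages_per_cycle, busy_s, idle_s, writes_during_idle):
    """Average pages/s the writer must sustain to flush one cycle's dirty pages.

    If the writer only runs while the system is busy, it has busy_s seconds
    to do the work; if it may also run during idle time, it has the whole cycle.
    """
    window_s = (busy_s + idle_s) if writes_during_idle else busy_s
    return pages_per_cycle / window_s


if __name__ == "__main__":
    busy_s, idle_s = 60, 60
    pages = 6000  # hypothetical number of pages dirtied per busy period

    print("duty cycle:", duty_cycle(busy_s, idle_s))
    print("busy-only writer, pages/s:",
          required_write_rate(pages, busy_s, idle_s, writes_during_idle=False))
    print("busy+idle writer, pages/s:",
          required_write_rate(pages, busy_s, idle_s, writes_during_idle=True))
```

Under those assumptions the busy-only writer has to sustain 100 pages/s during
the busy minute, where a writer that also used the idle minute would need only
50 pages/s on average. That's the "half the i/o bandwidth" in a nutshell.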

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

