Re: checkpointer continuous flushing - Mailing list pgsql-hackers

From Andres Freund
Subject Re: checkpointer continuous flushing
Date
Msg-id 20151112170540.GG27477@alap3.anarazel.de
In response to Re: checkpointer continuous flushing  (Fabien COELHO <coelho@cri.ensmp.fr>)
Responses Re: checkpointer continuous flushing  (Fabien COELHO <coelho@cri.ensmp.fr>)
List pgsql-hackers
On 2015-11-12 17:44:40 +0100, Fabien COELHO wrote:
> 
> >>To fix it, ISTM that it is enough to hold a "do not close lock" on the file
> >>while a flush is in progress (a short time) that would prevent mdclose from
> >>doing its stuff.
> >
> >Could you expand a bit more on this? You're suggesting something like a
> >boolean in the vfd struct?
> 
> Basically yes, I'm suggesting a mutex in the vfd struct.

I can't see that being ok. I mean what would that thing even do? VFD
isn't shared between processes, and if we get a smgr flush we have to
apply it, or risk breaking other things.

> >* my laptop, 16 GB Ram, 840 EVO 1TB as storage. With 2GB
> > shared_buffers. Tried checkpoint timeouts from 60 to 300s.
> 
> Hmmm. This is quite short.

Indeed. I'd never do that in a production scenario myself. But
nonetheless it showcases a problem.


> >Well, you can't easily sort bgwriter/backend writes stemming from cache
> >replacement. Unless your access patterns are entirely sequential the
> >data in shared buffers will be laid out in a nearly entirely random
> >order.  We could try sorting the data, but with any reasonable window,
> >for many workloads the likelihood of actually achieving much with that
> >seems low.
> 
> Maybe the sorting could be shared with others so that everybody uses the
> same order?
> 
> That would suggest having one global sorting of buffers, maybe maintained
> by the checkpointer, which could be used by all processes that need to scan
> the buffers (in file order), instead of scanning them in memory order.

Uh. Cache replacement is based on an approximated LRU, you can't just
remove that without serious regressions.
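
For readers unfamiliar with the term, an "approximated LRU" can be sketched as a
clock sweep over per-buffer usage counters. The following is only a toy model
under stated assumptions, not PostgreSQL code: the class name ClockSweep, its
fields, and the cap of 5 on the usage counter are all chosen for illustration.
What it shows is that the victim is picked by the position of the clock hand
and by recent use, with no regard for where the page lives in a file.

```python
class ClockSweep:
    """Toy clock-sweep buffer pool: an approximation of LRU (illustrative only)."""

    MAX_USAGE = 5  # cap on the per-buffer usage counter (assumed for this sketch)

    def __init__(self, nbuffers):
        self.pages = [None] * nbuffers  # page held by each buffer slot
        self.usage = [0] * nbuffers     # per-slot usage counter
        self.where = {}                 # page -> slot index
        self.hand = 0                   # current position of the clock hand

    def access(self, page):
        """Touch a page; return the page evicted to make room, or None."""
        slot = self.where.get(page)
        if slot is not None:
            # Hit: remember that this buffer was used recently.
            self.usage[slot] = min(self.usage[slot] + 1, self.MAX_USAGE)
            return None
        # Miss: advance the hand, decrementing counters, until one is zero.
        while True:
            slot = self.hand
            self.hand = (self.hand + 1) % len(self.pages)
            if self.usage[slot] == 0:
                break
            self.usage[slot] -= 1
        victim = self.pages[slot]
        if victim is not None:
            del self.where[victim]
        self.pages[slot] = page
        self.usage[slot] = 1
        self.where[page] = slot
        return victim


pool = ClockSweep(3)
for p in "abc":
    pool.access(p)       # fill the three slots
pool.access("a")         # "a" is touched again, so it gets a second chance
print(pool.access("d"))  # → b
```

Replacing this policy with "evict in file order" would throw away exactly the
recency information the counters encode.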


> >>Hmmm. The shorter the timeout, the more likely the sorting NOT to be
> >>effective
> >
> >You mean, as evidenced by the results, or is that what you'd actually
> >expect?
> 
> What I would expect...

I don't see why then? If you very quickly write lots of data, the OS
will continuously flush dirty data to the disk, in which case sorting is
rather important?
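
A rough way to see the effect of write ordering, and of the sort window
mentioned earlier, is to count how many writes do not directly follow the
previous block. This is a simplified sketch, not a benchmark: the helper names
count_seeks and sort_in_windows, the 10,000 block numbers, and the idea of
using "non-adjacent transitions" as a proxy for random I/O are all assumptions
made for illustration.

```python
import random


def count_seeks(blocks):
    """Count writes whose block number does not directly follow the previous one."""
    return sum(1 for prev, cur in zip(blocks, blocks[1:]) if cur != prev + 1)


def sort_in_windows(blocks, window):
    """Sort each consecutive group of `window` writes independently."""
    out = []
    for i in range(0, len(blocks), window):
        out.extend(sorted(blocks[i:i + window]))
    return out


rng = random.Random(0)
dirty = list(range(10_000))
rng.shuffle(dirty)  # stand-in for the near-random order of buffers in memory

# Window 1 is the unsorted order; the last window is a full sort.
for window in (1, 16, 256, len(dirty)):
    print(window, count_seeks(sort_in_windows(dirty, window)))
```

Only the full sort is guaranteed to produce a fully sequential stream here
(zero non-adjacent transitions); how much a small window helps depends on how
the dirty blocks are distributed.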


Greetings,

Andres Freund


