Re: checkpointer continuous flushing - V18 - Mailing list pgsql-hackers

From Fabien COELHO
Subject Re: checkpointer continuous flushing - V18
Date
Msg-id alpine.DEB.2.10.1602210853540.3927@sto
In response to Re: checkpointer continuous flushing - V18  (Fabien COELHO <coelho@cri.ensmp.fr>)
List pgsql-hackers
Hallo Andres,

>>> [...] I do think that this whole writeback logic really does make sense 
>>> *per table space*,
>> 
>> Leads to less regular IO, because if your tablespaces are evenly sized
>> (somewhat common) you'll sometimes end up issuing sync_file_range's
>> shortly after each other.  For latency outside checkpoints it's
>> important to control the total amount of dirty buffers, and that's
>> obviously independent of tablespaces.
>
> I do not understand/buy this argument.
>
> The underlying IO queue is per device, and table spaces should be per device
> as well (otherwise what is the point?), so you should want to coalesce and
> "writeback" pages per device as well. Calls to sync_file_range on distinct
> devices should probably be issued more or less randomly, and should not
> interfere with one another.
>
> If you use just one context, the more table spaces there are, the smaller
> the performance gain, because there is less and less aggregation, and thus
> fewer sequential writes per device.
>
> So for me there should really be one context per tablespace. That would 
> suggest a hashtable or some other structure to keep and retrieve them, which 
> would not be that bad, and I think that it is what is needed.

Note: I think that an easy way to do that in the "checkpoint sort" patch
is simply to keep a WritebackContext in the CkptTsStatus structure, which
is per table space in the checkpointer.
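
To make the idea concrete, here is a rough, self-contained sketch of what
"one writeback context per table space" could look like. It is not the
patch's code: the Sketch* types, the field names and the threshold of 4
pending blocks are all made up for illustration, and the actual kernel hint
(e.g. sync_file_range on Linux) is replaced by a counter.

```c
#include <assert.h>

/* Hypothetical threshold: flush once this many blocks are pending. */
#define SKETCH_MAX_PENDING 4

/* Simplified stand-in for a writeback context (assumed layout). */
typedef struct SketchWritebackContext
{
    int      nr_pending;                  /* blocks waiting for a flush hint */
    int      nr_flushes;                  /* flush hints issued so far */
    unsigned pending[SKETCH_MAX_PENDING]; /* pending block numbers */
} SketchWritebackContext;

/* Simplified stand-in for the per-table-space checkpoint status. */
typedef struct SketchCkptTsStatus
{
    unsigned               tsId; /* table space identifier */
    SketchWritebackContext wb;   /* one context per table space */
} SketchCkptTsStatus;

/* Issue the pending writebacks of one context, if there are any. */
static void
sketch_issue_pending(SketchWritebackContext *wb)
{
    if (wb->nr_pending == 0)
        return;

    /*
     * A real implementation would sort the pending blocks, merge
     * neighbours, and hint the kernel here. This sketch only counts.
     */
    wb->nr_flushes++;
    wb->nr_pending = 0;
}

/* Record a written block in the context of its own table space. */
static void
sketch_schedule(SketchCkptTsStatus *ts, unsigned blockno)
{
    SketchWritebackContext *wb = &ts->wb;

    assert(wb->nr_pending < SKETCH_MAX_PENDING);
    wb->pending[wb->nr_pending++] = blockno;

    /* Only this table space's blocks are flushed together. */
    if (wb->nr_pending >= SKETCH_MAX_PENDING)
        sketch_issue_pending(wb);
}
```

With this layout, writes that alternate between two table spaces still
accumulate into two separate runs, so each device gets its own coalesced
flush instead of a mix of blocks from both.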

For bgwriter & backends it can wait: there is little "writeback" coalescing
because IO should be pretty random, so it does not matter much.

-- 
Fabien.


