Re: WAL prefetch - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: WAL prefetch |
Date | |
Msg-id | ab72fe1b-53d1-30fb-a2c6-a846ca3a9beb@2ndquadrant.com |
In response to | Re: WAL prefetch (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>) |
List | pgsql-hackers |
On 06/19/2018 04:50 PM, Konstantin Knizhnik wrote:
>
> On 19.06.2018 16:57, Ants Aasma wrote:
>> On Tue, Jun 19, 2018 at 4:04 PM Tomas Vondra
>> <tomas.vondra@2ndquadrant.com> wrote:
>>
>> Right. My point is that while spawning bgworkers probably helps, I
>> don't expect it to be enough to fill the I/O queues on modern storage
>> systems. Even if you start say 16 prefetch bgworkers, that's not going
>> to be enough for large arrays or SSDs. Those typically need way more
>> than 16 requests in the queue.
>>
>> Consider for example [1] from 2014 where Merlin reported how S3500
>> (Intel SATA SSD) behaves with different effective_io_concurrency
>> values:
>>
>> [1]
>> https://www.postgresql.org/message-id/CAHyXU0yiVvfQAnR9cyH=HWh1WbLRsioe=mzRJTHwtr=2azsTdQ@mail.gmail.com
>>
>> Clearly, you need to prefetch 32/64 blocks or so. Consider you may
>> have multiple such devices in a single RAID array, and that this
>> device is from 2014 (and newer flash devices likely need even deeper
>> queues).
>>
>> For reference, a typical datacenter SSD needs a queue depth of 128 to
>> saturate a single device. [1] Multiply that appropriately for RAID
>> arrays.So
>
> How it is related with results for S3500 where this is almost now
> performance improvement for effective_io_concurrency >8?
> Starting 128 or more workers for performing prefetch is definitely not
> acceptable...
>

I'm not sure what you mean by "almost now performance improvement", but I guess you meant "almost no performance improvement" instead?

If that's the case, it's not quite true - increasing the queue depth above 8 further improved the throughput by about 10-20% (both by duration and peak throughput measured by iotop).

But more importantly, this is just a single device - you typically have multiple of them in larger arrays, to get better capacity, performance and/or reliability.
So if you have 16 such drives, and you want to send at least 8 requests to each, suddenly you need at least 128 requests.

And as pointed out before, the S3500 is about a 5-year-old device (it was introduced in Q2/2013). On newer devices the difference is usually way more significant / the required queue depth is much higher.

Obviously, this is a somewhat simplified view, ignoring various details (e.g. that there may be multiple concurrent queries, each sending I/O requests - what matters is the combined number of requests, of course). But I don't think this makes a huge difference.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
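
The queue-depth arithmetic in the message above (16 drives, at least 8 requests each) can be written out as a small sketch. The function names and the idea of a worker keeping several requests outstanding are hypothetical and only illustrate the point; they are not taken from the thread or from PostgreSQL.

```python
import math


def required_in_flight(devices: int, per_device_depth: int) -> int:
    """Total concurrent I/O requests needed to keep every device busy.

    Simplified model: every device should see `per_device_depth`
    requests at once, and requests are spread evenly across devices.
    """
    if devices < 1 or per_device_depth < 1:
        raise ValueError("devices and per_device_depth must be positive")
    return devices * per_device_depth


def workers_needed(total_requests: int, requests_per_worker: int = 1) -> int:
    """Number of prefetch workers needed to sustain `total_requests`.

    Assumption (hypothetical): each worker keeps at most
    `requests_per_worker` requests outstanding. The default of 1 models
    a worker that issues one synchronous read at a time.
    """
    if requests_per_worker < 1:
        raise ValueError("requests_per_worker must be positive")
    return math.ceil(total_requests / requests_per_worker)


if __name__ == "__main__":
    total = required_in_flight(devices=16, per_device_depth=8)
    print(total)                      # 128
    print(workers_needed(total))      # 128 workers at one request each
    print(workers_needed(total, 32))  # 4 workers at 32 requests each
```

Under these assumptions, one request per worker leads straight to the 128 workers called unacceptable earlier in the thread, while a worker that can keep many requests outstanding needs far fewer processes.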