Thread: IO scheduler recommendation

IO scheduler recommendation

From
"AB_ba#"
Date:
Hello,

I searched the complete PostgreSQL documentation but didn't find anything about IO scheduler recommendations.
What does PostgreSQL recommend?
Which IO scheduler is best when the data is hosted on NFS?


--
Thanks and Regards
ANUP BHARTI

Re: IO scheduler recommendation

From
Laurenz Albe
Date:
AB_ba# wrote:
> I searched the complete PostgreSQL Documentation but didn't get anything with respect to IO scheduler recommendation.
> What is being recommended by PostgreSQL ?

There is no clear recommendation.

I personally have seen workloads where changing from "cfq" to "deadline"
or "noop" improved performance by a factor of 4, but on many systems "cfq"
seems to do at least as well as the others.

I believe that it depends a lot on your hardware configuration and
your workload, and you are best advised to run a realistic load test.
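
Before running such a test it helps to know which scheduler each device is actually using. The following is a minimal sketch, assuming a Linux system that exposes the setting under /sys/block/<device>/queue/scheduler; the function names are made up for illustration.

```python
from pathlib import Path


def parse_scheduler(text):
    """Split a sysfs scheduler line into (active, available).

    The kernel lists the available schedulers on one line and wraps the
    active one in square brackets, e.g. "noop deadline [cfq]".
    """
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def current_schedulers(sysfs_root="/sys"):
    """Map each block device name to its (active, available) schedulers."""
    result = {}
    for path in sorted(Path(sysfs_root).glob("block/*/queue/scheduler")):
        device = path.parent.parent.name
        result[device] = parse_scheduler(path.read_text())
    return result
```

Calling current_schedulers() returns a dictionary keyed by device name. An NFS mount will not appear in it, because it has no entry under /sys/block.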

> Which is the best IO scheduler considering the Data is hosted on NFS?

No idea - probably depends on what is behind the NFS.

Make sure to use hard, fg mounts.
If you can, use "jumbo frames" so that an 8KB block can fit into
a single IP frame.
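
A rough back-of-the-envelope check of the jumbo frame argument, as a sketch only: it assumes IPv4 and TCP headers without options (20 bytes each), the common MTU values 1500 and 9000, and it ignores the RPC and NFS headers that also travel with each block.

```python
import math

IP_HEADER = 20   # bytes, IPv4 header without options (assumption)
TCP_HEADER = 20  # bytes, TCP header without options (assumption)


def packets_needed(block_size, mtu):
    """Number of IP packets needed to carry block_size bytes of payload."""
    payload_per_packet = mtu - IP_HEADER - TCP_HEADER
    return math.ceil(block_size / payload_per_packet)


for mtu in (1500, 9000):
    print(f"MTU {mtu}: {packets_needed(8192, mtu)} packet(s)")
# MTU 1500: 6 packet(s)
# MTU 9000: 1 packet(s)
```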

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com



Re: IO scheduler recommendation

From
"AB_ba#"
Date:
Thanks, Laurenz.
I am surprised to learn that there is no official recommendation from PostgreSQL.

On Mon, Jan 21, 2019 at 9:14 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> There is no clear recommendation.

--
Thanks and Regards
ANUP BHARTI

Re: IO scheduler recommendation

From
Tim Cross
Date:
I suspect the main reason there are no official recommendations is that setting the IO scheduler is a low-level optimisation, getting very close to a 'micro-optimisation' compared to other areas of optimisation. Some even consider playing at this level to be somewhat of a black art - primarily because it is very complicated and dependent on many, many variables (hardware, use profile, data profile etc). For these reasons, I doubt there is a clear 'winner' for PG (especially as PG doesn't do direct IO to disk like Oracle does). I suspect the fact you're using NFS will overshadow any differences between the IO scheduling algorithms as well. Regardless, the only reliable way to select the best algorithm would be extensive benchmarking using realistic data and usage profiles.
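
One way to make such benchmark runs repeatable is to script them. This is only a sketch: it assumes pgbench is installed and on the PATH, and the database name, client count, duration and script file are placeholders. The built-in pgbench workload is not a realistic usage profile, so a custom script passed with -f is likely to be closer to what is meant here.

```python
import subprocess


def pgbench_command(dbname, clients=8, seconds=300, script=None):
    """Build a pgbench command line (all values here are placeholders)."""
    cmd = ["pgbench", "-c", str(clients), "-j", str(clients), "-T", str(seconds)]
    if script is not None:
        cmd += ["-f", script]  # custom SQL script instead of the built-in workload
    cmd.append(dbname)
    return cmd


def run_once(dbname, **kwargs):
    """Run one benchmark and return the pgbench report as text."""
    completed = subprocess.run(
        pgbench_command(dbname, **kwargs),
        capture_output=True, text=True, check=True,
    )
    return completed.stdout
```

Running the same command several times per scheduler setting, on the real data set, gives numbers that can actually be compared.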

On Tue, 22 Jan 2019 at 16:38, AB_ba# <bharti.anup@gmail.com> wrote:
Thanks Laurenz
Surprise to know that there are no official recommendation from PostgreSQL.


--
regards,

Tim

--
Tim Cross

Re: IO scheduler recommendation

From
Ron
Date:
Isn't the use of NFS pretty high on the "things not to do with Postgres" list?

On 1/22/19 12:13 AM, Tim Cross wrote:
> I suspect the fact your using NFS will overshadow any differences with the IO scheduling algorithm as well.

--
Angular momentum makes the world go 'round.

Re: IO scheduler recommendation

From
"AB_ba#"
Date:
It all depends on your experience. There are a number of deployments that
are working great with NFS.

Also, IO scheduler settings apply to block devices, not to NFS.

On Tue, Jan 22, 2019 at 12:42 PM Ron <ronljohnsonjr@gmail.com> wrote:
Isn't the use of NFS pretty high on the "things not to do with Postgres" list?


--
Thanks and Regards
ANUP BHARTI

Re: IO scheduler recommendation

From
Laurenz Albe
Date:
Ron wrote:
> Isn't the use of NFS pretty high on the "things not to do with Postgres" list?

There is a strong sentiment against PostgreSQL on NFS, voiced by people
I tend to trust, but the only detailed horror story I have heard is about
a "bg" mounted NFS file system that wasn't mounted yet when PostgreSQL
was started, and the startup script decided to run "initdb", during which
the mount finally succeeded.
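
A small sketch of how one might scan /etc/fstab for the kind of mount described above. It assumes the usual whitespace-separated fstab layout; the host name "filer" and the paths in the sample are hypothetical. It only flags entries that explicitly ask for "bg" or "soft" and says nothing about entries that rely on defaults.

```python
NFS_TYPES = {"nfs", "nfs4"}


def risky_nfs_mounts(fstab_text):
    """Return (mount_point, problems) for NFS entries using "bg" or "soft"."""
    findings = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4 or fields[2] not in NFS_TYPES:
            continue
        options = set(fields[3].split(","))
        problems = sorted(options & {"bg", "soft"})
        if problems:
            findings.append((fields[1], problems))
    return findings


# Hypothetical sample; real input would come from /etc/fstab.
sample = """\
# device            mount point   type  options   dump pass
filer:/export/pg    /pgdata       nfs   hard,fg   0 0
filer:/export/old   /pgdata_old   nfs   soft,bg   0 0
/dev/sda1           /             ext4  defaults  0 1
"""
print(risky_nfs_mounts(sample))
# [('/pgdata_old', ['bg', 'soft'])]
```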

If you browse the archives you will read that "NFS is unreliable" and
"it depends on the implementation, but Linux' implementation is bad"
and such, but without any technical detail ever being mentioned.

That may not be the fault of the people who propagate these opinions -
perhaps they experienced database corruption using NFS, but don't know
exactly which part of NFS caused the problem, or how.

There are some other voices that say that it works just fine, if you
configure it properly.

The feeling I get from all this is that it is an experimental field,
and everybody who wants to use it would be well advised to run tests
covering all kinds of crash scenarios under load.

I'd still be curious to know if there is someone who can supply
technical details about what *exactly* is wrong with NFS.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com



Re: IO scheduler recommendation

From
Arni Kromić
Date:
On 21/01/2019 12.04, AB_ba# wrote:
> Which is the best IO scheduler considering the Data is hosted on NFS?

AFAIK, IO scheduler cannot be set for NFS; it's not a real block storage device.
-- 
Kind Regards,
Arni Kromić

Re: IO scheduler recommendation

From
Imre Samu
Date:
The Phoronix website has published some performance tests.

For example (Dec 2018), "Linux 4.20 I/O Scheduler Benchmarks On NVMe SSD Storage":
"BFQ also picked up wins on the Samsung 970 EVO SSD when running the PostgreSQL database server."

And (2017), "Linux 4.12 I/O Scheduler Benchmarks: BFQ, Kyber, Etc":
"The default CFQ I/O scheduler on this SATA 3.0 SSD system remained the fastest for this PostgreSQL benchmark."

Imre


AB_ba# <bharti.anup@gmail.com> wrote (on Mon, 21 Jan 2019, 12:05):
> What is being recommended by PostgreSQL ?

Re: IO scheduler recommendation

From
Purav Chovatia
Date:
Hi,

The IO scheduler is more of an OS thing than a DB thing. Once the DB (be it PostgreSQL or any other) has submitted IO to the OS, it is up to the OS to make sure it is written in the fastest possible manner.

In RHEL 6, CFQ was the default, and starting with RHEL 7, Deadline is the default (I don’t know exactly why the default was changed). However, for both RHEL 6 and 7 it is suggested to use Deadline for DB environments. What I have separately read is that if the server is a VM, then NOOP is the best scheduler, because you don’t want both the VM and the host scheduling IO, which can make them work against each other. In that case, let the VM do nothing and let the host do the scheduling.

HTH

On Mon, 28 Jan 2019 at 6:05 PM, Arni Kromić <arni.kromic@bios-ict.hr> wrote:
> AFAIK, IO scheduler cannot be set for NFS; it's not a real block storage device.

Re: IO scheduler recommendation

From
Arni Kromić
Date:
On 28/01/2019 14.15, Imre Samu wrote:
> Phoronix website has some performance testing.

CFQ is supposed to be the best for servers anyway, because it is designed to ensure that all services get a fair (hence the name) share of I/O transfer time, no matter what. Deadline is supposed to be better for interactive (e.g. desktop) use because it guarantees better response times at the expense of overall throughput.

The above-mentioned schedulers are variants of the Elevator algorithm, designed to optimize data handling on a rotating-platter HDD with moving heads. The NOOP scheduler, as its name suggests, does nothing. It is supposed to be best for devices which are NOT rotating-platter, moving-head HDDs. Its use cases are SSDs, which don't have mechanical movements to optimize; RAID controllers, which themselves control their physical drives; and VMs, where the host OS is responsible for that.
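
To illustrate the difference, here is a toy sketch that compares head travel when requests are served in arrival order (roughly what NOOP does) with a simplified one-pass elevator ordering. The sector numbers and starting head position are illustrative only, and the real CFQ and Deadline schedulers add fairness queues and expiry timers that are not modelled here.

```python
def seek_distance(order, head):
    """Total head travel when serving requests in the given order."""
    total = 0
    for sector in order:
        total += abs(sector - head)
        head = sector
    return total


def fifo_order(requests):
    """Serve requests exactly as they arrived."""
    return list(requests)


def elevator_order(requests, head):
    """Sweep upward from the head position, then reverse and sweep down."""
    up = sorted(r for r in requests if r >= head)
    down = sorted((r for r in requests if r < head), reverse=True)
    return up + down


REQUESTS = [98, 183, 37, 122, 14, 124, 65, 67]  # illustrative sector numbers
HEAD = 53                                        # illustrative start position

print(seek_distance(fifo_order(REQUESTS), HEAD))            # 640
print(seek_distance(elevator_order(REQUESTS, HEAD), HEAD))  # 299
```

On a device without moving heads the travel distance costs nothing, which is the reasoning behind doing no reordering there.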

However, all that is moot for NFS, which the OP is using, because it is a network protocol, not a physical device. The question of schedulers is irrelevant because a scheduler CAN'T be set for NFS.
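
A quick way to see this on a Linux client, as a sketch with hypothetical names: a mount source such as "filer:/export/pg" is not a device node, so there is no /sys/block/<device>/queue/scheduler file behind it. The check below only handles whole-disk device names such as /dev/sda; partitions and device-mapper volumes would need extra lookup that is left out.

```python
import os


def scheduler_file(device, sysfs_root="/sys"):
    """Path where the kernel exposes the scheduler of a block device."""
    return os.path.join(sysfs_root, "block", device, "queue", "scheduler")


def can_set_scheduler(mount_source, sysfs_root="/sys"):
    """True if mount_source is a /dev/<disk> node with a scheduler file.

    NFS sources look like "server:/export" and are rejected immediately.
    """
    if not mount_source.startswith("/dev/"):
        return False
    device = mount_source[len("/dev/"):]
    return os.path.isfile(scheduler_file(device, sysfs_root))


print(can_set_scheduler("filer:/export/pg"))  # False
```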
-- 
Kind Regards,
Arni Kromić