Thread: Initdb --data-checksums by default
Hello everyone!
In today's Big Data era, silent data corruption is becoming a more and more serious issue to worry about. With an uncorrectable read error rate of ~10^-15 on a multi-terabyte disk, bit rot is a real issue.
I think data checksumming must now be on by default. Anyone who doesn't care about their data can turn the option off manually.
What do you think about defaulting --data-checksums in initdb?
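For reference, the switch in question has to be chosen at cluster creation time (at the time of this thread it could not be changed afterwards). A quick sketch, with illustrative paths:

```shell
# Create a cluster with data-page checksums enabled
# (-k / --data-checksums is an initdb-time decision).
initdb --data-checksums -D /var/lib/pgsql/data

# Verify: pg_controldata reports the checksum version (0 = disabled),
pg_controldata /var/lib/pgsql/data | grep -i checksum

# and a running server exposes it as a read-only GUC (9.4+):
psql -c "SHOW data_checksums;"
```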
--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 4/20/2016 12:43 AM, Alex Ignatov wrote:
> In today's Big Data era, silent data corruption is becoming a more and more serious issue to worry about. With an uncorrectable read error rate of ~10^-15 on a multi-terabyte disk, bit rot is a real issue.

Aren't those uncorrectable errors detected by the disk hardware? That's not 'silent'.
What's the rate of uncorrectable AND undetected read errors?
--
john r pierce, recycling bits in santa cruz
On Wed, Apr 20, 2016 at 4:43 PM, Alex Ignatov <a.ignatov@postgrespro.ru> wrote:
> Hello everyone!
> In today's Big Data era, silent data corruption is becoming a more and more serious issue to worry about. With an uncorrectable read error rate of ~10^-15 on a multi-terabyte disk, bit rot is a real issue.
> I think data checksumming must now be on by default. Anyone who doesn't care about their data can turn the option off manually.
>
> What do you think about defaulting --data-checksums in initdb?

Not sure that most users deploying Postgres by default are ready to pay the price of data checksums in their default deployments. People using it are already using the -k switch, so it may actually be a trap to switch the default.

--
Michael
On 20.04.2016 10:47, John R Pierce wrote:
> Aren't those uncorrectable errors detected by the disk hardware? That's not 'silent'.
> What's the rate of uncorrectable AND undetected read errors?

The uncorrectable read error rate is ~10^-15 to 10^-14. These errors stay undetected and uncorrectable.
Also, how can you associate a disk block with a file block without knowing the filesystem structure? And not every piece of hardware has this checksum feature under the hood.
--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 20.04.2016 10:58, Michael Paquier wrote:
> Not sure that most users deploying Postgres by default are ready to
> pay the price of data checksums in their default deployments. People
> using it are already using the -k switch, so it may actually be a trap
> to switch the default.

WAL also has a performance cost, yet we have it on by default at the minimal level. We also have CRCs on WAL, which likewise carry a performance hit.

--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Hi,

On Wed, 2016-04-20 at 10:43 +0300, Alex Ignatov wrote:
> [snip]
> What do you think about defaulting --data-checksums in initdb?

I think this should be discussed in -hackers, right?

Regards,
--
Devrim GÜNDÜZ
Principal Systems Engineer @ EnterpriseDB: http://www.enterprisedb.com
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
Twitter: @DevrimGunduz , @DevrimGunduzTR
On 20.04.2016 11:29, Devrim Gündüz wrote:
> I think this should be discussed in -hackers, right?

Maybe you're right, but I want to know what people think about it before I write to -hackers.

--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Wednesday 20 April 2016 at 10:33:14, Alex Ignatov <a.ignatov@postgrespro.ru> wrote:
> On 20.04.2016 11:29, Devrim Gündüz wrote:
>> [snip]
>> I think this should be discussed in -hackers, right?
> Maybe you're right, but I want to know what people think about it before
> I write to -hackers.
-1 on changing the default.
10^15 bytes ≈ 1000 TB, which isn't very common yet. Those who have that much probably are aware of the risk and have enabled checksums already.
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
On 20.04.2016 11:40, Andreas Joseph Krogh wrote:
> -1 on changing the default.
> 10^15 bytes ≈ 1000 TB, which isn't very common yet. Those who have that much probably are aware of the risk and have enabled checksums already.

It is per bit, not per byte, so it is ~100 TB. We work with enterprises whose WAL creation rate is ~4 GB per minute, so it takes at most ~100 days before you hit bit rot and a real probability of silent data corruption.
Also don't forget that this is a theoretical limit, and Google's field studies tell us that HDDs and SSDs are not as reliable as the manufacturers claim. So this 10^-15 can easily be much higher.
--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
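Alex's arithmetic can be sanity-checked in a couple of lines. The 4 GB/min WAL rate is the figure quoted above; independent bit flips at the vendor-quoted rate is a simplifying assumption:

```shell
# Expected time to the first undetected bit flip, assuming independent
# bit errors at the vendor-quoted uncorrectable bit error rate.
awk 'BEGIN {
  ber          = 1e-15                     # errors per bit
  bits_per_day = 4 * 1024^3 * 8 * 60 * 24  # ~4 GB of WAL per minute
  printf "days until one expected flip: %.1f\n", 1 / (ber * bits_per_day)
}'
```

At that rate the first expected flip arrives in roughly three weeks, so if anything the "~100 days" figure above errs on the optimistic side.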
On Wednesday 20 April 2016 at 11:02:31, Alex Ignatov <a.ignatov@postgrespro.ru> wrote:
> It is per bit, not per byte, so it is ~100 TB. We work with enterprises whose WAL creation rate is ~4 GB per minute, so it takes at most ~100 days before you hit bit rot and a real probability of silent data corruption.
> Also don't forget that this is a theoretical limit, and Google's field studies tell us that HDDs and SSDs are not as reliable as the manufacturers claim. So this 10^-15 can easily be much higher.
Ok, but still - the case you're describing isn't the common case for PG users. Enterprises like that certainly could use --data-checksums; I'm not arguing against that, just that it shouldn't be the default setting.
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
On 20.04.2016 12:10, Andreas Joseph Krogh wrote:
> Ok, but still - the case you're describing isn't the common case for PG users. Enterprises like that certainly could use --data-checksums; I'm not arguing against that, just that it shouldn't be the default setting.

Why do you think that common PG users don't care about their data? And why then do we have wal_level=minimal, fsync=on, and similar defaults?
--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Wednesday 20 April 2016 at 11:22:33, Alex Ignatov <a.ignatov@postgrespro.ru> wrote:
[snip]
> Why do you think that common PG users don't care about their data?

Did I say that?

> And why then do we have wal_level=minimal, fsync=on, and similar defaults?

To make certain guarantees that data is durable by default.

What I'm saying is that everything is a compromise, cost/benefit. The universe might explode tomorrow, but the chances are slim, so there's no use preparing for it.
Those who care enough probably use checksums, battery-backed RAID, etc.
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
On 20.04.2016 12:27, Andreas Joseph Krogh wrote:
> What I'm saying is that everything is a compromise, cost/benefit. The universe might explode tomorrow, but the chances are slim, so there's no use preparing for it.
> Those who care enough probably use checksums, battery-backed RAID, etc.
WAL is not durable without a CRC in it. Yet we have a CRC in WAL by default, and not in the data files.
--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
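The point of a per-page checksum is exactly this class of error. PostgreSQL itself uses CRC-32C for WAL records and an FNV-1a-derived algorithm for data pages; the POSIX `cksum` CRC below merely illustrates the detection property being discussed:

```shell
# Build a deterministic 8 kB "page" (PostgreSQL's default block size)
# and record its checksum.
yes | head -c 8192 > page
good=$(cksum < page)

# Simulate bit rot: overwrite one byte in the middle of the page.
printf 'Z' | dd of=page bs=1 seek=4321 conv=notrunc 2>/dev/null

# On re-read, the recomputed checksum no longer matches the stored one.
test "$(cksum < page)" != "$good" && echo "corruption detected"
```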
On 4/20/2016 1:00 AM, Alex Ignatov wrote:
> The uncorrectable read error rate is ~10^-15 to 10^-14. These errors stay undetected and uncorrectable.
What are your units here? Disk I/O is done in blocks, and the CRCs and trellis codes used by disk drives for error detection and correction are applied on a whole-block basis.
So is that one undetected error per 10^14 blocks read, or what?
And what about file systems like ZFS that already do their own CRCs?
--
john r pierce, recycling bits in santa cruz
On 04/20/2016 02:22 AM, Alex Ignatov wrote:
> Why do you think that common PG users don't care about their data?
> And why then do we have wal_level=minimal, fsync=on, and similar defaults?

Because Postgres will not run without WAL files, and that setting is the minimum you can have. With it, though, you lose the ability to do archiving/streaming replication, which it can be argued is a data-safety issue.

--
Adrian Klaver
adrian.klaver@aklaver.com
Alex Ignatov wrote:
> I think data checksumming must now be on by default. Anyone who doesn't care about their data can turn the option off manually.

In principle I support the idea of turning data checksums on by default, but can you provide some numbers on how it affects performance on various workloads? That's a critical point in the discussion.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 20.04.2016 16:58, Alvaro Herrera wrote:
> In principle I support the idea of turning data checksums on by default,
> but can you provide some numbers on how it affects performance on
> various workloads? That's a critical point in the discussion.

Right now I am working on those tests, on various media (HDD, SSD, RAM disk) and various workloads. As soon as the results are ready I'll provide my numbers.

--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
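For anyone wanting to reproduce such a comparison, a minimal pgbench A/B sketch follows (cluster paths, port, scale, and duration are illustrative choices, not Alex's actual setup):

```shell
# A/B-compare throughput with and without data-page checksums.
for mode in plain checksums; do
    opts=""
    [ "$mode" = checksums ] && opts="--data-checksums"

    initdb $opts -D /tmp/pg_$mode
    pg_ctl -D /tmp/pg_$mode -o "-p 5440" -w start

    pgbench -p 5440 -i -s 100 postgres          # initialize test tables
    pgbench -p 5440 -c 8 -j 4 -T 300 postgres   # 5-minute read/write run

    pg_ctl -D /tmp/pg_$mode -w stop
done
```

Comparing the reported tps between the two runs gives the overhead figure for that particular medium and workload.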
On Wed, Apr 20, 2016 at 3:43 AM, Alex Ignatov <a.ignatov@postgrespro.ru> wrote:
> What do you think about defaulting --data-checksums in initdb?
I think that ZFS storing my database files already does this and can correct for it using replicated copies, so why do I need a second layer of checksums?
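For completeness, that ZFS protection can be confirmed and exercised like this (the pool/dataset names are illustrative):

```shell
# Checksumming is on by default for ZFS datasets; confirm it:
zfs get checksum tank/pgdata

# Actively read and verify every block against its checksum; with
# redundancy (mirror/raidz), bad copies are repaired from good ones:
zpool scrub tank
zpool status -v tank    # shows scrub progress and any checksum errors
```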
On 20.04.2016 23:28, Vick Khera wrote:
> I think that ZFS storing my database files already does this and can correct for it using replicated copies, so why do I need a second layer of checksums?

MS Windows doesn't have ZFS support. Neither do AIX or z/OS. No major commercial Linux distro ships ZFS support either. Yes, you can compile it and use it in production, but...
But PG runs on all of the above OSes, yet has checksums off by default. That's the point, and it has nothing to do with the existence of ZFS or any other checksumming filesystem. The question is only about the performance hit when you turn checksums on, and I am now in the process of testing that...
--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Thu, Apr 21, 2016 at 9:00 AM, Alex Ignatov <a.ignatov@postgrespro.ru> wrote:
> MS Windows doesn't have ZFS support. Neither do AIX or z/OS. No major commercial Linux distro ships ZFS support either. Yes, you can compile it and use it in production, but...
> But PG runs on all of the above OSes, yet has checksums off by default. That's the point, and it has nothing to do with the existence of ZFS or any other checksumming filesystem. The question is only about the performance hit when you turn checksums on, and I am now in the process of testing that...
I don't care about those platforms, so changing the default is just making more work for me. :)
On 20 April 2016 at 14:43, Alex Ignatov <a.ignatov@postgrespro.ru> wrote:
> [snip]
> What do you think about defaulting --data-checksums in initdb?

I think --data-checksums should default to on.

Databases created 'thoughtlessly' should have safe defaults. Operators creating databases with care can elect to disable checksums if they are redundant in their environment, if they cannot afford the overhead, or if they consider their data low-value enough not to want to pay the overhead.

If the performance impact is deemed unacceptable, perhaps the ability to turn checksums off on an existing database is easily doable (a one-way operation).

--
Stuart Bishop <stuart@stuartbishop.net>
http://www.stuartbishop.net/
On 21.04.2016 20:26, Vick Khera wrote:
> I don't care about those platforms, so changing the default is just making more work for me. :)

I see ;). By the way, according to my tests, turning this option on costs only 1-2% in throughput.
--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
> On Apr 22, 2016, at 3:21 AM, Stuart Bishop <stuart@stuartbishop.net> wrote:
> [snip]
> I think --data-checksums should default to on.
>
> Databases created 'thoughtlessly' should have safe defaults.

+1

Bob Lunney
Lead Data Architect
MeetMe, Inc.