Thread: FAQ (disk space)

FAQ (disk space)

From
Clarence Gardner
Date:
>> Could you please tell me if there's a limit in the amount of records that
>> the database can handle?

>See the FAQ: http://www.postgresql.org/docs/faqs/FAQ.html#4.5

Speaking of the FAQ, in the next question (#4.6), should
"NULLs are stored in bitmaps" say "NULL indicators are stored in bitmaps"?


Re: FAQ (disk space)

From
Bruce Momjian
Date:
Clarence Gardner wrote:
> >> Could you please tell me if there's a limit in the amount of records that
> >> the database can handle?
>
> >See the FAQ: http://www.postgresql.org/docs/faqs/FAQ.html#4.5
>
> Speaking of the FAQ, in the next question (#4.6), should
> "NULLs are stored in bitmaps" say "NULL indicators are stored in bitmaps"?

I changed it to:

    NULLs are stored _as_ bitmaps

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: FAQ (disk space)

From
Einar Indridason
Date:
On Sat, Jan 24, 2004 at 09:29:12PM -0500, Bruce Momjian wrote:
> Clarence Gardner wrote:
> > >> Could you please tell me if there's a limit in the amount of records that
> > >> the database can handle?
> >
> > >See the FAQ: http://www.postgresql.org/docs/faqs/FAQ.html#4.5
> >
> > Speaking of the FAQ, in the next question (#4.6), should
> > "NULLs are stored in bitmaps" say "NULL indicators are stored in bitmaps"?
>
> I changed it to:
>
>     NULLs are stored _as_ bitmaps

Eh... good morning folks.  I have been lurking on the PostgreSQL lists
for a while.  Now when I read this, a question arises.

Does postgres calculate some sort of a "checksum" over a row?

I mean... storing whether a field is NULL or not in a single bit
seems slightly risky (especially when we consider how much of the
hardware out there is of marginal quality).

Would it be worth it to calculate some sort of a checksum over a row, and
store that checksum along with the row?

Cheers,
--
Einar Indridason
einari@f-prot.com

Re: FAQ (disk space)

From
Martijn van Oosterhout
Date:
On Mon, Jan 26, 2004 at 10:48:55AM +0000, Einar Indridason wrote:
> On Sat, Jan 24, 2004 at 09:29:12PM -0500, Bruce Momjian wrote:
> > I changed it to:
> >
> >     NULLs are stored _as_ bitmaps
>
> Eh... good morning folks.  I have been lurking on the PostgreSQL lists
> for a while.  Now when I read this, a question arises.
>
> Does postgres calculate some sort of a "checksum" over a row?

No.

> I mean... storing whether a field is NULL or not in a single bit
> seems slightly risky (especially when we consider how much of the
> hardware out there is of marginal quality).

The visibility status of a row is also stored in a single bit. So single bit
errors may cause rows to become visible or invisible. Bit errors in length
fields will render a whole row unreadable. A single bit error in the page
header can make the entire page unreadable. This is not something you can
sensibly protect against.
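
As a rough illustration of the null-bit case raised above (not PostgreSQL's
actual tuple layout), a per-row bitmap and the effect of one flipped bit might
be sketched in Python like this. The helper names `pack_null_bitmap` and
`is_null`, and the convention that a set bit means "value present", are
assumptions made for the example.

```python
def pack_null_bitmap(row):
    """Return a bytearray with one bit per column; a set bit means 'value present'."""
    bitmap = bytearray((len(row) + 7) // 8)
    for i, value in enumerate(row):
        if value is not None:
            bitmap[i // 8] |= 1 << (i % 8)
    return bitmap


def is_null(bitmap, column):
    """True if the bit for `column` is clear, i.e. the column is treated as NULL."""
    return not (bitmap[column // 8] & (1 << (column % 8)))


# Hypothetical three-column row; the middle column is NULL.
row = ("alice", None, 42)
bitmap = pack_null_bitmap(row)
nulls_before = [is_null(bitmap, i) for i in range(len(row))]

# Simulate a single-bit hardware error on the bit for column 1.
bitmap[0] ^= 1 << 1
nulls_after = [is_null(bitmap, i) for i in range(len(row))]
```

After the flip, column 1 no longer reads as NULL, and nothing in the bitmap
itself reveals that anything went wrong.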

> Would it be worth it to calculate some sort of a checksum over a row, and
> store that checksum along with the row?

There has been discussion about checksumming entire pages, and AFAIK checksums
are used in the WAL; I just don't think they are in the main data store.
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> (... have gone from d-i being barely usable even by its developers
> anywhere, to being about 20% done. Sweet. And the last 80% usually takes
> 20% of the time, too, right?) -- Anthony Towns, debian-devel-announce

Re: FAQ (disk space)

From
Tom Lane
Date:
Martijn van Oosterhout <kleptog@svana.org> writes:
> On Mon, Jan 26, 2004 at 10:48:55AM +0000, Einar Indridason wrote:
>> Would it be worth it to calculate some sort of a checksum over a row, and
>> store that checksum along with the row?

> There has been discussion about checksumming entire pages, and AFAIK checksums
> are used in the WAL; I just don't think they are in the main data store.

This has been discussed, as Martijn says, and I believe the consensus
was that the benefits wouldn't exceed the costs.  Note that a checksum
does not magically prevent errors, it just means that you will detect
errors and refuse to access potentially-corrupt data.  In practice there
are other ways to detect errors (e.g., cross-checking of page header
fields) that seem to get the job done for us.  Also, even when you do
have a corrupted page, having the system refuse to touch it at all is
not necessarily the behavior you want --- you're going to want to see
what data you can extract.
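
The "detect, not prevent" point can be sketched in Python under some loud
assumptions: a hypothetical 8192-byte page whose first four bytes hold a CRC-32
of the rest, with made-up helpers `write_page` and `read_page`. This is not
PostgreSQL's page format, only an illustration of what a page checksum buys.

```python
import struct
import zlib

PAGE_SIZE = 8192  # assumed page size for this sketch


def write_page(payload: bytes) -> bytes:
    """Build a page: 4-byte CRC-32 of the body, then the zero-padded body."""
    body = payload.ljust(PAGE_SIZE - 4, b"\x00")
    return struct.pack("<I", zlib.crc32(body)) + body


def read_page(page: bytes) -> bytes:
    """Return the body, or raise ValueError if the stored CRC does not match."""
    (stored,) = struct.unpack("<I", page[:4])
    body = page[4:]
    if zlib.crc32(body) != stored:
        raise ValueError("page checksum mismatch")
    return body


page = write_page(b"row data")
intact_body = read_page(page)

# Flip one bit in the body: the error is detected, but not located or repaired.
damaged = bytearray(page)
damaged[100] ^= 0x01
try:
    read_page(bytes(damaged))
    detected = False
except ValueError:
    detected = True
```

Note that `read_page` rejects the whole damaged page, including the bytes that
are still intact, which is the all-or-nothing behavior described above.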

            regards, tom lane

Re: FAQ (disk space)

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> This has been discussed, as Martijn says, and I believe the consensus
> was that the benefits wouldn't exceed the costs.  Note that a checksum
> does not magically prevent errors, it just means that you will detect
> errors and refuse to access potentially-corrupt data.

Well, there are error-correcting codes (ECC) that allow correcting errors as
well. But I don't see how that would help. You would have to check on every
single memory access, since it's likely memory that will cause single-bit
errors, not disk. Disk is more likely to give entire bad blocks.

I think the moral is that if you are afraid of single bit errors corrupting
data then you should probably spec out a server with ECC RAM.
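
For anyone curious what "correcting" means here, a textbook Hamming(7,4) code
can be sketched in Python as below. This is only an illustration of single-bit
correction; real ECC memory does this in hardware and typically uses wider
codes, and the function names are made up for the example.

```python
def hamming74_encode(d):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    # Positions 1..7: p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(c):
    """Correct up to one flipped bit and return the 4 data bits."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]  # parity over positions 1, 3, 5, 7
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]  # parity over positions 2, 3, 6, 7
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]  # parity over positions 4, 5, 6, 7
    syndrome = s1 + 2 * s2 + 4 * s3  # 1-based position of the bad bit, 0 if none
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


data = [1, 0, 1, 1]
codeword = hamming74_encode(data)

# Flip each of the 7 bits in turn and decode the damaged codeword.
recovered = []
for i in range(7):
    damaged = list(codeword)
    damaged[i] ^= 1
    recovered.append(hamming74_decode(damaged))
```

Every single-bit flip decodes back to the original data, at the cost of three
extra bits per four data bits and a check on every read.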

--
greg