Re: Checksums, state of play - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Checksums, state of play
Date
Msg-id CA+TgmoZ==ZAZpTLRx-54FB9gd87QxGKOuXnHhAE2P8Wg-DoPvw@mail.gmail.com
In response to Re: Checksums, state of play  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: Checksums, state of play  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
On Tue, Mar 6, 2012 at 10:40 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Tue, Mar 6, 2012 at 2:25 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> For the reasons stated above, I believe pd_tli is less useful than
>> pd_pagesize_version.  I fear that if we repurpose pd_pagesize_version,
>> we're going to make things very difficult for people who want to write
>> block-inspection tools, like pg_filedump or pageinspect.  Right now,
>> it's possible to look at that offset within the block and know for
>> certain what page version you're dealing with.  If we repurpose it to
>> hold checksum data, that will no longer be possible.  Unlike pd_tli,
>> pd_pagesize_version is validated every time we read a block.
>
> We've not changed the page format in 5 years. I really can't see what
> the value of having a constant stored on every data block, especially
> since you're now saying that we *shouldn't* bump the constant for this
> change. Surely if we are keeping the pd_pagesize_version field its
> obvious that we should increment it? If not, why the insistence on
> keeping the field if we aren't using it for its stated purpose?
>
> Do you know of any PostgreSQL variant that can set this byte range to
> different values? If so, I'd suggest we just declare the field "user
> defined" or some such so that others can use it for different things
> as well and then use pd_tli.
>
> IMHO if we keep use pd_tli but pd_pagesize_version then we should increment it.

The fact that we haven't changed the page format in 5 years is a good
thing, and I hope that we won't change it very often because it will
require whole-cluster rewrites to take full advantage of whatever
features are made available by the version bump, which is darn
painful.  However, I'm pretty sure that eventually we're going to want
to bump it.  Aside from checksums, the most imminent thing I can think
of that might cause us to do that is this idea regarding XID
wraparound:

http://archives.postgresql.org/message-id/4F2FA541.8040300@enterprisedb.com

However, even if we don't do that or we find some way to do it without
bumping the page version, it seems likely to me that something else
will come up eventually.  The size of our page header doesn't thrill
me, but the one byte we've allocated to storing the version is only a
minor contributor and pretty cheap insurance against future needs.
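
For anyone following along who hasn't looked at the header layout lately, here is a rough sketch of how that 16-bit field carries both the page size and the one-byte layout version. The function names are made up for illustration and are not the macros in bufpage.h; the only thing taken from the real format is that the size occupies the high byte and the version the low byte.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative only: hypothetical helper names, not the bufpage.h macros.
 * The field stores (page size | layout version) in 16 bits.
 */
uint16_t pack_pagesize_version(uint16_t page_size, uint8_t version)
{
    /* The page size must be a multiple of 256 so that its low byte is free. */
    assert((page_size & 0x00FF) == 0);
    return (uint16_t) (page_size | version);
}

/* High byte: the page size in bytes. */
uint16_t unpack_page_size(uint16_t pagesize_version)
{
    return (uint16_t) (pagesize_version & 0xFF00);
}

/* Low byte: the page layout version. */
uint8_t unpack_layout_version(uint16_t pagesize_version)
{
    return (uint8_t) (pagesize_version & 0x00FF);
}
```

A block-inspection tool can read those two bytes at a fixed offset and know what it is looking at, which is exactly the property that would be lost if the field held checksum bits instead.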

As to whether we should increment pd_pagesize_version, I'm not sure
quite what you were saying about that (I think you may have an extra
or missing word there), but I don't think it's necessary here.  I
believe we feel free to assign new flag bits without bumping the page
size version, so we could define PD_HAS_CHECKSUM without doing so.
Maybe your point is that we're changing the meaning of pd_tli and it
seems ugly to do that without bumping the page version, but I
guess my point is that we're not changing it incompatibly.  We really
only need to bump the page version for changes where a newer version
of the server would otherwise misinterpret an older page, which isn't
a problem in this case because pd_tli is basically dead already.

And, on a more practical level, Tom argued on the other thread that if
we have a page upgrade facility, then we ought to store the minimum
page version for every relation in a pg_class column, so we can keep
track of when all pages of the older format are gone.  That's
infrastructure that this patch doesn't really need, and we can avoid
having to build it by steering clear of the page versioning issue
altogether, viewing this instead as an enhancement of the existing
page format that doesn't break compatibility with older releases.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

