WIP: relation metapages - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | WIP: relation metapages |
Date | |
Msg-id | CA+Tgmoa13Ou22KU5bYT6hwArqH=cXRNEph4qyOfaQ6qqM4JfbQ@mail.gmail.com |
Responses | Re: WIP: relation metapages |
List | pgsql-hackers |
Here's a WIP patch implementing metapages for all relations, somewhat along the lines previously discussed:

http://archives.postgresql.org/pgsql-hackers/2012-05/msg00860.php

It turns out that doing this for indexes was pretty easy and didn't obviously break anything; doing it for heaps was harder and broke a lot of stuff. If you apply the patch as attached here, you'll find that we fail a whole bunch of regression tests, mostly due to plan changes. It seems that having N+1 pages in the heap changes the optimal way to do... everything. Of course, the extra page need not be included in seq-scans, so you'd think this was mostly a matter of adjusting the costing functions to reduce the number of pages by 1 for costing purposes. So far, though, I haven't been able to hack the costing to make the plan changes go away, which may be a sign that I've broken something else. I can't seem to make Merge Append work at all, which is maybe a better sign that I've broken something. If you want to see the patch pass regression tests, hack heap_create_storage not to emit a metapage for heaps and all the regression test failures disappear.

What I'm really looking for at this stage of the game is feedback on the design decisions I made. The intention here is that it should be possible to read old-format heaps and indexes transparently, but that when we create or rewrite a relation, we add a new-style metapage. For all index types except GiST, this is really just a format change for the metapage that already existed: the new data that gets stored for all relation types is added at the beginning of the page, just following the page header, and then the AM-specific stuff is moved further down the page. For GiST, it means adding a metapage that wasn't there before, but that went smoothly too. For some AMs, I had to rejigger the WAL-logging a little; review of those changes would be good.
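To make the page layout described above concrete, here is a minimal sketch in C. It is not taken from the patch: every name, field, and size in it (SketchRelMeta, the 24-byte header stand-in, the magic number used to tell new-style metapages from old-format pages) is a hypothetical chosen for illustration only. It shows only the ordering: page header first, then the data common to all relation types, then the AM-specific area further down the page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Everything below is hypothetical and NOT taken from the patch. */
#define SKETCH_PAGE_SIZE   8192         /* assumed block size */
#define SKETCH_HEADER_SIZE 24           /* stand-in for the standard page header */
#define SKETCH_META_MAGIC  0x4D455441u  /* assumed marker for new-style metapages */

/* Data stored for every relation type, placed right after the page header. */
typedef struct SketchRelMeta
{
    uint32_t magic;           /* identifies a new-style metapage */
    uint32_t version;         /* format version of this common area */
    uint32_t relation_id;     /* relation the page belongs to */
    uint32_t am_data_offset;  /* where the AM-specific data starts */
} SketchRelMeta;

static_assert(SKETCH_HEADER_SIZE + sizeof(SketchRelMeta) <= SKETCH_PAGE_SIZE,
              "common metadata must fit on the page");

/* Zero the page and write the common metadata just after the header. */
void
sketch_meta_init(unsigned char *page, uint32_t relation_id)
{
    SketchRelMeta meta;

    memset(page, 0, SKETCH_PAGE_SIZE);
    meta.magic = SKETCH_META_MAGIC;
    meta.version = 1;
    meta.relation_id = relation_id;
    meta.am_data_offset = (uint32_t) (SKETCH_HEADER_SIZE + sizeof(SketchRelMeta));
    memcpy(page + SKETCH_HEADER_SIZE, &meta, sizeof(meta));
}

/* Copy out the common metadata; returns 1 if the magic matches, else 0. */
int
sketch_meta_read(const unsigned char *page, SketchRelMeta *out)
{
    memcpy(out, page + SKETCH_HEADER_SIZE, sizeof(*out));
    return out->magic == SKETCH_META_MAGIC;
}

/* Start of the AM-specific area, or NULL for a page without the marker. */
unsigned char *
sketch_meta_am_data(unsigned char *page)
{
    SketchRelMeta meta;

    if (!sketch_meta_read(page, &meta))
        return NULL;
    return page + meta.am_data_offset;
}
```

In this sketch, an AM would call sketch_meta_am_data() to find its own metadata rather than assuming it sits immediately after the page header, and a NULL result stands in for "this is an old-format page, fall back to the old layout".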
The basic idea behind the WAL-logging changes is that we don't want to have to try to reconstruct what the metapage should have been during recovery (indeed, we can't), so we just log an image of the page instead.

For heaps, I refactored things so that heap_create() is no longer used for indexes. Instead, index_create() calls RelationBuildLocalRelation() directly. This required moving a little bit of logic from heap_create() into RelationBuildLocalRelation(), but it seems like it may fit better there anyway. That means that heap_create() can now assume that it's creating a heap and not an index. This refactoring might be worth pulling out of the patch and committing separately, since I think the result is actually simpler and cleaner than what we're doing now; but it's a minor point in any case.

I put the new metapage code in src/backend/access/common/metapage.c, but I don't have a lot of confidence that that's the appropriate location for it. Suggestions are appreciated.

I am pretty sure that clustering a relation will cause it to end up with the wrong relation ID in its metapage afterwards. Since nothing relies on that information at this point, this shouldn't break anything, but it needs to be fixed eventually.

I think the thing I'm most worried about is the plan changes that result from adding heap metapages. Suggestions on what to do about that from a costing perspective would be particularly appreciated.

Thanks,

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company