Re: [HACKERS] Happy column dropping - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: [HACKERS] Happy column dropping
Date
Msg-id 388CD59D.74A3C03@tm.ee
Whole thread Raw
In response to Re: [HACKERS] Happy column dropping  (The Hermit Hacker <scrappy@hub.org>)
List pgsql-hackers
The Hermit Hacker wrote:
> 
> > Because the column is still in the table, just invisible after removing
> > from pg_attribute.  You need to remove the column from the heap, and
> > that requires creating a new version of the table.  Vacuum moves tuples
> > but does not make them shorter.
> 
> That I understand ... excuse my ignorance, but what would it take to
> do that?  The way I envision a table 'on disk':
> 
> col1col2col3col4col5|col1col2col3col4col5|col1col2col3col4col5|

I recently did a small python script to salvage deleted tuples and I can say 
that current docs on the layout of storage are very incomplete ant partially 
wrong.

> Basically, you have X tuples per page, where a page is 8192bytes,
> correct?  If you were to remove, let's say, col2 out of the table, why
> can't you do:
> 
> lock table
> read page 1 into memory
> rewrite page1 to disk as:
> col1col3col4col5|col1col3col4col5|col1col3col4col5|
> add one to page and goto 'read page n to memory'
> unlock table
> 
> I'm making an assumption here ... first one being that each 'tuples' has
> some sort of endoftuple marker in the table ...

you do have startoftuple/startoffreespace (as a offset inside the page)
but getting at the starts of col2 and col3 is not that easy - you must use 
all the accessor functions from pg_attribute and count null-bits (and 
shift null-bitmap) if present 

That's why I actually like the idea of just hiding the column (and setting 
it to DEFAULT NULL) - you don't automatically reclaim space, but you don't 
need much any extra space either. And it's very fast.

And as a practical person I like the current implementation too, mainly
because 
it's there and it does not break anything, at least when you don't use it ;)

As it is not a feature anyone would use in scripts very often (except Don
Baccus ;)

> If we're removing a column, the resultant 'page size' from the modified
> page is going to be smaller then the original, so I would think it would
> be a relatively simple thing, considering that its a read/re-write from
> the same part of the 'on disk file' ...
> 
> ... and it wouldn't require 2X the space used by the table ...

But a system crash while doing it would do really bad things, not to mention 
the fact that it bypasses storage manager making future changes to storage 
managers very hard.

VACUUMs bypassing of storage manager is understandable as it is a part of 
storage manager and not a general SQL thing - a garbage-collecting
all-in-memory 
signing-while-working storage manager will not need vacuum, analyse it may
need, 
perhaps.

-----------------
Hannu


pgsql-hackers by date:

Previous
From: Don Baccus
Date:
Subject: Re: [HACKERS] Well, then you keep your darn columns
Next
From: Hannu Krosing
Date:
Subject: Re: [HACKERS] Some notes on optimizer cost estimates