Re: Column storage positions - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Column storage positions
Date
Msg-id 1172080790.3874.35.camel@silverbirch.site
Whole thread Raw
In response to Re: Column storage positions  ("Phil Currier" <pcurrier@gmail.com>)
Responses Re: Column storage positions  (Andrew Dunstan <andrew@dunslane.net>)
Re: Column storage positions  ("Florian G. Pflug" <fgp@phlo.org>)
List pgsql-hackers
On Wed, 2007-02-21 at 09:25 -0500, Phil Currier wrote:
> On 2/21/07, Alvaro Herrera <alvherre@commandprompt.com> wrote:
> > I'd expect the system being able to reoder the columns to the most
> > efficient order possible (performance-wise and padding-saving-wise),
> > automatically.  When you create a table, sort the columns to the most
> > efficient order; ALTER TABLE ADD COLUMN just puts the new columns at the
> > end of the tuple; and anything that requires a rewrite of the table
> > (ALTER TABLE ... ALTER TYPE for example; would be cool to have CLUSTER
> > do it as well; and do it on TRUNCATE also) again recomputes the most
> > efficient order.
> 
> That's exactly what I'm proposing.  On table creation, the system
> chooses an efficient column order for you. 

That's fairly straightforward and beneficial. I much prefer Alvaro's
approach rather than the storage position details originally described.
Moreover, you'd need to significantly re-write lots of ALTER TABLE and I
really don't think you want to go there.

There is a problem: If people do a CREATE TABLE and then issue SELECT *
they will find the columns in a different order. That could actually
break some programs, so it isn't acceptable in all cases. e.g. COPY
without a column-list assumes that the incoming data should be assigned
to the table columns in the same order as the incoming data file.

So if we do this, it should be controllable using a GUC: optimize_column_order = off (default) | on
This should be a USERSET, so different users can create tables in either
full control or optimised mode, as they choose.

It should be possible to do that with the minimum number of position
swaps, so that people who have ordered the columns according to usage
frequency would still get what they wanted.

>  The next time an ALTER
> TABLE operation forces a rewrite, the system would recompute the
> column storage order.  I hadn't thought of having CLUSTER also redo
> the storage order, but that seems safe since it takes an exclusive
> lock on the table.  I'm less sure about whether it's safe to do this
> during a TRUNCATE.

The GUC should apply to whenever/wherever this optimization occurs.

--  Simon Riggs              EnterpriseDB   http://www.enterprisedb.com




pgsql-hackers by date:

Previous
From: "Simon Riggs"
Date:
Subject: Re: HOT WIP Patch - version 2
Next
From: "Simon Riggs"
Date:
Subject: Re: HOT for PostgreSQL 8.3