logical column position - Mailing list pgsql-hackers

From Neil Conway
Subject logical column position
Date
Msg-id 87wu9zndq6.fsf@mailbox.samurai.com
Whole thread Raw
Responses Re: logical column position  (Bruce Momjian <pgman@candle.pha.pa.us>)
Re: logical column position  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
I'd like to add a new column to pg_attribute that specifies the
attribute's "logical position" within its relation. The idea here is
to separate the logical order of the columns in a relation from the
on-disk storage of the relation's tuples. This allows us to easily &
quickly change column order, add an additional column before or after
an existing column, etc.

At present, attnum basically does three things: identifies an column
within a relation, indicates which columns are system columns, and
defines the order of a relation's columns. I'd like to move this last
functionality into a separate pg_attribute column named "attpos" (or
"attlogicalpos"):
        - when the table is created, attnum == attpos. System columns          have attpos < 0, as with attnum. At no
pointwill two          columns of the same relation have the same attpos.
 
        - when returning output to the client and no column ordering          is implied by the query (e.g. "SELECT *
..."),we sort the          columns in ascending attpos order.
 
        - when storing a tuple on disk, we don't consider attpos
        - if we want to change the order of the column's in a          relation, we can do so merely by updating
pg_attribute;no          changes to the on-disk storage of the relation should be          necessary
 

A few notes:
 (a) ISTM this should also apply to COPY TO and COPY FROM if the user     didn't supply a column list. Is this
reasonable?It would break     dumps of the table's contents, but then again, dumps aren't     guaranteed to remain
validover arbitrary changes to the table's     meta-data.
 
 (b) Using the above scheme that attnum == attpos initially, there     won't be any gaps in the sequence of attpos
values.That means     that if, for example, we want to move the column in position 50     to position 1, we'll need to
changethe position's of all the     columns in positions [1..49] (and suffer the resulting MVCC     bloat in
pg_attribute).Changing the column order is hardly a     performance critical operation, so that might be acceptable.
 
     If we want to avoid this, one easy (but arguably unclean) way to     do so would be to make the initial value of
attpos== attnum *     1000, and make attpos an int4 rather than an int2. Then, we can     do most column reordering
operationswith only a single     pg_attribute update -- in the worst-case that enough     re-orderings are done that we
overflowthe 999 "padding"     positions, we can just fall-back to doing multiple pg_attribute     updates. Is this
worthdoing, and/or is there a better way to     achieve the same effect?
 
 (c) Do I need to consider inheritance?

Comments are welcome.

-Neil



pgsql-hackers by date:

Previous
From: Adam Witney
Date:
Subject: 7.4 make failure on OSX
Next
From: Neil Conway
Date:
Subject: logical column position