Re: logical column order and physical column order - Mailing list pgsql-hackers
From | David Rowley |
---|---|
Subject | Re: logical column order and physical column order |
Date | |
Msg-id | CAApHDvo1KyFpneFXXQEF94GMVAd=Z07A0Srh9T8nczMoOqx7UA@mail.gmail.com Whole thread Raw |
In response to | Re: logical column order and physical column order (Alvaro Herrera <alvherre@2ndquadrant.com>) |
Responses |
Re: logical column order and physical column order
|
List | pgsql-hackers |
On Mon, Nov 4, 2013 at 3:14 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
David Rowley escribió:> I've just been looking at how alignment of columns in tuples can make theThis has been discussed at length previously, and there was a design
> tuple larger than needed.
proposed to solve this problem. See these past discussions:
http://archives.postgresql.org/pgsql-hackers/2006-12/msg00983.php
http://archives.postgresql.org/pgsql-hackers/2007-02/msg01235.php
http://archives.postgresql.org/pgsql-hackers/2008-11/msg01680.php
I started work on this, and managed to get parts of it to work. While
doing so I realized that it was quite a lot more hideous than I had
originally expected. I published a tree at github:
https://github.com/alvherre/postgres/tree/column
Thanks for the archive links... I read these last night and pulled out some key pieces of information from some of the posts.
I should say that I've not dived into the code too much to see how hard it would be, but my, perhaps naive original idea would have just be to add 1 column to pg_attribute to store the logical order and have attnum store the physical order... This would have meant that at least only the following places would have to take into account the change.
1. pg_dump needs to display columns in logical order both for create tables and copy/insert statements.
2. INSERT INTO table values( ... ) (i.e without column names) needs to look at logical order.
3. create table like <table>
4. backup and restore using copy
5. select * expand to column names
And of lesser importance as I'd assume it would just be a change in an ORDER BY clause in their queries of pg_attribute
1. Display in clients... psql Pg Admin
I thought the above would have been doable and I did wonder what all the fuss was about relating to bugs in the code where it could use the logical number instead of attnum.
On reading of the posts last night I can see that the idea was to add not 1 but 2 new fields to pg_attribute. One was for the physical order and one for the logical order and at first I didn't really understand as I thought attnum would always be the physical order. I didn't really know before this that attnum was static... I did some tests were I dropped one of the middle columns out of a table and then rewrote the table with cluster and I see that the pg_attribute record is kept even though the remains of the column values have been wiped out by the rewrite... Is this done because things like indexes, foreign keys and sequences reference the {attrelid,attnum} ? if so then I see why the 2 extra columns are needed and I guess that's where the extra complications come from.
So now I'm wondering, with my freshly clustered table which I dropped one of the middle columns from before the cluster... my pg_attributes look something like:
relname | attname | attnum
---------+------------------------------+--------
dropcol | tableoid | -7
dropcol | cmax | -6
dropcol | xmax | -5
dropcol | cmin | -4
dropcol | xmin | -3
dropcol | ctid | -1
dropcol | one | 1
dropcol | ........pg.dropped.2........ | 2
dropcol | three | 3
and I would imagine since the table has just been clustered that the columns are stored like {..., ctid, one,three}
In this case how does Postgresql know that attnum 3 is the 2nd user column in that table? Unless I have misunderstood something then there must be some logic in there to skip dropped columns and if so then could it not just grab the "attphynum" at that location? then just modify the 1-5 places listed above to sort on attlognum?
Regards
David Rowley
pgsql-hackers by date: