Thread: BETWEEN Node & DROP COLUMN

BETWEEN Node & DROP COLUMN

From
"Christopher Kings-Lynne"
Date:
Hi All,

I have given up working on the BETWEEN node.  It got to the stage where I
realised I was really out of my depth!  Rod Taylor has indicated an interest
in the problem and I have sent him my latest patch, so hopefully he'll be
able to crack it.

So instead, I've taken up with the DROP COLUMN crusade.  It seems that the
following are the jobs that need to be done:

* Add attisdropped to pg_attribute - Looking for takers for this one, otherwise I'll look into it.
* Fill out AlterTableDropColumn - I've done this, with the assumption that attisdropped exists.  It sets
attisdropped to true, drops the column default and renames the column.
(Plus does all other normal ALTER TABLE checks)
* Modify parser and other places to ignore dropped columns - This is also up for grabs.
* Modify psql and pg_dump to handle dropped columns - I've done this.

Once the above is done, we have a working drop column implementation.

* Modify all other interfaces, JDBC, etc. to handle dropped cols. - I think this can be suggested to the relevant
developersonce the above
 
is committed!

* Modify VACUUM to add a RECLAIM option to reduce on disk table size. - This is out of my league, so it's up for grabs

I have approached a couple of people off-list to see if they're interested
in helping, so please post to the list if you intend to work on something.

It has also occurred to me that once drop column exists, users will be able
to change the type of their columns manually (ie. create a new col, update
all values, drop the old col).  So, there is no reason why this new
attisdropped field shouldn't allow us to implement a full ALTER TABLE/SET
TYPE sort of feature - cool huh?

Chris





Re: BETWEEN Node & DROP COLUMN

From
Rod Taylor
Date:
> all values, drop the old col).  So, there is no reason why this new
> attisdropped field shouldn't allow us to implement a full ALTER TABLE/SET
> TYPE sort of feature - cool huh?


I've not looked in a while, but the column rename code did not account
for issues in foreign keys, etc.  Those should be easier to ferret out
soon, but may not be so nice to change yet.

It should also be noted that an ALTER TABLE / SET TYPE implemented with
the above idea with run into the 2x diskspace issue as well as take
quite a while to process.





Re: BETWEEN Node & DROP COLUMN

From
"Christopher Kings-Lynne"
Date:
> I've not looked in a while, but the column rename code did not account
> for issues in foreign keys, etc.  Those should be easier to ferret out
> soon, but may not be so nice to change yet.

Which is probably a good reason for us to offer it as an all-in-one command,
rather than expecting them to do it manually...

> It should also be noted that an ALTER TABLE / SET TYPE implemented with
> the above idea with run into the 2x diskspace issue as well as take
> quite a while to process.

I think that if the 'SET TYPE' operation is ever to be rollback-able, it
will need to use 2x diskspace.  If it's overwritten in place, there's no
chance of fallback...  I think that a DBA would choose to use the command
knowing full well what it requires?  Better than not offering them the
choice at all!

Chris






Re: BETWEEN Node & DROP COLUMN

From
Rod Taylor
Date:
> > It should also be noted that an ALTER TABLE / SET TYPE implemented with
> > the above idea with run into the 2x diskspace issue as well as take
> > quite a while to process.
> 
> I think that if the 'SET TYPE' operation is ever to be rollback-able, it
> will need to use 2x diskspace.  If it's overwritten in place, there's no
> chance of fallback...  I think that a DBA would choose to use the command
> knowing full well what it requires?  Better than not offering them the
> choice at all!

True, but if we did the multi-version thing in pg_attribute we may be
able to coerce to the right type on the way out making it a high speed
change.





Re: BETWEEN Node & DROP COLUMN

From
Hannu Krosing
Date:
On Wed, 2002-07-03 at 14:32, Rod Taylor wrote:
> > > It should also be noted that an ALTER TABLE / SET TYPE implemented with
> > > the above idea with run into the 2x diskspace issue as well as take
> > > quite a while to process.
> > 
> > I think that if the 'SET TYPE' operation is ever to be rollback-able, it
> > will need to use 2x diskspace.  If it's overwritten in place, there's no
> > chance of fallback...  I think that a DBA would choose to use the command
> > knowing full well what it requires?  Better than not offering them the
> > choice at all!
> 
> True, but if we did the multi-version thing in pg_attribute we may be
> able to coerce to the right type on the way out making it a high speed
> change.

If I understand you right, i.e. you want to do the conversion at each
select(), then the change is high speed but all subsequent queries using
it will pay a a speed penalty, not to mention added complexity of the
whole thing.

I don't think that making changes quick autweights added  slowness and
complexity - changes are meant to be slow ;)

The real-life analogue to the proposed scenario would be adding one
extra wheel next to each existing one in a car in order to make it
possible to change tyres while driving - while certainly possible nobody
actually does it.

---------------
Hannu







Re: BETWEEN Node & DROP COLUMN

From
Bruce Momjian
Date:
Christopher Kings-Lynne wrote:
> Hi All,
>
> I have given up working on the BETWEEN node.  It got to the stage where I
> realised I was really out of my depth!  Rod Taylor has indicated an interest
> in the problem and I have sent him my latest patch, so hopefully he'll be
> able to crack it.
>
> So instead, I've taken up with the DROP COLUMN crusade.  It seems that the
> following are the jobs that need to be done:

Great crusade!

> * Add attisdropped to pg_attribute
>   - Looking for takers for this one, otherwise I'll look into it.

I can do this for you.  Just let me know when.

> * Fill out AlterTableDropColumn
>   - I've done this, with the assumption that attisdropped exists.  It sets
> attisdropped to true, drops the column default and renames the column.
> (Plus does all other normal ALTER TABLE checks)
> * Modify parser and other places to ignore dropped columns
>   - This is also up for grabs.

As I remember, Hiroshi's drop column changed the attribute number to a
special negative value, which required lots of changes to track.
Keeping the same number and just marking the column as dropped is a big
win.  This does push the coding out the client though.

> * Modify psql and pg_dump to handle dropped columns
>   - I've done this.
> 
> Once the above is done, we have a working drop column implementation.
> 
> * Modify all other interfaces, JDBC, etc. to handle dropped cols.
>   - I think this can be suggested to the relevant developers once the above
> is committed!
> 
> * Modify VACUUM to add a RECLAIM option to reduce on disk table size.
>   - This is out of my league, so it's up for grabs

Will UPDATE on a row set the deleted column to NULL?  If so, the
disk space used by the column would go away over time.  In fact, a
simple:UPDATE tab SET col = col;VACUUM;

would remove the data stored in the deleted column;  no change to VACUUM
needed.

> I have approached a couple of people off-list to see if they're interested
> in helping, so please post to the list if you intend to work on something.
> 
> It has also occurred to me that once drop column exists, users will be able
> to change the type of their columns manually (ie. create a new col, update
> all values, drop the old col).  So, there is no reason why this new
> attisdropped field shouldn't allow us to implement a full ALTER TABLE/SET
> TYPE sort of feature - cool huh?

Yep.
--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 




Re: BETWEEN Node & DROP COLUMN

From
Bruce Momjian
Date:
Hiroshi Inoue wrote:
> > As I remember, Hiroshi's drop column changed the attribute number to a
> > special negative value, which required lots of changes to track.
> 
> ??? What do you mean by *lots of* ?

Yes, please remind me.  Was your solution renumbering the attno values? 
I think there are fewer cases to fix if we keep the existing attribute
numbering and just mark the column as deleted.  Is this accurate?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 




Re: BETWEEN Node & DROP COLUMN

From
Hiroshi Inoue
Date:

Bruce Momjian wrote:
> 
> Christopher Kings-Lynne wrote:
> > Hi All,
> >
> > I have given up working on the BETWEEN node.  It got to the stage where I
> > realised I was really out of my depth!  Rod Taylor has indicated an interest
> > in the problem and I have sent him my latest patch, so hopefully he'll be
> > able to crack it.
> >
> > So instead, I've taken up with the DROP COLUMN crusade.  It seems that the
> > following are the jobs that need to be done:
> 
> Great crusade!
> 
> > * Add attisdropped to pg_attribute
> >   - Looking for takers for this one, otherwise I'll look into it.
> 
> I can do this for you.  Just let me know when.
> 
> > * Fill out AlterTableDropColumn
> >   - I've done this, with the assumption that attisdropped exists.  It sets
> > attisdropped to true, drops the column default and renames the column.
> > (Plus does all other normal ALTER TABLE checks)
> > * Modify parser and other places to ignore dropped columns
> >   - This is also up for grabs.
> 
> As I remember, Hiroshi's drop column changed the attribute number to a
> special negative value, which required lots of changes to track.

??? What do you mean by *lots of* ?

regards,
Hiroshi Inouehttp://w2422.nsk.ne.jp/~inoue/




Re: BETWEEN Node & DROP COLUMN

From
Bruce Momjian
Date:
Hiroshi Inoue wrote:
> Bruce Momjian wrote:
> > 
> > Hiroshi Inoue wrote:
> > > > As I remember, Hiroshi's drop column changed the attribute number to a
> > > > special negative value, which required lots of changes to track.
> > >
> > > ??? What do you mean by *lots of* ?
> > 
> > Yes, please remind me.  Was your solution renumbering the attno values?
> 
> Yes though I don't intend to object to Christopher's proposal.
> 
> > I think there are fewer cases to fix if we keep the existing attribute
> > numbering and just mark the column as deleted.  Is this accurate?
> 
> No. I don't understand why you think so. 

With the isdropped column, you really only need to deal with '*'
expansion in a few places, and prevent the column from being accessed. 
With renumbering, the backend loops that go through the attnos have to
be dealt with.

Is this correct?  I certainly prefer attno renumbering to isdropped
because it allows us to get DROP COLUMN without any client changes, or
at least with fewer because the dropped column has a negative attno.  Is
this accurate?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 




Re: BETWEEN Node & DROP COLUMN

From
Hiroshi Inoue
Date:
Bruce Momjian wrote:
> 
> Hiroshi Inoue wrote:
> > > As I remember, Hiroshi's drop column changed the attribute number to a
> > > special negative value, which required lots of changes to track.
> >
> > ??? What do you mean by *lots of* ?
> 
> Yes, please remind me.  Was your solution renumbering the attno values?

Yes though I don't intend to object to Christopher's proposal.

> I think there are fewer cases to fix if we keep the existing attribute
> numbering and just mark the column as deleted.  Is this accurate?

No. I don't understand why you think so. 

regards,
Hiroshi Inouehttp://w2422.nsk.ne.jp/~inoue/




Re: BETWEEN Node & DROP COLUMN

From
Hiroshi Inoue
Date:

Bruce Momjian wrote:
> 
> Hiroshi Inoue wrote:
> > Bruce Momjian wrote:
> > >
> > > Hiroshi Inoue wrote:
> > > > > As I remember, Hiroshi's drop column changed the attribute number to a
> > > > > special negative value, which required lots of changes to track.
> > > >
> > > > ??? What do you mean by *lots of* ?
> > >
> > > Yes, please remind me.  Was your solution renumbering the attno values?
> >
> > Yes though I don't intend to object to Christopher's proposal.
> >
> > > I think there are fewer cases to fix if we keep the existing attribute
> > > numbering and just mark the column as deleted.  Is this accurate?
> >
> > No. I don't understand why you think so.
> 
> With the isdropped column, you really only need to deal with '*'
> expansion in a few places, and prevent the column from being accessed.
> With renumbering, the backend loops that go through the attnos have to
> be dealt with.

I used the following macro in my trial implementation.#define COLUMN_IS_DROPPED(attribute) ((attribute)->attnum <= 
DROP_COLUMN_OFFSET)
The places where the macro was put are exactly the places
where attisdropped must be checked.

The difference is essentially little. Please don't propagate
a wrong information. 
> Is this correct?  I certainly prefer attno renumbering to isdropped
> because it allows us to get DROP COLUMN without any client changes,

Unfortunately many apps rely on the fact that the attnos are
consecutive starting from 1. It was the main reason why Tom
rejected my trial. Nothing has changed about it.

regards,
Hiroshi Inouehttp://w2422.nsk.ne.jp/~inoue/




Re: BETWEEN Node & DROP COLUMN

From
Bruce Momjian
Date:
Christopher Kings-Lynne wrote:
> > > > > Yes, please remind me.  Was your solution renumbering the
> > attno values?
> > > >
> > > > Yes though I don't intend to object to Christopher's proposal.
> 
> Hiroshi,
> 
> I am thinking of rolling back my CVS to see if there's code from your
> previous test implementation that we can use.  Apart from the DropColumn
> function itself, what other changes did you make?  Did you have
> modifications for '*' expansion in the parser, etc.?

Yes, please review Hiroshi's work.  It is good work.  Can we have an
analysis of Hiroshi's approach vs the isdropped case.

Is it better to renumber the attno or set a column to isdropped.  The
former may be easier on the clients.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 




Re: BETWEEN Node & DROP COLUMN

From
"Christopher Kings-Lynne"
Date:
> > > > Yes, please remind me.  Was your solution renumbering the
> attno values?
> > >
> > > Yes though I don't intend to object to Christopher's proposal.

Hiroshi,

I am thinking of rolling back my CVS to see if there's code from your
previous test implementation that we can use.  Apart from the DropColumn
function itself, what other changes did you make?  Did you have
modifications for '*' expansion in the parser, etc.?

Chris





Re: BETWEEN Node & DROP COLUMN

From
Bruce Momjian
Date:
Christopher Kings-Lynne wrote:
> > > I am thinking of rolling back my CVS to see if there's code from your
> > > previous test implementation that we can use.  Apart from the DropColumn
> > > function itself, what other changes did you make?  Did you have
> > > modifications for '*' expansion in the parser, etc.?
> >
> > Yes, please review Hiroshi's work.  It is good work.  Can we have an
> > analysis of Hiroshi's approach vs the isdropped case.
> 
> Yes, it is.  I've rolled it back and I'm already incorporating his changes
> to the parser into my patch.  I just have to grep all the source code for
> 'HACK' to find all the changes.  It's all very handy.

Yes.  It should have been accepted long ago, but we were waiting for a
"perfect" solution which we all know now will never come.

> 
> > Is it better to renumber the attno or set a column to isdropped.  The
> > former may be easier on the clients.
> 
> Well, obviously I prefer the attisdropped approach.  I think it's clearer
> and there's less confusion.  As a head developer for phpPgAdmin that's what
> I'd prefer...  Hiroshi obviously prefers his solution, but doesn't object to

OK, can you explain the issues from a server and client perspective,
i.e. renumbering vs isdropped?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 




Re: BETWEEN Node & DROP COLUMN

From
"Christopher Kings-Lynne"
Date:
> > I am thinking of rolling back my CVS to see if there's code from your
> > previous test implementation that we can use.  Apart from the DropColumn
> > function itself, what other changes did you make?  Did you have
> > modifications for '*' expansion in the parser, etc.?
>
> Yes, please review Hiroshi's work.  It is good work.  Can we have an
> analysis of Hiroshi's approach vs the isdropped case.

Yes, it is.  I've rolled it back and I'm already incorporating his changes
to the parser into my patch.  I just have to grep all the source code for
'HACK' to find all the changes.  It's all very handy.

> Is it better to renumber the attno or set a column to isdropped.  The
> former may be easier on the clients.

Well, obviously I prefer the attisdropped approach.  I think it's clearer
and there's less confusion.  As a head developer for phpPgAdmin that's what
I'd prefer...  Hiroshi obviously prefers his solution, but doesn't object to
mine/Tom's.  I think that with all the schema-related changes that clients
will have to handle in 7.3, we may as well hit them with the dropped column
stuff in the same go, that way there's fewer rounds of clients scrambling to
keep up with the server.

I intend to email every single postgres client I can find and tell them
about the new changes, well before we release 7.3...

Chris





Re: BETWEEN Node & DROP COLUMN

From
Hiroshi Inoue
Date:
Christopher Kings-Lynne wrote:
> 
> > > > > Yes, please remind me.  Was your solution renumbering the
> > attno values?
> > > >
> > > > Yes though I don't intend to object to Christopher's proposal.
> 
> Hiroshi,
> 
> I am thinking of rolling back my CVS to see if there's code from your
> previous test implementation that we can use.  Apart from the DropColumn
> function itself, what other changes did you make?  Did you have
> modifications for '*' expansion in the parser, etc.?

Don't mind my posting.
I'm only correcting a misunderstanding for my work.

regards,
Hiroshi Inouehttp://w2422.nsk.ne.jp/~inoue/




Re: BETWEEN Node & DROP COLUMN

From
"Christopher Kings-Lynne"
Date:
> > Well, obviously I prefer the attisdropped approach.  I think
> it's clearer
> > and there's less confusion.  As a head developer for phpPgAdmin
> that's what
> > I'd prefer...  Hiroshi obviously prefers his solution, but
> doesn't object to
>
> OK, can you explain the issues from a server and client perspective,
> i.e. renumbering vs isdropped?

Well in the renumbering case, the client needs to know about missing attnos
and it has to know to ignore negative attnos (which it probably does
already).  ie. psql and pg_dump wouldn't have to be modified in that case.

In the isdropped case, the client needs to know to exclude any column with
'attisdropped' set to true.

So in both cases, the client needs to be updated.  I personally prefer the
explicit 'is dropped' as opposed to the implicit 'negative number', but hey.

*sigh* Now I've gone and made an argument for the renumbering case.  I'm
going to have a good look at Hiroshi's old code and see which one is less
complicated, etc.  So far all I've really need to do is redefine Hiroshi's
COLUMN_DROPPED macro.

I'm sure that both methods could be made to handle a 'ALTER TABLE/SET TYPE'
syntax.

Chris





Re: BETWEEN Node & DROP COLUMN

From
Bruce Momjian
Date:
Christopher Kings-Lynne wrote:
> > > Well, obviously I prefer the attisdropped approach.  I think
> > it's clearer
> > > and there's less confusion.  As a head developer for phpPgAdmin
> > that's what
> > > I'd prefer...  Hiroshi obviously prefers his solution, but
> > doesn't object to
> >
> > OK, can you explain the issues from a server and client perspective,
> > i.e. renumbering vs isdropped?
> 
> Well in the renumbering case, the client needs to know about missing attnos
> and it has to know to ignore negative attnos (which it probably does
> already).  ie. psql and pg_dump wouldn't have to be modified in that case.
> 
> In the isdropped case, the client needs to know to exclude any column with
> 'attisdropped' set to true.
> 
> So in both cases, the client needs to be updated.  I personally prefer the
> explicit 'is dropped' as opposed to the implicit 'negative number', but hey.
> 
> *sigh* Now I've gone and made an argument for the renumbering case.  I'm
> going to have a good look at Hiroshi's old code and see which one is less
> complicated, etc.  So far all I've really need to do is redefine Hiroshi's
> COLUMN_DROPPED macro.
> 
> I'm sure that both methods could be made to handle a 'ALTER TABLE/SET TYPE'
> syntax.

Yes!  This is exactly what I would like investigated.  I am embarrassed
to see that we had Hiroshi's patch all this time and never implemented
it.

I think it underscores that we have drifted too far into the code purity
camp and need a little reality check that users have needs and we should
try to meet them if we want to be successful.  How  many DROP COLUMN
gripes have we heard over the years!  Now I am upset.

OK, I calmed down now.  What I would like to know is which DROP COLUMN
method is easier on the server end, and which is easier on the client
end.  If one is easier in both places, let's use that.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 




Re: BETWEEN Node & DROP COLUMN

From
"Christopher Kings-Lynne"
Date:
> Unfortunately many apps rely on the fact that the attnos are
> consecutive starting from 1. It was the main reason why Tom
> rejected my trial. Nothing has changed about it.

OK, I've been looking at Hiroshi's implementation.  It's basically
semantically equivalent to mine from what I can see so far.  The only
difference really is in how the dropped columns are marked.

I've been ruminating on Hiroshi's statement at the top there.  What was the
reasoning for assuming that 'many apps rely on the fact that the attnos are
consecutive'?  Is that true?  phpPgAdmin doesn't.  In fact, phpPgAdmin won't
require any changes with Hiroshi's implementaiton and will require changes
with mine.

Anyway, an app that relies on consecutive attnos is going to have pain
skipping over attisdropped columns anyway???

In fact, I'm now beginning to think that I should just resurrect Hiroshi's
implementation.  I'm prepared to do that if people like...

Chris





Re: BETWEEN Node & DROP COLUMN

From
Tom Lane
Date:
Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> I used the following macro in my trial implementation.
>  #define COLUMN_IS_DROPPED(attribute) ((attribute)->attnum <= 
> DROP_COLUMN_OFFSET)
> The places where the macro was put are exactly the places
> where attisdropped must be checked.

Actually, your trial required column dropped-ness to be checked in
many more places than the proposed approach does.  Since you renumbered
the dropped column, nominal column numbers didn't correspond to physical
order of values in tuples anymore; that meant checking for dropped
columns in many low-level tuple manipulations.

>> Is this correct?  I certainly prefer attno renumbering to isdropped
>> because it allows us to get DROP COLUMN without any client changes,

> Unfortunately many apps rely on the fact that the attnos are
> consecutive starting from 1. It was the main reason why Tom
> rejected my trial. Nothing has changed about it.

I'm still not thrilled about it ... but I don't see a reasonable way
around it, either.  I don't see any good way to do DROP COLUMN
without breaking applications that make such assumptions.  Unless
you have one, we may as well go for the approach that adds the least
complication to the backend.
        regards, tom lane




Re: BETWEEN Node & DROP COLUMN

From
Bruce Momjian
Date:
Christopher Kings-Lynne wrote:
> > Unfortunately many apps rely on the fact that the attnos are
> > consecutive starting from 1. It was the main reason why Tom
> > rejected my trial. Nothing has changed about it.
> 
> OK, I've been looking at Hiroshi's implementation.  It's basically
> semantically equivalent to mine from what I can see so far.  The only
> difference really is in how the dropped columns are marked.
> 
> I've been ruminating on Hiroshi's statement at the top there.  What was the
> reasoning for assuming that 'many apps rely on the fact that the attnos are
> consecutive'?  Is that true?  phpPgAdmin doesn't.  In fact, phpPgAdmin won't
> require any changes with Hiroshi's implementaiton and will require changes
> with mine.
> 
> Anyway, an app that relies on consecutive attnos is going to have pain
> skipping over attisdropped columns anyway???
> 
> In fact, I'm now beginning to think that I should just resurrect Hiroshi's
> implementation.  I'm prepared to do that if people like...

Well, you have clearly identified that Hiroshi's approach is cleaner for
clients, because most clients don't need any changes.  If the server end
looks equivalent for both approaches, I suggest you get started with
Hiroshi's idea.

When Hiroshi's idea was originally proposed, some didn't like the
uncleanliness of it, and particularly relations that relied on attno
would all have to be adjusted/removed.  We didn't have pg_depend, of
course, so there was this kind of gap in knowing how to remove all
references to the dropped column.

There was also this idea that somehow the fairy software goddess was
going to come down some day and give us a cleaner way to implement DROP
COLUMN.  She still hasn't shown up.  :-)

I just read over TODO.detail/drop and my memory was correct.  It was a
mixure of having no pg_depend coupled with other ideas.  Now that
pg_depend is coming, DROP COLUMN is ripe for a solution.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 




Re: BETWEEN Node & DROP COLUMN

From
Bruce Momjian
Date:
Tom Lane wrote:
> Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> > I used the following macro in my trial implementation.
> >  #define COLUMN_IS_DROPPED(attribute) ((attribute)->attnum <= 
> > DROP_COLUMN_OFFSET)
> > The places where the macro was put are exactly the places
> > where attisdropped must be checked.
> 
> Actually, your trial required column dropped-ness to be checked in
> many more places than the proposed approach does.  Since you renumbered
> the dropped column, nominal column numbers didn't correspond to physical
> order of values in tuples anymore; that meant checking for dropped
> columns in many low-level tuple manipulations.
> 
> >> Is this correct?  I certainly prefer attno renumbering to isdropped
> >> because it allows us to get DROP COLUMN without any client changes,
> 
> > Unfortunately many apps rely on the fact that the attnos are
> > consecutive starting from 1. It was the main reason why Tom
> > rejected my trial. Nothing has changed about it.
> 
> I'm still not thrilled about it ... but I don't see a reasonable way
> around it, either.  I don't see any good way to do DROP COLUMN
> without breaking applications that make such assumptions.  Unless
> you have one, we may as well go for the approach that adds the least
> complication to the backend.

It may turn out to be a choice of client-cleanliness vs. backend
cleanliness.  Seems Hiroshi already wins for client cleanliness.  We
just need to know how many extra places need to be checked in the
backend.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 




Re: BETWEEN Node & DROP COLUMN

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> It may turn out to be a choice of client-cleanliness vs. backend
> cleanliness.  Seems Hiroshi already wins for client cleanliness.

No, he only breaks even for client cleanliness --- either approach
*will* require changes in clients that look at pg_attribute.
        regards, tom lane




Re: BETWEEN Node & DROP COLUMN

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > It may turn out to be a choice of client-cleanliness vs. backend
> > cleanliness.  Seems Hiroshi already wins for client cleanliness.
> 
> No, he only breaks even for client cleanliness --- either approach
> *will* require changes in clients that look at pg_attribute.

Uh, Christopher already indicated three clients, psql, pg_dump, and
another that will not require changes for Hiroshi's approach, but will
require changes for isdropped.  That doesn't seem "break even" to me.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 




Re: BETWEEN Node & DROP COLUMN

From
Thomas Lockhart
Date:
> Well in the renumbering case, the client needs to know about missing attnos
> and it has to know to ignore negative attnos (which it probably does
> already).  ie. psql and pg_dump wouldn't have to be modified in that case.
> In the isdropped case, the client needs to know to exclude any column with
> 'attisdropped' set to true.
> So in both cases, the client needs to be updated.

How about defining a view (or views) which hides these details? Perhaps
a view which is also defined in SQL99 as one of the information_schema
views which we might like to have anyway?
                   - Thomas




Re: BETWEEN Node & DROP COLUMN

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Tom Lane wrote:
>> No, he only breaks even for client cleanliness --- either approach
>> *will* require changes in clients that look at pg_attribute.

> Uh, Christopher already indicated three clients, psql, pg_dump, and
> another that will not require changes for Hiroshi's approach, but will
> require changes for isdropped.

Oh?  If either psql or pg_dump don't break, it's a mere coincidence,
because they certainly depend on attnum.  (It's also pretty much
irrelevant considering they're both under our control and hence easily
fixed.)

I'm fairly certain that Christopher is mistaken, anyhow.  Check the
manipulations of attribute defaults for a counterexample in pg_dump.
        regards, tom lane




Re: BETWEEN Node & DROP COLUMN

From
Hiroshi Inoue
Date:
Tom Lane wrote:
> 
> Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> > I used the following macro in my trial implementation.
> >  #define COLUMN_IS_DROPPED(attribute) ((attribute)->attnum <=
> > DROP_COLUMN_OFFSET)
> > The places where the macro was put are exactly the places
> > where attisdropped must be checked.
> 
> Actually, your trial required column dropped-ness to be checked in
> many more places than the proposed approach does.

Have you ever really checked my trial implementation ?

regards,
Hiroshi Inouehttp://w2422.nsk.ne.jp/~inoue/




Re: BETWEEN Node & DROP COLUMN

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Tom Lane wrote:
> >> No, he only breaks even for client cleanliness --- either approach
> >> *will* require changes in clients that look at pg_attribute.
> 
> > Uh, Christopher already indicated three clients, psql, pg_dump, and
> > another that will not require changes for Hiroshi's approach, but will
> > require changes for isdropped.
> 
> Oh?  If either psql or pg_dump don't break, it's a mere coincidence,
> because they certainly depend on attnum.  (It's also pretty much
> irrelevant considering they're both under our control and hence easily
> fixed.)
> 
> I'm fairly certain that Christopher is mistaken, anyhow.  Check the
> manipulations of attribute defaults for a counterexample in pg_dump.

Well, it seems isdropped is going to have to be checked by _any_ client,
while holes in the number will have to be checked by _some_ clients.  Is
that accurate?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 




Re: BETWEEN Node & DROP COLUMN

From
Tom Lane
Date:
Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> Tom Lane wrote:
>> Actually, your trial required column dropped-ness to be checked in
>> many more places than the proposed approach does.

> Have you ever really checked my trial implementation ?

Well, I've certainly stumbled over it in places like relcache.c
and preptlist.c, which IMHO should not have to know about this...
and I have little confidence that there are not more places that
would have needed fixes if the change had gotten any wide use.
You were essentially assuming that it was okay for pg_attribute.attnum
to not agree with indexes into tuple descriptors, which seems very
shaky to me.
        regards, tom lane




Re: BETWEEN Node & DROP COLUMN

From
Bruce Momjian
Date:
Tom Lane wrote:
> Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> > Tom Lane wrote:
> >> Actually, your trial required column dropped-ness to be checked in
> >> many more places than the proposed approach does.
> 
> > Have you ever really checked my trial implementation ?
> 
> Well, I've certainly stumbled over it in places like relcache.c
> and preptlist.c, which IMHO should not have to know about this...
> and I have little confidence that there are not more places that
> would have needed fixes if the change had gotten any wide use.
> You were essentially assuming that it was okay for pg_attribute.attnum
> to not agree with indexes into tuple descriptors, which seems very
> shaky to me.

Isn't it only the dropped column that doesn't agree with the descriptor.
The kept columns retain the same numbering, and a NULL sits in the
dropped spot, right?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 




Re: BETWEEN Node & DROP COLUMN

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Well, it seems isdropped is going to have to be checked by _any_ client,
> while holes in the number will have to be checked by _some_ clients.  Is
> that accurate?

What's your point?  No client that examines pg_attribute can be trusted
until it's been examined pretty closely (as in, more closely than
Christopher looked at pg_dump).  I'd prefer to see us keep the backend
simple and trustworthy, rather than pursue a largely-illusory idea that
we might be saving some trouble on the client side.  The clients are
less likely to cause unrecoverable data corruption if something is
missed.

If we were willing to remap attnums so that clients would require *no*
changes, it would be worth doing --- but I believe we've already
rejected that approach as unworkable.  I don't think "maybe you don't
need to change, but you'd better study your code very carefully anyway"
is a big selling point.
        regards, tom lane




Re: BETWEEN Node & DROP COLUMN

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Well, it seems isdropped is going to have to be checked by _any_ client,
> > while holes in the number will have to be checked by _some_ clients.  Is
> > that accurate?
> 
> What's your point?  No client that examines pg_attribute can be trusted
> until it's been examined pretty closely (as in, more closely than
> Christopher looked at pg_dump).  I'd prefer to see us keep the backend
> simple and trustworthy, rather than pursue a largely-illusory idea that
> we might be saving some trouble on the client side.  The clients are
> less likely to cause unrecoverable data corruption if something is
> missed.
> 
> If we were willing to remap attnums so that clients would require *no*
> changes, it would be worth doing --- but I believe we've already
> rejected that approach as unworkable.  I don't think "maybe you don't
> need to change, but you'd better study your code very carefully anyway"
> is a big selling point.

It sure is.  If most people don't need to modify their code, that is a
win.  Your logic is that we should make everyone modify their code and
somehow that will be more reliable?  No wonder people think we are more
worried about clean code than making things easier for our users.

I will vote for the option that has the less pain for our users _and_ in
the backend, but if it is close, I will prefer to make things easier on
clients/users.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 




Re: BETWEEN Node & DROP COLUMN

From
"Christopher Kings-Lynne"
Date:
> > What's your point?  No client that examines pg_attribute can be trusted
> > until it's been examined pretty closely (as in, more closely than
> > Christopher looked at pg_dump).  I'd prefer to see us keep the backend
> > simple and trustworthy, rather than pursue a largely-illusory idea that
> > we might be saving some trouble on the client side.  The clients are
> > less likely to cause unrecoverable data corruption if something is
> > missed.

I'm prepared to admit I didn't look at pg_dump too hard.  I have to say that
I agree with Tom here, but that's personal opinion.  If Tom reckons that
there's places where Hiroshi's implementation needed work and that there
would be messiness, then I'm inclined to believe him.

In all honesty, the amount of changes clients have to make to support
schemas makes checking dropped columns pale in significance.

> > If we were willing to remap attnums so that clients would require *no*
> > changes, it would be worth doing --- but I believe we've already
> > rejected that approach as unworkable.  I don't think "maybe you don't
> > need to change, but you'd better study your code very carefully anyway"
> > is a big selling point.

Exactly.  I like the whole 'explicit' idea of having attisdropped.  There's
no ifs and buts.  It's not a case of, "oh, the attnum is negative, but it's
not an arbitratily negative system column" sort of thing.

> I will vote for the option that has the less pain for our users _and_ in
> the backend, but if it is close, I will prefer to make things easier on
> clients/users.

I will vote for attisdropped.  However, I'm not a main developer and I will
go with the flow.  In the meantime, I'm developing attisdropped but using
some of Hiroshi's implementation...

Chris





Re: BETWEEN Node & DROP COLUMN

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Well, it seems isdropped is going to have to be checked by _any_ client,
> > while holes in the number will have to be checked by _some_ clients.  Is
> > that accurate?
> 
> What's your point?  No client that examines pg_attribute can be trusted
> until it's been examined pretty closely (as in, more closely than
> Christopher looked at pg_dump).  I'd prefer to see us keep the backend
> simple and trustworthy, rather than pursue a largely-illusory idea that
> we might be saving some trouble on the client side.

Largely-illusory?  Almost every pg_attribute query will have to be modified
for isdropped, while Hiroshi's approach requires so few changes, we are
having trouble even finding a query that needs to be modified.  That's
pretty clear to me.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 




Re: BETWEEN Node & DROP COLUMN

From
"Christopher Kings-Lynne"
Date:
By the way,

What happens if someone drops ALL the columns in a table?  Do you just leave
it there as-is without any columns or should it be forbidden or should it be
interpreted as a drop table?

Chris





Re: BETWEEN Node & DROP COLUMN

From
Tom Lane
Date:
"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:
> OK, I've been looking at Hiroshi's implementation.  It's basically
> semantically equivalent to mine from what I can see so far.  The only
> difference really is in how the dropped columns are marked.

True enough, but that's not a trivial difference.  The problem with
Hiroshi's implementation is that there's no longer a close tie between
pg_attribute.attnum and physical positions of datums in tuples.  I think
that that's going to affect a lot of low-level code, and that he hasn't
found all of it.

Keeping the attisdropped marker separate from attnum is logically
cleaner, and IMHO much less likely to lead to trouble down the road.

We should not allow ourselves to put too much weight on the fact that
some clients use "attnum > 0" as a filter for attributes that they
(think they) need not pay attention to.  That's only a historical
artifact, and it's far from clear that it will keep those clients
out of trouble anyway.
        regards, tom lane




Re: BETWEEN Node & DROP COLUMN

From
Tom Lane
Date:
"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:
> What happens if someone drops ALL the columns in a table?

Good point.  Ideally we should allow that, but in practice I suspect
there are many places that will blow up on zero-length tuples.
Rejecting the situation might be the better part of valor ... anyway
I'm not excited about spending a lot of time searching for such bugs.
        regards, tom lane




Re: BETWEEN Node & DROP COLUMN

From
Bruce Momjian
Date:
Tom Lane wrote:
> "Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:
> > OK, I've been looking at Hiroshi's implementation.  It's basically
> > semantically equivalent to mine from what I can see so far.  The only
> > difference really is in how the dropped columns are marked.
> 
> True enough, but that's not a trivial difference.  The problem with
> Hiroshi's implementation is that there's no longer a close tie between
> pg_attribute.attnum and physical positions of datums in tuples.  I think
> that that's going to affect a lot of low-level code, and that he hasn't
> found all of it.

Isn't that only for the dropped column.  Don't the remaining columns stay
logically clear as far as the tuple storage is concerned?

> 
> Keeping the attisdropped marker separate from attnum is logically
> cleaner, and IMHO much less likely to lead to trouble down the road.

My problem is that you are pushing the DROP COLUMN check out into almost
every client that uses pg_attribute.  And we are doing this to keep our
backend cleaner.  Seems we should do the work once, in the backend, and
not burden clients will all of this.

> We should not allow ourselves to put too much weight on the fact that
> some clients use "attnum > 0" as a filter for attributes that they
> (think they) need not pay attention to.  That's only a historical
> artifact, and it's far from clear that it will keep those clients
> out of trouble anyway.

Well, why shouldn't we use the fact that most/all clients don't look at
attno < 0, and that we have no intention of changing that requirement. 
We aren't coding in a vacuum.  We have clients, they do that already,
let's use it.

Attno < 0 is not historical.  It is in the current code, and will remain
so for the forseeable future, I think.

I honestly don't understand the priorities we are setting here.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 




Re: BETWEEN Node & DROP COLUMN

From
"Christopher Kings-Lynne"
Date:
> True enough, but that's not a trivial difference.  The problem with
> Hiroshi's implementation is that there's no longer a close tie between
> pg_attribute.attnum and physical positions of datums in tuples.  I think
> that that's going to affect a lot of low-level code, and that he hasn't
> found all of it.

OK, I can see how that would be a problem actually.  You'd have to regard
attnum as a 'virtual attnum' and keep having to reverse the computation to
figure out what its original attnum was...

> Keeping the attisdropped marker separate from attnum is logically
> cleaner, and IMHO much less likely to lead to trouble down the road.

I'm a purist and I like to think that good, clean, well thought out code
always results in more stable, bug free software.

> We should not allow ourselves to put too much weight on the fact that
> some clients use "attnum > 0" as a filter for attributes that they
> (think they) need not pay attention to.  That's only a historical
> artifact, and it's far from clear that it will keep those clients
> out of trouble anyway.

It's also not 'every client app' that will need to be altered.  Just DB
admin apps, of which there aren't really that many.  And remember, anyone
who uses the catalogs directly always does so at their own risk.  I think
that once we have a proper INFORMATION_SCHEMA anyway, all clients should use
that.  Heck, if INFORMATION_SCHEMA gets in in 7.3, then clients might have a
_heap_ of work to do...

Chris





Re: BETWEEN Node & DROP COLUMN

From
"Christopher Kings-Lynne"
Date:
> > No, he only breaks even for client cleanliness --- either approach
> > *will* require changes in clients that look at pg_attribute.
> 
> Uh, Christopher already indicated three clients, psql, pg_dump, and
> another that will not require changes for Hiroshi's approach, but will
> require changes for isdropped.  That doesn't seem "break even" to me.

And Tom pointed out that I was wrong...

Chris





Re: BETWEEN Node & DROP COLUMN

From
Hiroshi Inoue
Date:
Tom Lane wrote:
> 
> Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> > Tom Lane wrote:
> >> Actually, your trial required column dropped-ness to be checked in
> >> many more places than the proposed approach does.
> 
> > Have you ever really checked my trial implementation ?
> 
> Well, I've certainly stumbled over it in places like relcache.c
> and preptlist.c, which IMHO should not have to know about this...
> and I have little confidence that there are not more places that
> would have needed fixes if the change had gotten any wide use.
> You were essentially assuming that it was okay for pg_attribute.attnum
> to not agree with indexes into tuple descriptors, which seems very
> shaky to me.

I already explained it to you once in the thread Re: [HACKERS]
RFC: Restructuring pg_aggregate. How many times should I
explain the same thing ?
My trial implementation is essentially the same as adding
isdropped pg_attribute column. There's no strangeness in
my implementation.
The reason why I adopted negative attnos is as follows.
I also explained it more than twice.

1) It doesn't need initdb. It was very conveneient for  the TRIAL implementation.
2) It's more sensitive about oversights of modification  than isdropped column implementation.

regards,
Hiroshi Inouehttp://w2422.nsk.ne.jp/~inoue/




Re: BETWEEN Node & DROP COLUMN

From
"Christopher Kings-Lynne"
Date:
> My problem is that you are pushing the DROP COLUMN check out into almost
> every client that uses pg_attribute.  And we are doing this to keep our
> backend cleaner.  Seems we should do the work once, in the backend, and
> not burden clients will all of this.

As a user of Postgres, I found the following more painful:

* Anti-varchar truncation in 7.2
* Making you have to quote "timestamp"(), etc.

People mail the list every day with backwards compatibility problems.  We've
done it before, why not do it again?  In fact, I'm sure there are already
backwards compatibility problems in 7.3.

> > We should not allow ourselves to put too much weight on the fact that
> > some clients use "attnum > 0" as a filter for attributes that they
> > (think they) need not pay attention to.  That's only a historical
> > artifact, and it's far from clear that it will keep those clients
> > out of trouble anyway.
>
> Well, why shouldn't we use the fact that most/all clients don't look at
> attno < 0, and that we have no intention of changing that requirement.
> We aren't coding in a vacuum.  We have clients, they do that already,
> let's use it.
>
> Attno < 0 is not historical.  It is in the current code, and will remain
> so for the forseeable future, I think.

Problem is, the current code actually assumes that attno < 0 means that the
attribute is a system column, NOT a dropped user column.

As an example, I'd have to change all of these in the Postgres source code:
       /* Prevent them from altering a system attribute */       if (attnum < 0)               elog(ERROR, "ALTER
TABLE:Cannot alter system attribute
 
\"%s\"",                        colName);

Who knows how many other things like this are littered through the source?

Chris





Re: BETWEEN Node & DROP COLUMN

From
"Dave Page"
Date:

> -----Original Message-----
> From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
> Sent: 04 July 2002 07:40
> To: Tom Lane
> Cc: Christopher Kings-Lynne; Hiroshi Inoue; Hackers
> Subject: Re: [HACKERS] BETWEEN Node & DROP COLUMN
>
>
> Well, why shouldn't we use the fact that most/all clients
> don't look at attno < 0, and that we have no intention of
> changing that requirement.
> We aren't coding in a vacuum.  We have clients, they do that
> already, let's use it.

Just to chuck my $0.02 in the pot:

pgAdmin will require modification not matter which route is taken. It
*does* look at columns with negative attnums whenever the user switches
on the 'View System Objects' option which un-hides the pg_*
tables/views, columns with attnums < 1, template1 and more.

From my pov, the least painful route would be to add the attisdropped
column. I can add a check to this far more easily than messing about
with losing columns where attnum < -7 - especially, if in a future
release of PostgreSQL the number of columns like tableoid, xid etc
changes.

Personnally, from a not caring about how it works, just how it's
presented perspective, attisdropped seems much cleaner to me.

I also agree with Christopher - compared to the work the addition of
schemas required (~50 hours in pgAdmin) this is a 2 minute job!

Well, that was more like $0.10....

Regards, Dave.




Re: BETWEEN Node & DROP COLUMN

From
Hiroshi Inoue
Date:
Tom Lane wrote:
> 
> "Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:
> > OK, I've been looking at Hiroshi's implementation.  It's basically
> > semantically equivalent to mine from what I can see so far.  The only
> > difference really is in how the dropped columns are marked.
> 
> True enough, but that's not a trivial difference.

> The problem with
> Hiroshi's implementation is that there's no longer a close tie between
> pg_attribute.attnum and physical positions of datums in tuples. 

?? Where does the above consideration come from ?

BTW there seems a misunderstanding about my posting.
I'm not objecting to add attisdropped pg_attribute column.
They are essentially the same and so I used macros
like COLUMN_IS_DROPPED in my implementation so that
I can easily change the implementation to use isdropped
pg_attribute column.
I'm only correcting the unfair valuation for my
trial work.

regards,
Hiroshi Inouehttp://w2422.nsk.ne.jp/~inoue/




Re: BETWEEN Node & DROP COLUMN

From
"Christopher Kings-Lynne"
Date:
> BTW there seems a misunderstanding about my posting.
> I'm not objecting to add attisdropped pg_attribute column.
> They are essentially the same and so I used macros
> like COLUMN_IS_DROPPED in my implementation so that
> I can easily change the implementation to use isdropped
> pg_attribute column.
> I'm only correcting the unfair valuation for my
> trial work.

Hiroshi, I totally respect your trial work.  In fact, I'm relying on it to
do the attisdropped implementation.  I think everyone's beginning to get a
bit cranky here - I think we should all just calm down.

Chris





Re: BETWEEN Node & DROP COLUMN

From
Bruce Momjian
Date:
Thomas Lockhart wrote:
> > Well in the renumbering case, the client needs to know about missing attnos
> > and it has to know to ignore negative attnos (which it probably does
> > already).  ie. psql and pg_dump wouldn't have to be modified in that case.
> > In the isdropped case, the client needs to know to exclude any column with
> > 'attisdropped' set to true.
> > So in both cases, the client needs to be updated.
> 
> How about defining a view (or views) which hides these details? Perhaps
> a view which is also defined in SQL99 as one of the information_schema
> views which we might like to have anyway?

We could change pg_attribute to another name, and create a view called
pg_attribute that never returned isdropped columns to the client.  That
would allow clients to work cleanly, and the server to work cleanly.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 




Re: BETWEEN Node & DROP COLUMN

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Largely-illusory?  Almost every pg_attribute query will have to be modified
> for isdropped, while Hiroshi's approach requires so few changes, we are
> having trouble even finding a query that needs to be modified.  That's
> pretty clear to me.

Apparently you didn't think hard about the pg_dump example.  The problem
there isn't the query so much as it is the wired-in assumption that the
retrieved rows will correspond to attnums 1-N in sequence.  That
assumption breaks either way we do it.  The illusion is thinking that
clients won't break.

I suspect it will actually be easier to fix pg_dump if we use the
attisdropped approach --- it could keep the assumption that its array
indexes equal attnums, include attisdropped explicitly in the rows
it stores, and just not output rows that have attisdropped true.
Getting rid of the index == attnum assumption will be a considerably
more subtle, and fragile, patch.
        regards, tom lane




Re: BETWEEN Node & DROP COLUMN

From
Rod Taylor
Date:
> We could change pg_attribute to another name, and create a view called
> pg_attribute that never returned isdropped columns to the client.  That
> would allow clients to work cleanly, and the server to work cleanly.

Another case where having an informational schema would eliminate the
whole argument -- as the clients wouldn't need to touch the system
tables.

Any thoughts on that initial commit Peter?





Re: BETWEEN Node & DROP COLUMN

From
Bruce Momjian
Date:
Rod Taylor wrote:
> > We could change pg_attribute to another name, and create a view called
> > pg_attribute that never returned isdropped columns to the client.  That
> > would allow clients to work cleanly, and the server to work cleanly.
> 
> Another case where having an informational schema would eliminate the
> whole argument -- as the clients wouldn't need to touch the system
> tables.
> 
> Any thoughts on that initial commit Peter?

From my new understanding, the client coders _want_ to see the isdropped
row so the attno's are consecutive.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 




Re: BETWEEN Node & DROP COLUMN

From
Tom Lane
Date:
Rod Taylor <rbt@zort.ca> writes:
>> We could change pg_attribute to another name, and create a view called
>> pg_attribute that never returned isdropped columns to the client.  That
>> would allow clients to work cleanly, and the server to work cleanly.

> Another case where having an informational schema would eliminate the
> whole argument -- as the clients wouldn't need to touch the system
> tables.

This is a long-term solution, not a near-term one.  I suspect it's
really unlikely that pg_dump, pgAdmin, etc will ever want to switch
over to the SQL-standard informational schema, because they will want
to be able to look at Postgres-specific features that are not reflected
in the standardized schema.  Certainly there will be no movement in
that direction until the informational schema is complete; a first-cut
implementation won't attract any interest at all :-(

I thought about the idea of a backward-compatible pg_attribute view,
but I don't see any efficient way to generate the consecutively-numbered
attnum column in a view; anyone?
        regards, tom lane




Re: BETWEEN Node & DROP COLUMN

From
Bruce Momjian
Date:
Tom Lane wrote:
> Rod Taylor <rbt@zort.ca> writes:
> >> We could change pg_attribute to another name, and create a view called
> >> pg_attribute that never returned isdropped columns to the client.  That
> >> would allow clients to work cleanly, and the server to work cleanly.
> 
> > Another case where having an informational schema would eliminate the
> > whole argument -- as the clients wouldn't need to touch the system
> > tables.
> 
> This is a long-term solution, not a near-term one.  I suspect it's
> really unlikely that pg_dump, pgAdmin, etc will ever want to switch
> over to the SQL-standard informational schema, because they will want
> to be able to look at Postgres-specific features that are not reflected
> in the standardized schema.  Certainly there will be no movement in
> that direction until the informational schema is complete; a first-cut
> implementation won't attract any interest at all :-(
> 
> I thought about the idea of a backward-compatible pg_attribute view,
> but I don't see any efficient way to generate the consecutively-numbered
> attnum column in a view; anyone?

No, we can't, and because our client coders want consecutive, it is a
dead idea.  Even if we could do it, we would be feeding clients attno
values that are inaccurate, causing problems when attno is joined to
other tables.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 




Re: BETWEEN Node & DROP COLUMN

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Even if we could do it, we would be feeding clients attno
> values that are inaccurate, causing problems when attno is joined to
> other tables.

Good point; we'd need similar views replacing pg_attrdef and probably
other places.  Messy indeed :-(

But as Dave already pointed out, it's probably pointless to worry.
The schema support in 7.3 will already de-facto break nearly every
client that inspects the system catalogs, so adding some more work
for DROP COLUMN support isn't going to make much difference.
        regards, tom lane




Re: BETWEEN Node & DROP COLUMN

From
"Christopher Kings-Lynne"
Date:
> Christopher, if you are still having trouble adding the isdropped system
> column, please let me know.

Thanks Bruce, but I think I've got it sorted now.  One weird thing is that
although I added it as being false in pg_attribute.h, I get these tuples
having attisdropped set to true by initdb.

Are these from the toasting process and maybe the stats or something??

Chris
attrelid |      attname
----------+-------------------   16464 | chunk_id   16464 | chunk_seq   16464 | chunk_data   16466 | chunk_id   16466 |
chunk_seq  16467 | chunk_id   16467 | chunk_seq   16467 | chunk_data   16469 | chunk_id   16469 | chunk_seq   16470 |
chunk_id  16470 | chunk_seq   16470 | chunk_data   16472 | chunk_id   16472 | chunk_seq   16473 | chunk_id   16473 |
chunk_seq  16473 | chunk_data   16475 | chunk_id   16475 | chunk_seq   16476 | chunk_id   16476 | chunk_seq   16476 |
chunk_data  16478 | chunk_id   16478 | chunk_seq   16479 | chunk_id   16479 | chunk_seq   16479 | chunk_data   16481 |
chunk_id  16481 | chunk_seq   16482 | chunk_id   16482 | chunk_seq   16482 | chunk_data   16484 | chunk_id   16484 |
chunk_seq  16485 | chunk_id   16485 | chunk_seq   16485 | chunk_data   16487 | chunk_id   16487 | chunk_seq   16488 |
chunk_id  16488 | chunk_seq   16488 | chunk_data   16490 | chunk_id   16490 | chunk_seq   16491 | usecreatedb   16491 |
usesuper  16491 | passwd   16491 | valuntil   16491 | useconfig   16494 | schemaname   16494 | tablename   16494 |
rulename  16494 | definition   16498 | schemaname   16498 | viewname   16498 | viewowner   16498 | definition   16501 |
tablename  16501 | tableowner   16501 | hasindexes   16501 | hasrules   16501 | hastriggers   16504 | tablename   16504
|indexname   16504 | indexdef   16507 | tablename   16507 | attname   16507 | null_frac   16507 | avg_width   16507 |
n_distinct  16507 | most_common_vals   16507 | most_common_freqs   16507 | histogram_bounds   16507 | correlation
16511| relid   16511 | relname   16511 | seq_scan   16511 | seq_tup_read   16511 | idx_scan   16511 | idx_tup_fetch
16511| n_tup_ins   16511 | n_tup_upd   16511 | n_tup_del   16514 | relid   16514 | relname   16514 | heap_blks_read
16514| heap_blks_hit   16514 | idx_blks_read   16514 | idx_blks_hit   16514 | toast_blks_read   16514 | toast_blks_hit
16514 | tidx_blks_read   16514 | tidx_blks_hit   16518 | relid   16518 | indexrelid   16518 | relname   16518 |
indexrelname  16518 | idx_scan   16518 | idx_tup_read   16518 | idx_tup_fetch   16521 | relid   16521 | indexrelid
16521| relname   16521 | indexrelname   16521 | idx_blks_read   16521 | idx_blks_hit   16524 | relid   16524 | relname
16524 | blks_read   16524 | blks_hit   16527 | datid   16527 | datname   16527 | procpid   16527 | usesysid   16527 |
usename  16527 | current_query   16530 | datid   16530 | datname   16530 | numbackends   16530 | xact_commit   16530 |
xact_rollback  16530 | blks_read   16530 | blks_hit
 





Re: BETWEEN Node & DROP COLUMN

From
Bruce Momjian
Date:
The problem is that the new column is now part of pg_attribute so every
catalog/pg_attribute.h DATA() line has to be updated.  Did you update
them all with 'false' in the right slot?  Not sure what the chunks are.


---------------------------------------------------------------------------

Christopher Kings-Lynne wrote:
> > Christopher, if you are still having trouble adding the isdropped system
> > column, please let me know.
> 
> Thanks Bruce, but I think I've got it sorted now.  One weird thing is that
> although I added it as being false in pg_attribute.h, I get these tuples
> having attisdropped set to true by initdb.
> 
> Are these from the toasting process and maybe the stats or something??
> 
> Chris
> 
>  attrelid |      attname
> ----------+-------------------
>     16464 | chunk_id
>     16464 | chunk_seq
>     16464 | chunk_data
>     16466 | chunk_id
>     16466 | chunk_seq
>     16467 | chunk_id
>     16467 | chunk_seq
>     16467 | chunk_data
>     16469 | chunk_id
>     16469 | chunk_seq
>     16470 | chunk_id
>     16470 | chunk_seq
>     16470 | chunk_data
>     16472 | chunk_id
>     16472 | chunk_seq
>     16473 | chunk_id
>     16473 | chunk_seq
>     16473 | chunk_data
>     16475 | chunk_id
>     16475 | chunk_seq
>     16476 | chunk_id
>     16476 | chunk_seq
>     16476 | chunk_data
>     16478 | chunk_id
>     16478 | chunk_seq
>     16479 | chunk_id
>     16479 | chunk_seq
>     16479 | chunk_data
>     16481 | chunk_id
>     16481 | chunk_seq
>     16482 | chunk_id
>     16482 | chunk_seq
>     16482 | chunk_data
>     16484 | chunk_id
>     16484 | chunk_seq
>     16485 | chunk_id
>     16485 | chunk_seq
>     16485 | chunk_data
>     16487 | chunk_id
>     16487 | chunk_seq
>     16488 | chunk_id
>     16488 | chunk_seq
>     16488 | chunk_data
>     16490 | chunk_id
>     16490 | chunk_seq
>     16491 | usecreatedb
>     16491 | usesuper
>     16491 | passwd
>     16491 | valuntil
>     16491 | useconfig
>     16494 | schemaname
>     16494 | tablename
>     16494 | rulename
>     16494 | definition
>     16498 | schemaname
>     16498 | viewname
>     16498 | viewowner
>     16498 | definition
>     16501 | tablename
>     16501 | tableowner
>     16501 | hasindexes
>     16501 | hasrules
>     16501 | hastriggers
>     16504 | tablename
>     16504 | indexname
>     16504 | indexdef
>     16507 | tablename
>     16507 | attname
>     16507 | null_frac
>     16507 | avg_width
>     16507 | n_distinct
>     16507 | most_common_vals
>     16507 | most_common_freqs
>     16507 | histogram_bounds
>     16507 | correlation
>     16511 | relid
>     16511 | relname
>     16511 | seq_scan
>     16511 | seq_tup_read
>     16511 | idx_scan
>     16511 | idx_tup_fetch
>     16511 | n_tup_ins
>     16511 | n_tup_upd
>     16511 | n_tup_del
>     16514 | relid
>     16514 | relname
>     16514 | heap_blks_read
>     16514 | heap_blks_hit
>     16514 | idx_blks_read
>     16514 | idx_blks_hit
>     16514 | toast_blks_read
>     16514 | toast_blks_hit
>     16514 | tidx_blks_read
>     16514 | tidx_blks_hit
>     16518 | relid
>     16518 | indexrelid
>     16518 | relname
>     16518 | indexrelname
>     16518 | idx_scan
>     16518 | idx_tup_read
>     16518 | idx_tup_fetch
>     16521 | relid
>     16521 | indexrelid
>     16521 | relname
>     16521 | indexrelname
>     16521 | idx_blks_read
>     16521 | idx_blks_hit
>     16524 | relid
>     16524 | relname
>     16524 | blks_read
>     16524 | blks_hit
>     16527 | datid
>     16527 | datname
>     16527 | procpid
>     16527 | usesysid
>     16527 | usename
>     16527 | current_query
>     16530 | datid
>     16530 | datname
>     16530 | numbackends
>     16530 | xact_commit
>     16530 | xact_rollback
>     16530 | blks_read
>     16530 | blks_hit
> 
> 
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
> 
> http://archives.postgresql.org
> 
> 
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 




Re: BETWEEN Node & DROP COLUMN

From
"Christopher Kings-Lynne"
Date:
> > We could change pg_attribute to another name, and create a view called
> > pg_attribute that never returned isdropped columns to the client.  That
> > would allow clients to work cleanly, and the server to work cleanly.
>
> Another case where having an informational schema would eliminate the
> whole argument -- as the clients wouldn't need to touch the system
> tables.

Since postgres has so many features that standard SQL doesn't have (eg.
custom operators), how are they going to be shown in the information schema?

Chris





Re: BETWEEN Node & DROP COLUMN

From
Tom Lane
Date:
"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:
> Thanks Bruce, but I think I've got it sorted now.  One weird thing is that
> although I added it as being false in pg_attribute.h, I get these tuples
> having attisdropped set to true by initdb.

It sounds to me like you've failed to make sure that the field is
initialized properly when a pg_attribute row is dynamically created.
Let's see... did you fix the static FormData_pg_attribute rows near
the top of heap.c?  Does TupleDescInitEntry() know about initializing
the field?  (I wonder why it doesn't memset() the whole row to zero
anyway...)

pg_attribute is very possibly the most ticklish system catalog
to add a column to.  I'd suggest looking through every single use of
some other pg_attribute column, perhaps attstattarget or attnotnull,
to make sure you're initializing attisdropped everywhere it should be.
        regards, tom lane




Re: BETWEEN Node & DROP COLUMN

From
"Christopher Kings-Lynne"
Date:
> The problem is that the new column is now part of pg_attribute so every
> catalog/pg_attribute.h DATA() line has to be updated.  Did you update
> them all with 'false' in the right slot?  Not sure what the chunks are.

Yep - I did that, I think the problem's more subtle.

Chris





Re: BETWEEN Node & DROP COLUMN

From
"Christopher Kings-Lynne"
Date:
> It sounds to me like you've failed to make sure that the field is
> initialized properly when a pg_attribute row is dynamically created.
> Let's see... did you fix the static FormData_pg_attribute rows near
> the top of heap.c?  Does TupleDescInitEntry() know about initializing
> the field?  (I wonder why it doesn't memset() the whole row to zero
> anyway...)

OK I'll look at them.

> pg_attribute is very possibly the most ticklish system catalog
> to add a column to.  I'd suggest looking through every single use of
> some other pg_attribute column, perhaps attstattarget or attnotnull,
> to make sure you're initializing attisdropped everywhere it should be.

OK, I'm on the case.

Chris





Re: BETWEEN Node & DROP COLUMN

From
Rod Taylor
Date:
On Thu, 2002-07-04 at 22:07, Christopher Kings-Lynne wrote:
> > > We could change pg_attribute to another name, and create a view called
> > > pg_attribute that never returned isdropped columns to the client.  That
> > > would allow clients to work cleanly, and the server to work cleanly.
> >
> > Another case where having an informational schema would eliminate the
> > whole argument -- as the clients wouldn't need to touch the system
> > tables.
> 
> Since postgres has so many features that standard SQL doesn't have (eg.
> custom operators), how are they going to be shown in the information schema?

I would assume we would add pg_TABLE or TABLES.pg_COLUMN as appropriate
and where it wouldn't disturbe normal usage.

If we always put pg columns at the end it shouldn't disturbe programs
which use vectors to pull information out of the DB with a target of *.





Re: BETWEEN Node & DROP COLUMN

From
"Christopher Kings-Lynne"
Date:
> > pg_attribute is very possibly the most ticklish system catalog
> > to add a column to.  I'd suggest looking through every single use of
> > some other pg_attribute column, perhaps attstattarget or attnotnull,
> > to make sure you're initializing attisdropped everywhere it should be.

Done.

Wow - I've almost finished it now, actually!  It's at the stage where
everything works as expected, all the initdb attributes are properly marked
not dropped, the drop column command works fine and psql works fine.  All
regression tests also pass.  '*' expansion works properly.

I have a lot of testing to go, however.  I will make up regression tests as
well.

Some questions:

1. I'm going to prevent people from dropping the last column in their table.
I think this is the safest option.  How do I check if there's any other non
dropped columns in a table?  Reference code anywhere?

2. What should I do about inheritance?  I'm going to implement it, but are
there issues?  It will basically drop the column with the same name in all
child tables.  Is that correct behaviour?

3. I am going to initially implement the patch to ignore the behaviour and
do no dependency checking.  I will assume that Rod's patch will handle that
without much trouble.

Thanks,

Chris





Re: BETWEEN Node & DROP COLUMN

From
Alvaro Herrera
Date:
Christopher Kings-Lynne dijo: 

I have a question:

> 2. What should I do about inheritance?  I'm going to implement it, but are
> there issues?  It will basically drop the column with the same name in all
> child tables.  Is that correct behaviour?

What happens if I drop an inherited column in a child table?  Maybe it
works, but what happens when I SELECT the column in the parent table?

-- 
Alvaro Herrera (<alvherre[a]atentus.com>)
"Granting software the freedom to evolve guarantees only different results,
not better ones." (Zygo Blaxell)





Re: BETWEEN Node & DROP COLUMN

From
"Christopher Kings-Lynne"
Date:
> > 2. What should I do about inheritance?  I'm going to implement 
> it, but are
> > there issues?  It will basically drop the column with the same 
> name in all
> > child tables.  Is that correct behaviour?
> 
> What happens if I drop an inherited column in a child table?  Maybe it
> works, but what happens when I SELECT the column in the parent table?

I don't know?  Tom?

Well, what happens if you rename a column in a child table?  Same problem?

Chris





Re: BETWEEN Node & DROP COLUMN

From
Alvaro Herrera
Date:
Christopher Kings-Lynne dijo: 

> > > 2. What should I do about inheritance?  I'm going to implement 
> > it, but are
> > > there issues?  It will basically drop the column with the same 
> > name in all
> > > child tables.  Is that correct behaviour?
> > 
> > What happens if I drop an inherited column in a child table?  Maybe it
> > works, but what happens when I SELECT the column in the parent table?
> 
> I don't know?  Tom?
> 
> Well, what happens if you rename a column in a child table?  Same problem?

It merrily renames the column in the child table (I tried it).  When
SELECTing the parent, bogus data appears.  Looks like a bug to me.
Maybe the ALTER TABLE ...  RENAME COLUMN code should check for inherited
columns before renaming them.

-- 
Alvaro Herrera (<alvherre[a]atentus.com>)
One man's impedance mismatch is another man's layer of abstraction.
(Lincoln Yeoh)






Re: BETWEEN Node & DROP COLUMN

From
"Christopher Kings-Lynne"
Date:
> > Well, what happens if you rename a column in a child table?  
> Same problem?
> 
> It merrily renames the column in the child table (I tried it).  When
> SELECTing the parent, bogus data appears.  Looks like a bug to me.
> Maybe the ALTER TABLE ...  RENAME COLUMN code should check for inherited
> columns before renaming them.

Hmmm...so how does one check if one is a child in an inheritance hierarchy?

Chris





Re: BETWEEN Node & DROP COLUMN

From
"Christopher Kings-Lynne"
Date:
> > It merrily renames the column in the child table (I tried it).  When
> > SELECTing the parent, bogus data appears.  Looks like a bug to me.
> > Maybe the ALTER TABLE ...  RENAME COLUMN code should check for inherited
> > columns before renaming them.
>
> Hmmm...so how does one check if one is a child in an inheritance
> hierarchy?

Actually, more specifically, how does one check that the column being
dropped or renamed appears in none of one's parent tables?

I notice there's no find_all_ancestors() function...

Chris





Re: BETWEEN Node & DROP COLUMN

From
Tom Lane
Date:
"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:
>> What happens if I drop an inherited column in a child table?  Maybe it
>> works, but what happens when I SELECT the column in the parent table?

> Well, what happens if you rename a column in a child table?  Same problem?

Ideally we should disallow both of those, as well as cases like
changing the column type.

It might be that we can use the pg_depend stuff to enforce this (by
setting up dependency links from child to parent).  However that would
introduce a ton of overhead in a regular DROP TABLE, and you'd still
need specialized code to prevent the RENAME case (pg_depend wouldn't
care about that).  I'm thinking that it's worth adding an attisinherited
column to pg_attribute to make these rules easy to enforce.
        regards, tom lane




Re: BETWEEN Node & DROP COLUMN

From
Tom Lane
Date:
"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:
> 1. I'm going to prevent people from dropping the last column in their table.
> I think this is the safest option.  How do I check if there's any other non
> dropped columns in a table?  Reference code anywhere?

You look through the Relation's tupledesc and make sure there's at least
one other non-dropped column.

> 2. What should I do about inheritance?  I'm going to implement it, but are
> there issues?  It will basically drop the column with the same name in all
> child tables.  Is that correct behaviour?

Yes, if the 'inh' flag is set.

If 'inh' is not set, then the right thing would be to drop the parent's
column and mark all the *first level* children's columns as
not-inherited.  How painful that would be depends on what representation
we choose for marking inherited columns, if any.

> 3. I am going to initially implement the patch to ignore the behaviour and
> do no dependency checking.  I will assume that Rod's patch will handle that
> without much trouble.

Yeah, Rod was looking ahead to DROP COLUMN.  I'm still working on his
patch (mostly the pg_constraint side) but should have it soon.
        regards, tom lane




Re: BETWEEN Node & DROP COLUMN

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> 
> "Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:
> > OK, I've been looking at Hiroshi's implementation.  It's basically
> > semantically equivalent to mine from what I can see so far.  The only
> > difference really is in how the dropped columns are marked.
> 
> True enough, but that's not a trivial difference.  The problem with
> Hiroshi's implementation is that there's no longer a close tie between
> pg_attribute.attnum and physical positions of datums in tuples.  I think
> that that's going to affect a lot of low-level code, and that he hasn't
> found all of it.

Please don't propagate an unfair valuation without any verification.

regards,
Hiroshi Inoue