Thread: psql and Control-C

psql and Control-C

From
Peter Eisentraut
Date:
Some people have indicated that they don't like how psql currently handles
Control-C if no query is in progress. I consider the behaviour of the
shells desirable but, quite frankly, I don't know how to do it.

For some reason a readline'd session always wants me to press one more key
after Control-C before getting back to a clean prompt. A much bigger
problem is that if I don't use/have readline then I don't see a way to
preempt the fgets() call.

So unless someone has a hint or wants to look at it, I could offer
ignoring the signal altogether in interactive mode, and perhaps make it
stop scripts in the other case. (Leaving the query cancelling as is, of
course.)

Actually, shouldn't a Ctrl-C in a script cancel the query *and* stop the
script at all times?

-- 
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden



Re: [HACKERS] psql and Control-C

From
Bruce Momjian
Date:
> Some people have indicated that they don't like how psql currently handles
> Control-C if no query is in progress. I consider the behaviour of the
> shells desirable but, quite frankly, I don't know how to do it.
>
> For some reason a readline'd session always wants me to press one more key
> after Control-C before getting back to a clean prompt. A much bigger
> problem is that if I don't use/have readline then I don't see a way to
> preempt the fgets() call.
>
> So unless someone has a hint or wants to look at it, I could offer
> ignoring the signal altogether in interactive mode, and perhaps make it
> stop scripts in the other case. (Leaving the query cancelling as is, of
> course.)
>
> Actually, shouldn't a Ctrl-C in a script cancel the query *and* stop the
> script at all times?

Seems we can just ignore ^C if a query is not being run.  Is that OK
with everyone?  Looks easy to do.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: [HACKERS] psql and Control-C

From
Chris Bitmead
Date:
Peter Eisentraut wrote:
> 
> Some people have indicated that they don't like how psql currently handles
> Control-C if no query is in progress. I consider the behaviour of the
> shells desirable but, quite frankly, I don't know how to do it.

The typical way to do this sort of thing is to longjmp back to the main
loop. And I think if you look at sig.c in bash, this is probably what
they are doing.

> Actually, shouldn't a Ctrl-C in a script cancel the query *and* stop the
> script at all times?

Yes.


Re: [HACKERS] psql and Control-C

From
Chris Bitmead
Date:
> Seems we can just ignore ^C if a query is not being run.  Is that OK
> with everyone.  Looks easy to do.

It would be a trap for new users (some old ones too) who may not know
how to escape. longjmp should be easy too, if it works.


Re: [HACKERS] psql and Control-C

From
Alfred Perlstein
Date:
* Chris Bitmead <chrisb@nimrod.itg.telstra.com.au> [000217 16:20] wrote:
> Peter Eisentraut wrote:
> > 
> > Some people have indicated that they don't like how psql currently handles
> > Control-C if no query is in progress. I consider the behaviour of the
> > shells desirable but, quite frankly, I don't know how to do it.
> 
> The typical way to do this sort of thing is to longjmp back to the main
> loop. And I think if you look at sig.c in bash, this is probably what
> they are doing.
> 
> > Actually, shouldn't a Ctrl-C in a script cancel the query *and* stop the
> > script at all times?
> 
> Yes.

Whoa whoa... It's a bit more complicated than you think: there's a lot
of state that gets put into libpq. I guess the simplest way would be
to do so and also cancel the transaction, but a simple longjmp won't
work reliably, and you'd also have to take very careful steps to make
sure you handle everything _just right_ from a signal context.

I'd rather have the inconvenience of psql exiting than a not entirely
thought out mechanism for doing this properly potentially having psql
run amok on my database. :)

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]


Re: [HACKERS] psql and Control-C

From
Bruce Momjian
Date:
> 
> > Seems we can just ignore ^C if a query is not being run.  Is that OK
> > with everyone.  Looks easy to do.
> 
> It would be a trap for new users (some old ones too) who may not know
> how to escape. longjmp should be easy too, if it works.

If they don't know ^D exits, they really are going to have trouble with
Unix.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: [HACKERS] psql and Control-C

From
Peter Eisentraut
Date:
On 2000-02-18, Chris Bitmead mentioned:

> The typical way to do this sort of thing is to longjmp back to the main
> loop. And I think if you look at sig.c in bash, this is probably what
> they are doing.

Don't wanna look at that GPL'd code ... ;)

-- 
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden



Re: [HACKERS] psql and Control-C

From
Chris Bitmead
Date:
Bruce Momjian wrote:
> 
> >
> > > Seems we can just ignore ^C if a query is not being run.  Is that OK
> > > with everyone.  Looks easy to do.
> >
> > It would be a trap for new users (some old ones too) who may not know
> > how to escape. longjmp should be easy too, if it works.
> 
> If they don't know ^D exits, they really are going to have trouble with
> Unix.

I mean escape from a half-typed-in query, not escape from psql
altogether.


Re: [HACKERS] psql and Control-C

From
Chris Bitmead
Date:
Peter Eisentraut wrote:
> 
> On 2000-02-18, Chris Bitmead mentioned:
> 
> > The typical way to do this sort of thing is to longjmp back to the main
> > loop. And I think if you look at sig.c in bash, this is probably what
> > they are doing.
> 
> Don't wanna look at that GPL'd code ... ;)

If you don't know how to interact with readline, I think you're gonna
have to look at some GPL code ;)


Re: [HACKERS] psql and Control-C

From
Chris Bitmead
Date:
Alfred Perlstein wrote:

> Whoa whoa... It's a bit more complicated than you think, there's a lot
> of state that gets put into libpq,

I don't think this has anything to do with libpq. This has got to do
with psql's reading of commands _before_ they get shoved into libpq.
As such it shouldn't be that dangerous.

> i guess the simplest way would be
> to do so and also cancel the transaction, but a simple longjump won't
> work reliably and you'd also have to take very careful steps to make
> sure you handle everything _just right_ from a signal context.
> 
> I'd rather have the inconvience of psql exiting than a not entirely
> thought out mechanism for doing this properly potentially having psql
> run amok on my database. :)
> 
> --
> -Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
> 


Re: [HACKERS] psql and Control-C

From
Alfred Perlstein
Date:
* Chris Bitmead <chrisb@nimrod.itg.telstra.com.au> [000217 17:56] wrote:
> Peter Eisentraut wrote:
> > 
> > On 2000-02-18, Chris Bitmead mentioned:
> > 
> > > The typical way to do this sort of thing is to longjmp back to the main
> > > loop. And I think if you look at sig.c in bash, this is probably what
> > > they are doing.
> > 
> > Don't wanna look at that GPL'd code ... ;)
> 
> If you don't know how to interact with readline, I think you're gonna
> have to look at some GPL code ;)

Actually, FreeBSD(*) has 'libedit' which is pretty nice, it could
be made to work under other systems and has a nice unencumbered
license. :)

-Alfred

(*) Copyright (c) 1992, 1993
    The Regents of the University of California.  All rights reserved.
    This code is derived from software contributed to Berkeley by
    Christos Zoulas of Cornell University.


Re: [HACKERS] psql and Control-C

From
Tom Lane
Date:
Chris Bitmead <chrisb@nimrod.itg.telstra.com.au> writes:
> Alfred Perlstein wrote:
>> Whoa whoa... It's a bit more complicated than you think, there's a lot
>> of state that gets put into libpq,

> I don't think this has anything to do with libpq. This has got to do
> with psql's reading of commands _before_ they get shoved into
> libpq. As such it shouldn't be that dangerous.

Chris is right that this is not a libpq issue.  psql would be set up
so that the signal-catching routine either issues a cancel request
(if a query is in progress) or attempts a longjmp (if not).  If
properly implemented, there is zero chance of screwing up libpq.
However, there is some chance of screwing up libreadline --- I don't
know enough about its innards to know if it can survive losing
control at a random point.  If we can confine the region where longjmp
will be attempted to just the point where the program is blocked
waiting for user input, it'd probably be pretty safe.

Something I've noticed that might or might not be related to this
issue is that if psql exits due to backend crash, it fails to save the
last few lines of command history into the history file.  Not closing
down libreadline, maybe?
        regards, tom lane


Re: [HACKERS] psql and Control-C

From
Chris Bitmead
Date:
Alfred Perlstein wrote:
>
> Actually, FreeBSD(*) has 'libedit' which is pretty nice, it could
> be made to work under other systems and has a nice unencumbered
> license. :)

As an option - fine, but most things these days use readline, and
I would want my ~/.inputrc to work with all apps the same way.


Re: [HACKERS] psql and Control-C

From
Chris Bitmead
Date:
Tom Lane wrote:

> Something I've noticed that might or might not be related to this
> issue is that if psql exits due to backend crash, it fails to save the
> last few lines of command history into the history file.  Not closing
> down libreadline, maybe?

This is actually the gnu history library. Currently psql issues a save
command when you exit or when you enter \s. I'm not sure why gnu
history appears to require you to write the whole history at once
rather than appending to a file but anyway...

A better way of doing it might be to have psql's finishInput()
registered with atexit(), just to make sure it's always called, and to
have handlers in place for *every* fatal signal that call exit(),
so finishing up is done neatly.

It might even be worthwhile to write the history after every command.

BTW, is it necessary to print "\q" when you hit ctrl-d ? Seems just a
little tacky to me.


Re: [HACKERS] psql and Control-C

From
Peter Eisentraut
Date:
On Fri, 18 Feb 2000, Chris Bitmead wrote:

> BTW, is it necessary to print "\q" when you hit ctrl-d ? Seems just a
> little tacky to me.

That's similar to what bash and tcsh do. If you don't like it, I'm not
married to it.

-- 
Peter Eisentraut                  Sernanders vaeg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden



Re: [HACKERS] psql and Control-C

From
Peter Eisentraut
Date:
On Fri, 18 Feb 2000, Tom Lane wrote:

> However, there is some chance of screwing up libreadline --- I don't
> know enough about its innards to know if it can survive losing
> control at a random point.  If we can confine the region where longjmp
> will be attempted to just the point where the program is blocked
> waiting for user input, it'd probably be pretty safe.

Readline has an official way to preempt it, namely setting rl_done to
non-zero. I can take a look at how it copes with a longjmp from a signal
handler, but I wouldn't set my hopes too high.

> Something I've noticed that might or might not be related to this
> issue is that if psql exits due to backend crash, it fails to save the
> last few lines of command history into the history file.  Not closing
> down libreadline, maybe?

As someone else pointed out, I might as well move write_history() into an
atexit hook.

-- 
Peter Eisentraut                  Sernanders vaeg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden



Re: [HACKERS] psql and Control-C

From
Peter Eisentraut
Date:
On Thu, 17 Feb 2000, Alfred Perlstein wrote:

> Actually, FreeBSD(*) has 'libedit' which is pretty nice, it could
> be made to work under other systems and has a nice unencumbered
> license. :)

Someone else mentioned this to me when I started on this. However, there's
not even a common version among the *BSD's, let alone ports to other
platforms, so I don't see this happening anytime soon.

-- 
Peter Eisentraut                  Sernanders vaeg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden



Re: [HACKERS] psql and Control-C

From
Peter Eisentraut
Date:
On Fri, 18 Feb 2000, Chris Bitmead wrote:

> If you don't know how to interact with readline, I think you're gonna
> have to look at some GPL code ;)

There's a difference between copying ideas and code from bash's sig.c (not
good) and snooping around in code to see how to interface with it (why
not).

-- 
Peter Eisentraut                  Sernanders vaeg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden



Re: [HACKERS] psql and Control-C

From
Peter Eisentraut
Date:
On Fri, 18 Feb 2000, Chris Bitmead wrote:

> I mean escape from a half-typed in query, not escape from psql
> altogether.

If you don't read the documentation before running a program, I can't help
you. Also, the welcome message points out both \? and \q.

-- 
Peter Eisentraut                  Sernanders vaeg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden



Re: [HACKERS] psql and Control-C

From
Tom Lane
Date:
Peter Eisentraut <e99re41@DoCS.UU.SE> writes:
> On Fri, 18 Feb 2000, Tom Lane wrote:
>> However, there is some chance of screwing up libreadline --- I don't
>> know enough about its innards to know if it can survive losing
>> control at a random point.  If we can confine the region where longjmp
>> will be attempted to just the point where the program is blocked
>> waiting for user input, it'd probably be pretty safe.

> Readline has an official way to preempt is, namely setting rl_done to
> non-zero. I can take a look how it copes with a longjmp from a signal
> handler, but I wouldn't set my hopes too high.

Oh?  Maybe we don't *need* a longjmp: maybe the signal handler just
needs to do either send-a-cancel or set-rl_done depending on the
current state of a flag that's set by the main line code.
        regards, tom lane


Re: [HACKERS] psql and Control-C

From
Peter Eisentraut
Date:
On 2000-02-18, Tom Lane mentioned:

> > Readline has an official way to preempt is, namely setting rl_done to
> > non-zero. I can take a look how it copes with a longjmp from a signal
> > handler, but I wouldn't set my hopes too high.
> 
> Oh?  Maybe we don't *need* a longjmp: maybe the signal handler just
> needs to do either send-a-cancel or set-rl_done depending on the
> current state of a flag that's set by the main line code.

I tried that but it doesn't work. On further thought I believe that the
purpose of rl_done is for readline extensions, so that, for example, a
semicolon handler can scan the current line and then immediately return as
if you had pressed enter. When idle, readline hangs on read(), so setting
some variable isn't going to interrupt that.

The longjmp seems to work but I need to test it more. I'm concerned how it
will work across platforms, esp. Windows (being a POSIX thing). Should
there be a configure test or can I assume it on every non-WIN32 system?

-- 
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden




Re: [HACKERS] psql and Control-C

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> The longjmp seems to work but I need to test it more. I'm concerned how it
> will work across platforms, esp. Windows (being a POSIX thing). Should
> there be a configure test or can I assume it on every non-WIN32 system?

longjmp predates POSIX by an eon or two.  I doubt you need to worry about
it on Unix platforms.  (Since we utterly rely on it in the backend,
Postgres wouldn't ever work on a platform without it anyway.)

Less sure about the situation on Windows or Mac, but configure isn't
going to help for those anyway.
        regards, tom lane


Re: [HACKERS] psql and Control-C

From
Chris Bitmead
Date:
Peter Eisentraut wrote:
> 
> On Fri, 18 Feb 2000, Chris Bitmead wrote:
> 
> > I mean escape from a half-typed in query, not escape from psql
> > altogether.
> 
> If you don't read the documentation before running a program, I can't help
> you. Also, the welcome message points out both \? and \q.

I think it takes quite a while to realise that you can intersperse a
backslash command anywhere, like:

select * from
\q

which is a bit different from the shell, where

while :
exit

will not have the desired effect.


ALTER TABLE DROP COLUMN

From
Bruce Momjian
Date:
Marc called me today to discuss ALTER TABLE DROP COLUMN options.

Our new idea is to do the ALTER TABLE DROP COLUMN in place in the
existing table, rather than make a new one and try to preserve all the
table attributes.

You can exclusively lock the table, then do a heap_getnext() scan over
the entire table, remove the dropped column, do a heap_insert(), then a
heap_delete() on the current tuple, making sure to skip over the tuples
inserted by the current transaction.  When completed, remove the column
from pg_attribute, mark the transaction as committed (if desired), and
run vacuum over the table to remove the deleted rows.
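The steps above, as C-flavored pseudocode. heap_getnext(), heap_insert(), and heap_delete() are the backend routines Bruce names; everything else (remove_attribute, the locking and skip checks) is illustrative shorthand, not working backend code:

```c
/* Pseudocode sketch -- illustrative, not compilable backend code. */
drop_column_in_place(rel, dropped_attnum)
{
    lock rel exclusively;

    for each tuple returned by heap_getnext(rel)
    {
        if (tuple was inserted by the current transaction)
            continue;                       /* skip rows we just wrote */

        newtup = remove_attribute(tuple, dropped_attnum);
        heap_insert(rel, newtup);           /* write the narrowed row */
        heap_delete(rel, tuple);            /* delete the old version */
    }

    remove the column's row from pg_attribute;
    commit;                     /* schema change and data commit together */
    /* a later VACUUM reclaims the deleted row versions */
}
```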

Seems this would be a very clean implementation for 7.1.  It also would
be roll-backable in cases where the operation failed half-way during the
process.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: [HACKERS] ALTER TABLE DROP COLUMN

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> You can exclusively lock the table, then do a heap_getnext() scan over
> the entire table, remove the dropped column, do a heap_insert(), then a
> heap_delete() on the current tuple, making sure to skip over the tuples
> inserted by the current transaction.  When completed, remove the column
> from pg_attribute, mark the transaction as committed (if desired), and
> run vacuum over the table to remove the deleted rows.

Hmm, that would work --- the new tuples commit at the same instant that
the schema updates commit, so it should be correct.  You have the 2x
disk usage problem, but there's no way around that without losing
rollback ability.

A potentially tricky bit will be persuading the tuple-reading and tuple-
writing subroutines to pay attention to different versions of the tuple
structure for the same table.  I haven't looked to see if this will be
difficult or not.  If you can pass the TupleDesc explicitly then it
shouldn't be a problem.

I'd suggest that the cleanup vacuum *not* be an automatic part of
the operation; just recommend that people do it ASAP after dropping
a column.  Consider needing to drop several columns...
        regards, tom lane


Re: [HACKERS] ALTER TABLE DROP COLUMN

From
Hannu Krosing
Date:
Tom Lane wrote:
> 
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > You can exclusively lock the table, then do a heap_getnext() scan over
> > the entire table, remove the dropped column, do a heap_insert(), then a
> > heap_delete() on the current tuple, making sure to skip over the tuples
> > inserted by the current transaction.  When completed, remove the column
> > from pg_attribute, mark the transaction as committed (if desired), and
> > run vacuum over the table to remove the deleted rows.
> 
> Hmm, that would work --- the new tuples commit at the same instant that
> the schema updates commit, so it should be correct.  You have the 2x
> disk usage problem, but there's no way around that without losing
> rollback ability.
> 
> A potentially tricky bit will be persuading the tuple-reading and tuple-
> writing subroutines to pay attention to different versions of the tuple
> structure for the same table.  I haven't looked to see if this will be
> difficult or not.  If you can pass the TupleDesc explicitly then it
> shouldn't be a problem.
> 
> I'd suggest that the cleanup vacuum *not* be an automatic part of
> the operation; just recommend that people do it ASAP after dropping
> a column.  Consider needing to drop several columns...

Does SQL92 syntax allow dropping several columns, i.e.

ALTER TABLE mytable DROP COLUMN col1,col5,col6;

If it does, it would be very desirable to implement it to avoid the need 
for vacuum between each DROP in order to have _only_ 2X disk usage.

-----------
Hannu


Re: [HACKERS] ALTER TABLE DROP COLUMN

From
Don Baccus
Date:
At 08:06 PM 2/27/00 +0200, Hannu Krosing wrote:

>Does SQL92 syntax allow dropping several columns, i.e.
>
>ALTER TABLE mytable DROP COLUMN col1,col5,col6;

My reading of the syntax says no, it is not allowed.

>If it does, it would be very desirable to implement it to avoid the need 
>for vacuum between each DROP in order to have _only_ 2X disk usage.

However, implementing useful extensions to the standard in an
upward-compatible way doesn't bother me.

I'm not fond of language implementations that are full of gratuitous
extensions, but when extensions address real shortcomings in a standard
or intersect with a particular implementation in a useful way, then
it makes sense to add them.  In this case, you're asking for an
extension that's useful because Postgres doesn't reclaim storage when
a tuple's deleted, but only when the table's vacuumed.  Seems fair
enough.

Whether or not it would be hard to implement is another matter...



- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Service
  and other goodies at http://donb.photo.net.


Re: ALTER TABLE DROP COLUMN

From
Peter Eisentraut
Date:
Bruce Momjian writes:

> You can exclusively lock the table, then do a heap_getnext() scan over
> the entire table, remove the dropped column, do a heap_insert(), then a
> heap_delete() on the current tuple,

Wow, that almost seems too easy to be true. I never thought that having
tuples of different structures in the table at the same time would be
possible. If so then I don't see a reason why this would be too hard to
do.


-- 
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden




Re: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
Bruce Momjian
Date:
> Bruce Momjian writes:
> 
> > You can exclusively lock the table, then do a heap_getnext() scan over
> > the entire table, remove the dropped column, do a heap_insert(), then a
> > heap_delete() on the current tuple,
> 
> Wow, that almost seems to easy to be true. I never thought that having
> tuples of different structures in the table at the same time would be
> possible. If so then I don't see a reason why this would be too hard to
> do.

If the transaction is not committed, I don't think anything actually
reads the tuple columns, so you are safe.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


RE: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: owner-pgsql-hackers@postgreSQL.org
> [mailto:owner-pgsql-hackers@postgreSQL.org]On Behalf Of Bruce Momjian
>
> [Charset ISO-8859-1 unsupported, filtering to ASCII...]
> > Bruce Momjian writes:
> >
> > > You can exclusively lock the table, then do a heap_getnext() scan over
> > > the entire table, remove the dropped column, do a
> heap_insert(), then a
> > > heap_delete() on the current tuple,
> >
> > Wow, that almost seems to easy to be true. I never thought that having
> > tuples of different structures in the table at the same time would be
> > possible. If so then I don't see a reason why this would be too hard to
> > do.
>
> If the transaction is not committed, I don't think anything actually
> reads the tuple columns, so you are safe.
>

Hmm, tuples of multiple versions in a table?
This is neither clean nor easy for me.
There's no existing code that takes that case into account, AFAIK.

Seems no one but me objects to it. I'm tired of this issue and it's
painful for me to continue the discussion further in my poor English.
I may be able to provide another implementation on trial; that may
be easier than only objecting to your proposal.
Is that OK?

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp



Re: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
Tom Lane
Date:
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> Hmm,tuples of multiple version in a table ?
> This is neither clean nor easy for me.

I'm worried about it too.  I think it could maybe be made to work,
but it seems fragile.

> I may be able to provide another implementation on trial and it
> may be easier than only objecting to your proposal.

If you have a better idea, let's hear it!
        regards, tom lane


Re: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
Bruce Momjian
Date:
> > > Wow, that almost seems to easy to be true. I never thought that having
> > > tuples of different structures in the table at the same time would be
> > > possible. If so then I don't see a reason why this would be too hard to
> > > do.
> >
> > If the transaction is not committed, I don't think anything actually
> > reads the tuple columns, so you are safe.
> >
> 
> Hmm,tuples of multiple version in a table ?
> This is neither clean nor easy for me.
> There's no such stuff which takes the case into account,AFAIK.
> 
> Seems no one but me object to it. I'm tired of this issue and it's
> painful for me to continue discussion further in my poor English.
> I may be able to provide another implementation on trial and it
> may be easier than only objecting to your proposal.
> Is it OK ?

Sure, whatever you want.  No one is going to start coding anything for a
while.  Seemed like a clean solution with no rename() problems.


--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


RE: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> 
> "Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> > Hmm,tuples of multiple version in a table ?
> > This is neither clean nor easy for me.
> 
> I'm worried about it too.  I think it could maybe be made to work,
> but it seems fragile.
> 
> > I may be able to provide another implementation on trial and it
> > may be easier than only objecting to your proposal.
> 
> If you have a better idea, let's hear it!
>

I don't want a final implementation this time.
What I want is to provide a quick hack for both others and me
to judge whether this direction is good or not.

My idea is essentially an invisible column implementation.
DROP COLUMN would change the target pg_attribute tuple
as follows:

    attnum   ->  an offset - attnum
    atttypid ->  0

We would be able to see what needs changing by tracking the errors/
crashes caused by this change.

I would also change attname to '*already dropped %d', for
example, to avoid duplicate attnames.
Regards.

Hiroshi Inoue
Inoue@tpf.co.jp 


RE: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
The Hermit Hacker
Date:
On Mon, 28 Feb 2000, Hiroshi Inoue wrote:

> > -----Original Message-----
> > From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> > 
> > "Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> > > Hmm,tuples of multiple version in a table ?
> > > This is neither clean nor easy for me.
> > 
> > I'm worried about it too.  I think it could maybe be made to work,
> > but it seems fragile.
> > 
> > > I may be able to provide another implementation on trial and it
> > > may be easier than only objecting to your proposal.
> > 
> > If you have a better idea, let's hear it!
> >
> 
> I don't want a final implementation this time.
> What I want is to provide a quick hack for both others and me
> to judge whether this direction is good or not.
> 
> My idea is essentially an invisible column implementation.
> DROP COLUMN would change the target pg_attribute tuple
> as follows..
>     
>     attnum  ->  an offset - attnum;
>     atttypid ->  0
> 
> We would be able to see where to change by tracking error/
> crashes caused by this change.
> 
> I would also change attname to '*already dropped %d' for
> examle to avoid duplicate attname. 

Okay, just curious here, but ... what you are proposing *sounds* to me
like half-way to what started this thread. (*Please* correct me if I'm
wrong) ...

Essentially, in your proposal, when you drop a column, all subsequent
tuples inserted/updated would have ... that one column missing?  So,
instead of doing a massive sweep through the table and removing that
column, only do it when an insert/update happens? 

Basically, eliminate the requirement to re-write every tuple, and only
re-write those that have activity?




Re: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
Bruce Momjian
Date:
> > I would also change attname to '*already dropped %d' for
> > examle to avoid duplicate attname. 
> 
> Okay, just curious here, but ... what you are proposing *sounds* to me
> like half-way to what started this thread. (*Please* correct me if I'm
> wrong) ...
> 
> Essentially, in your proposal, when you drop a column, all subsequent
> tuples inserted/updated would have ... that one column missing?  So,
> instead of doing a massive sweep through the table and removing that
> column, only do it when an insert/update happens? 
> 
> Basically, eliminate the requirement to re-write every tuples, only those
> that have activity?

And I think the problem was that there was too much code to modify to
allow this.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


RE: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: The Hermit Hacker [mailto:scrappy@hub.org]
> 
> On Mon, 28 Feb 2000, Hiroshi Inoue wrote:
> 
> > > -----Original Message-----
> > > From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> > > 
> > > "Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> > > > Hmm,tuples of multiple version in a table ?
> > > > This is neither clean nor easy for me.
> > > 
> > > I'm worried about it too.  I think it could maybe be made to work,
> > > but it seems fragile.
> > > 
> > > > I may be able to provide another implementation on trial and it
> > > > may be easier than only objecting to your proposal.
> > > 
> > > If you have a better idea, let's hear it!
> > >
> >

[snip]
> 
> Okay, just curious here, but ... what you are proposing *sounds* to me
> like half-way to what started this thread. (*Please* correct me if I'm
> wrong) ...
>

My proposal is essentially the same as what I proposed once in this thread.
I don't think the DROP COLUMN feature is very important.
The DROP/ADD CONSTRAINT feature seems much more important.
Why do you want a heavy implementation like vacuum, with 2x disk
usage, for this feature?
My implementation won't touch the target table at all and would never
physically remove dropped columns. It would only make them invisible,
and NULL would be stored for them in newly inserted/updated tuples.

If you want a really clean table after a DROP COLUMN command, my
proposal is useless.
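Hiroshi's scheme can be sketched in miniature. The following toy model (all names are invented for illustration; this is not PostgreSQL source) marks the column as dropped in a catalog-like structure, leaves old heap tuples untouched, hides the column on read, and stores NULL in its slot for newly written tuples:

```python
# Toy sketch of the "invisible column" DROP COLUMN proposal.
# Assumption: a catalog lists columns in order; the heap stores
# full-width raw tuples that are never rewritten by DROP COLUMN.

class Table:
    def __init__(self, columns):
        self.columns = list(columns)   # catalog: column names, in order
        self.dropped = set()           # names marked "*already dropped"
        self.rows = []                 # heap: raw tuples, full width

    def insert(self, **values):
        # Dropped columns are never supplied; store NULL in their slot.
        self.rows.append(tuple(None if c in self.dropped else values[c]
                               for c in self.columns))

    def drop_column(self, name):
        # O(1): touch only the catalog, never the heap.
        self.dropped.add(name)

    def select_all(self):
        # Reads simply skip the invisible slots.
        visible = [i for i, c in enumerate(self.columns)
                   if c not in self.dropped]
        return [tuple(row[i] for i in visible) for row in self.rows]

t = Table(["id", "name", "age"])
t.insert(id=1, name="a", age=30)
t.drop_column("age")
t.insert(id=2, name="b")           # new tuple stores NULL in the age slot
print(t.select_all())              # -> [(1, 'a'), (2, 'b')]
```

Note that the old tuple still physically carries its `age` value; only visibility changes, which is exactly why no 2x-disk rewrite is needed.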

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp


RE: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
Don Baccus
Date:
At 11:40 PM 2/27/00 -0400, The Hermit Hacker wrote:

>Okay, just curious here, but ... what you are proposing *sounds* to me
>like half-way to what started this thread. (*Please* correct me if I'm
>wrong) ...
>
>Essentially, in your proposal, when you drop a column, all subsequent
>tuples inserted/updated would have ... that one column missing?  So,
>instead of doing a massive sweep through the table and removing that
>column, only do it when an insert/update happens? 
>
>Basically, eliminate the requirement to re-write every tuples, only those
>that have activity?

Yes, this was one of the ideas that cropped up in previous discussion.



- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
  Service and other goodies at http://donb.photo.net.


RE: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
Don Baccus
Date:
At 12:21 PM 2/28/00 +0900, Hiroshi Inoue wrote:

>My idea is essentially an invisible column implementation.
>DROP COLUMN would change the target pg_attribute tuple
>as follows..

I don't see such a solution as being mutually exclusive with
the other one on the table.

Remember ... Oracle provides both.  I suspect that they did so
because they were under customer pressure to provide a "real"
column drop and a "fast" (and non-2x tablesize!) solution.  So
they did both.  Also keep in mind that being able to drop a
column in Oracle is a year 1999 feature ... and both are provided.
More evidence of pressure from two points of view.

Of course, PG suffers because the "real" column drop is a 2x
space solution, so the "invisibility" approach may more frequently
be desired.

Still... as time goes on and PG gets adopted by more and more 
serious, large-scale users (which we all are working towards, 
right?) I suspect that each camp will want to be served.




- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
  Service and other goodies at http://donb.photo.net.


RE: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
> 
> > > I would also change attname to '*already dropped %d' for
> > > example to avoid duplicate attname. 
> > 
> > Okay, just curious here, but ... what you are proposing *sounds* to me
> > like half-way to what started this thread. (*Please* correct me if I'm
> > wrong) ...
> > 
> > Essentially, in your proposal, when you drop a column, all subsequent
> > tuples inserted/updated would have ... that one column missing?  So,
> > instead of doing a massive sweep through the table and removing that
> > column, only do it when an insert/update happens? 
> > 
> > Basically, eliminate the requirement to re-write every tuples, 
> only those
> > that have activity?
> 
> And I think the problem was that there was too much code to modify to
> allow this.
>

Seems my trial would be useless.
I would give up the trial.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp


Re: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
wieck@debis.com (Jan Wieck)
Date:
Hiroshi Inoue wrote:

> > > Wow, that almost seems to easy to be true. I never thought that having
> > > tuples of different structures in the table at the same time would be
> > > possible. If so then I don't see a reason why this would be too hard to
> > > do.
> >
> > If the transaction is not committed, I don't think anything actually
> > reads the tuple columns, so you are safe.
> >
>
> Hmm,tuples of multiple version in a table ?
> This is neither clean nor easy for me.
> There's no such stuff which takes the case into account,AFAIK.
>
> Seems no one but me object to it. I'm tired of this issue and it's
> painful for me to continue discussion further in my poor English.
> I may be able to provide another implementation on trial and it
> may be easier than only objecting to your proposal.
> Is it OK ?
   Consider me on your side.
   For some good reasons, I added a
       ReferentialIntegritySnapshotOverride
   mode, that causes any tuple to be visible when fetched by
   CTID. Actually, there will be at least a read lock on them,
   so locking will prevent damage. But I can think of other
   situations where this kind of "read whatever I want you to"
   could be needed and would fail then.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#========================================= wieck@debis.com (Jan Wieck) #




Re: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
Hannu Krosing
Date:
Don Baccus wrote:
> 
> At 12:21 PM 2/28/00 +0900, Hiroshi Inoue wrote:
> 
> >My idea is essentially an invisible column implementation.
> >DROP COLUMN would change the target pg_attribute tuple
> >as follows..
> 
> I don't see such a solution as being mutually exclusive with
> the other one on the table.

Very true, and we will need the hidden columns feature for a clean 
implementation of inheritance anyway.

> Remember ... Oracle provides both.  I suspect that they did so
> because they were under customer pressure to provide a "real"
> column drop and a "fast" (and non-2x tablesize!) solution.  So
> they did both.  Also keep in mind that being able to drop a
> column in Oracle is a year 1999 feature ... and both are provided.
> More evidence of pressure from two points of view.
> 
> Of course, PG suffers because the "real" column drop is a 2x
> space solution, so the "invisibility" approach may more frequently
> be desired.

"update t set id=id+1" is also a 2x space, likely even more if 
referential integrity is used (and checked at the end of transaction)

And my main use of DROP COLUMN will probably be during development, 
usually meaning small table sizes.

------------
Hannu


RE: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
The Hermit Hacker
Date:
On Mon, 28 Feb 2000, Hiroshi Inoue wrote:

> > -----Original Message-----
> > From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
> > 
> > > > I would also change attname to '*already dropped %d' for
> > > > example to avoid duplicate attname. 
> > > 
> > > Okay, just curious here, but ... what you are proposing *sounds* to me
> > > like half-way to what started this thread. (*Please* correct me if I'm
> > > wrong) ...
> > > 
> > > Essentially, in your proposal, when you drop a column, all subsequent
> > > tuples inserted/updated would have ... that one column missing?  So,
> > > instead of doing a massive sweep through the table and removing that
> > > column, only do it when an insert/update happens? 
> > > 
> > > Basically, eliminate the requirement to re-write every tuples, 
> > only those
> > > that have activity?
> > 
> > And I think the problem was that there was too much code to modify to
> > allow this.
> >
> 
> Seems my trial would be useless.
> I would give up the trial.

Hiroshi ...
Bruce's comment was just an observation ... if it can be done
cleanly, I would love to see a version that didn't involve 2x the disk
space ... I don't believe that a trial would be useless. I think that
Bruce's only concern/warning is that the amount of code modifications that
would have to be made in order to accomplish this *might* be larger than
the benefit resulting from doing it this way.
If you feel that this can be done more efficiently, *please*
proceed with the trial ... 
I'm curious about one thing ... several ppl have mentioned that
Oracle does it "both ways" ... does anyone know the syntax they use so
that someone can do it one way or another?

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 



Re: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
Don Baccus
Date:
At 11:45 AM 2/28/00 +0200, Hannu Krosing wrote:

>"update t set id=id+1" is also a 2x space,

And PG doesn't do it correctly anyway...

> likely even more if 
>referential integrity is used (and checked at the end of transaction)

The triggers are all queued so yes, take memory too.  Even better,
if "MATCH <unspecified>" or especially "MATCH PARTIAL" is used with
multi-column foreign keys containing nulls, it will be impressively slow!
We can call these the built-in coffee break feature when used on large
tables.

(it's inherently slow, not just slow because of the PG implementation)

>And my main use of DROP COLUMN will probably be during development, 
>usually meaning small table sizes.

Well, folks who use the web toolkit I've been porting for Oracle will
have a use for it, too, because the toolkit has been rapidly evolving
(ArsDigita has about 70 employees at the moment, most of them programmers
working on the Oracle-based version of the toolkit).  ArsDigita provides
upgrade .sql files for each version that consist in part of ADD/DROP
column statements so users can upgrade in place, a very useful thing.

It doesn't need to be fast in this context, just work.  You tell the world
your site will be down for an evening on such-and-such date, stop 
listening on port 80, and upgrade.



- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
  Service and other goodies at http://donb.photo.net.


Re: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
Tom Lane
Date:
Don Baccus <dhogaza@pacifier.com> writes:
>> "update t set id=id+1" is also a 2x space,

> And PG doesn't do it correctly anyway...

? News to me.  What's your definition of "correctly"?
        regards, tom lane


Re: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
Don Baccus
Date:
At 10:20 AM 2/28/00 -0500, Tom Lane wrote:
>Don Baccus <dhogaza@pacifier.com> writes:
>>> "update t set id=id+1" is also a 2x space,
>
>> And PG doesn't do it correctly anyway...
>
>? News to me.  What's your definition of "correctly"?

create table foo(i integer unique);

(insert values)

donb=# select * from foo;
 i 
---
 1
 2
 3
(3 rows)

donb=# update foo set i=i+1;
ERROR:  Cannot insert a duplicate key into unique index foo_pkey

Shouldn't fail ... the constraint should be applied after the
update, but the row-by-row update of the index causes it to fail.
At least I presume that this is an artifact of PG implementing the
unique constraint by creating a unique index.

Stephan Szabo pointed this out to me awhile ago when we were
discussing "alter table add constraint" (he was looking into
this when he worked on "alter table add foreign key").

Of course, sometimes PG gets it right.  I deleted stuff in foo,
then did:

donb=# insert into foo values(3);
INSERT 26907 1
donb=# insert into foo values(2);
INSERT 26908 1
donb=# insert into foo values(1);
INSERT 26909 1
donb=# update foo set i=i+1;
UPDATE 3
donb=# 
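The asymmetry Don observed (ascending update fails, descending succeeds) falls out of checking uniqueness one row at a time in heap order. A toy model of that behavior (my own sketch, not PostgreSQL internals) makes the mechanism visible:

```python
# Model of a unique index checked row-at-a-time during
# "update foo set i=i+1", versus checked once on the result set.

def update_inc(rows, check_each_row=True):
    index = set(rows)                  # one index entry per existing value
    out = []
    for v in rows:                     # visit tuples in heap order
        index.discard(v)               # this row's old entry is dead to us
        new = v + 1
        if check_each_row and new in index:
            # Row-at-a-time check: another not-yet-updated row still
            # holds this key, so the insert bombs immediately.
            raise ValueError("Cannot insert a duplicate key into unique index")
        index.add(new)
        out.append(new)
    return out

print(update_inc([3, 2, 1]))           # descending heap order: [4, 3, 2]
try:
    update_inc([1, 2, 3])              # ascending order hits the old "2"
except ValueError as e:
    print(e)

# With the check deferred to the end of the statement (set semantics,
# as SQL intends), the same update is legal: the final set {2, 3, 4}
# contains no duplicates.
print(update_inc([1, 2, 3], check_each_row=False))   # [2, 3, 4]
```

Whether Don's three rows collide thus depends only on the physical order in which the executor happens to visit them.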






- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
  Service and other goodies at http://donb.photo.net.


Re: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
Hannu Krosing
Date:
Don Baccus wrote:
> 
> At 10:20 AM 2/28/00 -0500, Tom Lane wrote:
> >Don Baccus <dhogaza@pacifier.com> writes:
> >>> "update t set id=id+1" is also a 2x space,
> >
> >> And PG doesn't do it correctly anyway...
> >
> >? News to me.  What's your definition of "correctly"?
> 
> create table foo(i integer unique);
> 
> (insert values)
> 
> donb=# select * from foo;
>  i
> ---
>  1
>  2
>  3
> (3 rows)
> 
> donb=# update foo set i=i+1;
> ERROR:  Cannot insert a duplicate key into unique index foo_pkey

I knew it used to misbehave that way, but at some point I got the 
impression that it was fixed ;(

IIRC, the same behaviour plagued the old foreign key implementation 
in contrib, which was why it was refused for a long time to be 
integrated.

I hope that at least the foreign keys don't do it anymore.

---------
Hannu


Re: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
wieck@debis.com (Jan Wieck)
Date:
Hannu Krosing wrote:

> Don Baccus wrote:
>
> > donb=# update foo set i=i+1;
> > ERROR:  Cannot insert a duplicate key into unique index foo_pkey
>
> IIRC, the same behaviour plagued the old foreign key implementation
> in contrib, which was why it was refused for a long time to be
> integrated.
>
> I hope that at least the foreign keys don't do it anymore.
   ALL the FK triggers are delayed until after the entire
   statement (what's wrong for ON DELETE RESTRICT - but that's
   another story), or until the entire transaction (in deferred
   mode).

   But the UNIQUE constraint is still built upon unique nbtree
   indices, thus failing on primary key where such a unique
   index is automatically created for.

   I'm far too less familiar with our implementation of nbtree
   to tell whether it would be possible at all to delay unique
   checking until statement end or XACT commit. At least I
   assume it would require some similar technique of deferred
   queue.
 


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#========================================= wieck@debis.com (Jan Wieck) #




Re: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
Don Baccus
Date:
At 02:04 AM 2/29/00 +0200, Hannu Krosing wrote:

>IIRC, the same behaviour plagued the old foreign key implementation 
>in contrib, which was why it was refused for a long time to be 
>integrated.
>
>I hope that at least the foreign keys don't do it anymore.

It shouldn't because they're implemented via triggers after all the
work is done.  In other words, the implementation might have bugs
but the bugs should be different :)



- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
  Service and other goodies at http://donb.photo.net.


Re: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
Don Baccus
Date:
At 01:43 AM 2/29/00 +0100, Jan Wieck wrote:

>    ALL  the  FK  triggers  are  delayed  until  after the entire
>    statement (what's wrong for ON DELETE RESTRICT -  but  that's
>    another  story), or until the entire transaction (in deferred
>    mode).

Kind of wrong - but just so folks understand, the semantics are right in
the sense that the right answer is given (pass or fail); you need
a stopwatch to know that we're not doing what SQL3 suggests
should be done (catch the foreign key errors before changes are made
and without incurring the cost of a rollback).

The current way we're doing it - identically to "NO ACTION" - is
fine for compatibility purposes, though later we'd like to implement
a smart ON DELETE RESTRICT, because the efficiency considerations
that led to its inclusion in SQL3 are reasonable ones.

>    I'm far too less familiar with our implementation  of  nbtree
>    to  tell  whether it would be possible at all to delay unique
>    checking until statement end  or  XACT  commit.  At  least  I
>    assume  it  would  require some similar technique of deferred
>    queue.

Presumably you'd queue up per-row triggers just like for FK constraints
and insert into the unique index at that point.

I have no idea how many other things this would break, if any.



- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
  Service and other goodies at http://donb.photo.net.


Re: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
sszabo@bigpanda.com
Date:
>    ALL  the  FK  triggers  are  delayed  until  after the entire
>    statement (what's wrong for ON DELETE RESTRICT -  but  that's
>    another  story), or until the entire transaction (in deferred
>    mode).
>
>    But the UNIQUE constraint is still built upon  unique  nbtree
>    indices,  thus  failing  on  primary  key where such a unique
>    index is automatically created for.
>
>    I'm far too less familiar with our implementation  of  nbtree
>    to  tell  whether it would be possible at all to delay unique
>    checking until statement end  or  XACT  commit.  At  least  I
>    assume  it  would  require some similar technique of deferred
>    queue.

We might want to look at what we're doing for all of the constraints,
because at some point we'll probably want to let you defer the other
constraints as well (I'm pretty sure this is technically legal in SQL92).
If we can think of a good way to handle all of the constraints together
that might be worth doing to prevent us from coding the same thing
multiple times.



Re: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
wieck@debis.com (Jan Wieck)
Date:
Don Baccus wrote:

> At 01:43 AM 2/29/00 +0100, Jan Wieck wrote:
>
> >    ALL  the  FK  triggers  are  delayed  until  after the entire
> >    statement (what's wrong for ON DELETE RESTRICT -  but  that's
> >    another  story), or until the entire transaction (in deferred
> >    mode).
>
> Kind of wrong, just so folks understand the semantics are right in
> the sense that the right answer is given (pass or fail) - you need
> a stopwatch to know ...
   Explanative version of "that other story". But not exactly
   correct IMHO. If following strictly SQL3 suggestions, an ON
   DELETE RESTRICT action cannot be deferrable at all. Even if
   the constraint itself is deferrable and is set explicitly to
   DEFERRED, the check should be done immediately at ROW level.
   That's the difference between "NO ACTION" and "RESTRICT".

   Actually, a RESTRICT violation can potentially bypass
   thousands of subsequent queries until COMMIT. Meaningless
   from the transactional PoV, but from the application
   programmers one (looking at the return code of a particular
   statement) it isn't!
 

> >    I'm far too less familiar with our implementation  of  nbtree
> >    to  tell  whether it would be possible at all to delay unique
> >    checking until statement end  or  XACT  commit.  At  least  I
> >    assume  it  would  require some similar technique of deferred
> >    queue.
>
> Presumably you'd queue up per-row triggers just like for FK constraints
> and insert into the unique index at that point.
>
> I have no idea how many other things this would break, if any.
   At least if deferring the index insert until XACT commit, any
   subsequent index scan wouldn't see inserted tuples, even if
   they MUST be visible.

   Maybe I'm less far away from knowledge than thought. Inside
   of a nbtree-index, any number of duplicates is accepted.
   It's the heap tuples visibility they point to, that triggers
   the dup message.

   So it's definitely some kind of "accept duplicates for now
   but check for final dup's on this key later".

   But that requires another index scan later. We can remember
   the relations and indices Oid (to get back the relation and
   index in question) plus the CTID of the added
   (inserted/updated tuple) to get back the key values
   (remembering the key itself could blow up memory). Then do an
   index scan under current (statement end/XACT commit)
   visibility to check if more than one HeapTupleSatisfies().

   It'll be expensive, compared to current UNIQUE implementation
   doing it on the fly during btree insert (doesn't it?). But
   the only way I see.
 


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#========================================= wieck@debis.com (Jan Wieck) #




Re: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
Don Baccus
Date:
At 03:24 AM 2/29/00 +0100, Jan Wieck wrote:

>    Explanative  version  of "that other story".  But not exactly
>    correct IMHO. If following strictly SQL3 suggestions,  an  ON
>    DELETE  RESTRICT  action cannot be deferrable at all. Even if
>    the constraint itself is deferrable and is set explicitly  to
>    DEFERRED,  the check should be done immediately at ROW level.
>    That's the difference between "NO ACTION" and "RESTRICT".
>
>    Actually,  a  RESTRICT  violation  can   potentially   bypass
>    thousands  of  subsequent  queries  until COMMIT. Meaningless
>    from  the  transactional  PoV,  but  from   the   application
>    programmers  one  (looking at the return code of a particular
>    statement) it isn't!

No, strictly speaking it isn't correct.  But without a stopwatch,
it will be hard to tell.

Actually, though, since exceptions are only supposed to reject
the given SQL-statement and not trigger a PG-style auto-rollback of
the transaction, a subsequent "commit" should commit that subsequent
work (unless they in turn trigger constraint errors due to dependencies
on the first failed constraint).

So you don't really get to skip all those subsequent statements unless
you're looking for the exception, catch it, and do an explicit rollback.

None of that is in place in PG anyway at the moment...

I'm assuming that the exception raised for an FK violation is the
same as an exception raised for numeric overflow, etc - I think 
you missed that earlier discussion.

The fact that PG's auto-rollback is wrong was news to me, though
obvious in hindsight, and I've not gone back to study RI semantics
in light of this new information.

So I may be wrong here.

We could always take out "RESTRICT" and claim SQL92 rather than SQL3
referential integrity :) :)

Given that Oracle only implements "MATCH <unspecified>" (as of 8.1.5,
anyway), we're not doing too bad!

>
>> >    I'm far too less familiar with our implementation  of  nbtree
>> >    to  tell  whether it would be possible at all to delay unique
>> >    checking until statement end  or  XACT  commit.  At  least  I
>> >    assume  it  would  require some similar technique of deferred
>> >    queue.
>>
>> Presumably you'd queue up per-row triggers just like for FK constraints
>> and insert into the unique index at that point.
>>
>> I have no idea how many other things this would break, if any.
>
>    At least if deferring the index insert until XACT commit, any
>    subsequent  index  scan wouldn't see inserted tuples, even if
>    they MUST be visible.

Ugh, of course :(

>    Maybe I'm less far away from knowledge than thought.   Inside
>    of  a  nbtree-index,  any  number  of duplicates is accepted.
>    It's the heap tuples visibility they point to, that  triggers
>    the dup message.
>
>    So  it's  definitely  some kind of "accept duplicates for now
>    but check for final dup's on this key later".
>
>    But that requires another index scan later. We  can  remember
>    the  relations  and indices Oid (to get back the relation and
>    index   in   question)   plus   the   CTID   of   the   added
>    (inserted/updated   tuple)   to   get  back  the  key  values
>    (remembering the key itself could blow up memory). Then do an
>    index   scan   under   current  (statement  end/XACT  commit)
>    visibility to check if more than one HeapTupleSatisfies().
>
>    It'll be expensive, compared to current UNIQUE implementation
>    doing  it  on  the fly during btree insert (doesn't it?). But
>    the only way I see.

The more I learn about SQL92 the more I understand why RDBMS systems
have the reputation for being piggy.  But, the standard semantics
of UPDATE on a column with a UNIQUE constraint are certainly consistent
with the paradigm that queries operate on sets of tuples, not sequences
of tuples.



- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
  Service and other goodies at http://donb.photo.net.


RE: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: The Hermit Hacker [mailto:scrappy@hub.org]
>

[snip]

> Hiroshi ...
>
>     Bruce's comment was just an observation ... if it can be done
> cleanly, I would love to see a version that didn't involve 2x the disk
> space ... I don't believe that a trial would be useless, I think that
> Bruce's only concern/warning is that the amount of code modifications that
> > would have to be made in order to accomplish this *might* be larger than
> the benefit resulting in doing it this way.
>
>     If you feel that this can be done more efficiently, *please*
> proceed with the trial ...
>

OK,I may be able to provide a trial patch in a week or so if I'm lucky.
How to commit the patch ?
With #ifdef ?

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp




Re: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
Tom Lane
Date:
wieck@debis.com (Jan Wieck) writes:
>     So  it's  definitely  some kind of "accept duplicates for now
>     but check for final dup's on this key later".

>     But that requires another index scan later. We  can  remember
>     the  relations  and indices Oid (to get back the relation and
>     index   in   question)   plus   the   CTID   of   the   added
>     (inserted/updated   tuple)   to   get  back  the  key  values
>     (remembering the key itself could blow up memory). Then do an
>     index   scan   under   current  (statement  end/XACT  commit)
>     visibility to check if more than one HeapTupleSatisfies().

>     It'll be expensive, compared to current UNIQUE implementation
>     doing  it  on  the fly during btree insert (doesn't it?). But
>     the only way I see.

How about:

1. During INSERT into unique index, notice whether any other index
entries have same key.  If so, add that key value to a queue of
possibly-duplicate keys to check later.

2. At commit, or whenever consistency should be checked, scan the
queue.  For each entry, use the index to look up all the matching
tuples, and check that only one will be valid if the transaction
commits.

This avoids a full index scan in the normal case, although it could
be pretty slow in the update-every-tuple scenario...
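Tom's two-step scheme can be sketched roughly as follows (class and method names are my own invention, not PostgreSQL internals): inserts that collide merely note the key in a "suspect" queue, and only those keys are rescanned at commit.

```python
# Sketch of a deferred unique check: accept duplicates during the
# statement, queue possibly-duplicate keys, recheck only those at commit.

class DeferredUniqueIndex:
    def __init__(self):
        self.entries = []          # (key, tid) pairs; dups allowed for now
        self.suspect = set()       # keys to recheck at commit

    def insert(self, key, tid):
        if any(k == key for k, _ in self.entries):
            self.suspect.add(key)  # possible duplicate; decide later
        self.entries.append((key, tid))

    def delete(self, key, tid):
        self.entries.remove((key, tid))

    def check_at_commit(self):
        # Step 2: scan only the suspect keys, not the whole index.
        for key in self.suspect:
            live = sum(1 for k, _ in self.entries if k == key)
            if live > 1:
                raise ValueError("duplicate key %r at commit" % (key,))
        self.suspect.clear()

idx = DeferredUniqueIndex()
for tid, key in enumerate([1, 2, 3]):
    idx.insert(key, tid)
# "update t set id=id+1": new versions inserted while old ones linger.
for tid, key in enumerate([1, 2, 3]):
    idx.insert(key + 1, (tid, "new"))
    idx.delete(key, tid)
idx.check_at_commit()              # passes: each key has one live entry
```

In the real system the "live" count would come from tuple visibility (`HeapTupleSatisfies`), and as Tom notes, the update-every-tuple case still degenerates toward a scan of most of the index.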
        regards, tom lane


RE: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
The Hermit Hacker
Date:
On Tue, 29 Feb 2000, Hiroshi Inoue wrote:

> > -----Original Message-----
> > From: The Hermit Hacker [mailto:scrappy@hub.org]
> >
> 
> [snip]
> 
> > Hiroshi ...
> >
> >     Bruce's comment was just an observation ... if it can be done
> > cleanly, I would love to see a version that didn't involve 2x the disk
> > space ... I don't believe that a trial would be useless, I think that
> > Bruce's only concern/warning is that the amount of code modifications that
> > would have to be made in order to accomplish this *might* be larger then
> > the benefit resulting in doing it this way.
> >
> >     If you feel that this can be done more efficiently, *please*
> > proceed with the trial ...
> >
> 
> OK,I may be able to provide a trial patch in a week or so if I'm lucky.
> How to commit the patch ?
> With #ifdef ?

Nope, but it will have to wait until *after* 7.0 is released, so don't
push yourself on it ... 

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 



RE: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: The Hermit Hacker [mailto:scrappy@hub.org]
>
> On Tue, 29 Feb 2000, Hiroshi Inoue wrote:
>
> > > -----Original Message-----
> > > From: The Hermit Hacker [mailto:scrappy@hub.org]
> > >
> >
> > [snip]
> >
> > > Hiroshi ...
> > >
> > >     Bruce's comment was just an observation ... if it can be done
> > > cleanly, I would love to see a version that didn't involve 2x the disk
> > > space ... I don't believe that a trial would be useless, I think that
> > > Bruce's only concern/warning is that the amount of code
> modifications that
> > > would have to be made in order to accomplish this *might* be
> larger than
> > > the benefit resulting in doing it this way.
> > >
> > >     If you feel that this can be done more efficiently, *please*
> > > proceed with the trial ...
> > >
> >
> > OK,I may be able to provide a trial patch in a week or so if I'm lucky.
> > How to commit the patch ?
> > With #ifdef ?
>
> Nope, but it will have to wait until *after* 7.0 is released, so don't
> push yourself on it ...
>

Hmm, until the 7.0 release?
I don't want to keep my private branch that long.
Would #ifdef be bad for separating it from the 7.0 release stuff?

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp



Re: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
Hannu Krosing
Date:
Jan Wieck wrote:
> 
>     Explanative  version  of "that other story".  But not exactly
>     correct IMHO. If following strictly SQL3 suggestions,  an  ON
>     DELETE  RESTRICT  action cannot be deferrable at all. Even if
>     the constraint itself is deferrable and is set explicitly  to
>     DEFERRED,  the check should be done immediately at ROW level.
>     That's the difference between "NO ACTION" and "RESTRICT".
> 
>     Actually,  a  RESTRICT  violation  can   potentially   bypass
>     thousands  of  subsequent  queries  until COMMIT. Meaningless
>     from  the  transactional  PoV,  but  from   the   application
>     programmers  one  (looking at the return code of a particular
>     statement) it isn't!
...
>     It'll be expensive, compared to current UNIQUE implementation
>     doing  it  on  the fly during btree insert (doesn't it?). But
>     the only way I see.

So currently we have ON UPDATE RESTRICT foreign keys :)

-------------
Hannu


Re: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
Hannu Krosing
Date:
Hannu Krosing wrote:
> 
> Jan Wieck wrote:
> >
> >     It'll be expensive, compared to current UNIQUE implementation
> >     doing  it  on  the fly during btree insert (doesn't it?). But
> >     the only way I see.
> 
> So currently we have foreign keys :)

I meant of course ON UPDATE RESTRICT PRIMARY KEYS ..

----------
Hannu


Re: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
wieck@debis.com (Jan Wieck)
Date:
Don Baccus wrote:

> At 03:24 AM 2/29/00 +0100, Jan Wieck wrote:
>
> >    Actually,  a  RESTRICT  violation  can   potentially   bypass
> >    thousands  of  subsequent  queries  until COMMIT. Meaningless
> >    from  the  transactional  PoV,  but  from   the   application
> >    programmers  one  (looking at the return code of a particular
> >    statement) it isn't!
>
> No, strictly speaking it isn't correct.  But without a stopwatch,
> it will be hard to tell.
   It is easy to tell:
       CREATE TABLE t1 (a integer PRIMARY KEY);
       CREATE TABLE t2 (a integer REFERENCES t1
                           ON DELETE RESTRICT
                           DEFERRABLE);

       INSERT INTO t1 VALUES (1);
       INSERT INTO t1 VALUES (2);
       INSERT INTO t1 VALUES (3);
       INSERT INTO t2 VALUES (1);
       INSERT INTO t2 VALUES (2);

       BEGIN TRANSACTION;
       SET CONSTRAINTS ALL DEFERRED;
       DELETE FROM t1 WHERE a = 2;
       DELETE FROM t1 WHERE a = 3;
       COMMIT TRANSACTION;

   In this case, the first DELETE from t1 must already bomb the
   exception, setting the transaction block into error state and
   rejecting all further queries until COMMIT/ROLLBACK. The SET
   DEFERRED should only affect a check for key existence on
   INSERT to t2, not the RESTRICT action on DELETE to t1.

   The end result will be the same, both DELETEs get rolled
   back. But the application will see it at COMMIT, not at the
   first DELETE. So the system behaves exactly like NO ACTION.
 


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#========================================= wieck@debis.com (Jan Wieck) #




Re: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
wieck@debis.com (Jan Wieck)
Date:
Hannu Krosing wrote:

> Jan Wieck wrote:
> >
> >     Actually,  a  RESTRICT  violation  can   potentially   bypass
> >     thousands  of  subsequent  queries  until COMMIT. Meaningless
> >     from  the  transactional  PoV,  but  from   the   application
> >     programmers  one  (looking at the return code of a particular
> >     statement) it isn't!
> ...
> >     It'll be expensive, compared to current UNIQUE implementation
> >     doing  it  on  the fly during btree insert (doesn't it?). But
> >     the only way I see.
>
> So currently we have ON UPDATE RESTRICT foreign keys :)
   For foreign keys we actually have ON UPDATE/DELETE NO ACTION
   (plus SET NULL and SET DEFAULT).  Only the RESTRICT isn't
   fully SQL3.  I just had an idea that might easily turn it to
   do the right thing.

   For the UNIQUE constraint, it's totally wrong (and not
   related to FOREIGN KEY stuff at all).  The UNIQUE constraint
   isn't deferrable at all, and it looks for violations on a per
   row level, not on a per set level as it should.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#========================================= wieck@debis.com (Jan Wieck) #




Re: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
wieck@debis.com (Jan Wieck)
Date:
I wrote:

>     fully  SQL3.  I just had an idea that might easily turn it to
>     do the right thing.
   ON <event> RESTRICT triggers are now executed after the
   statement always, ignoring any explicitly set deferred mode.

   This is pretty close to Date's SQL3 interpretation, or IMHO
   better.  Date says that they are checked BEFORE each ROW, but
   that would ignore the SET character of a statement.  Now we
   have correct semantics for all 4 possible referential
   actions.
 


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#========================================= wieck@debis.com (Jan Wieck) #




RE: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
The Hermit Hacker
Date:
On Tue, 29 Feb 2000, Hiroshi Inoue wrote:

> > Nope, but it will have to wait until *after* 7.0 is released, so don't
> > push yourself on it ...
> >
> 
> Hmm,until 7.0 release ?
> I don't want to keep my private branch so long.
> Is #ifdef bad to separate it from 7.0 release stuff ?

Go for it and submit a patch ... if it's totally innocuous, then we can
try and plug her in, but I won't guarantee it.  Since there should be no
major changes between now and 7.0 release, we can store any patch until
the release also ...




Re: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
Don Baccus
Date:
At 01:22 PM 2/29/00 +0100, Jan Wieck wrote:
>I wrote:
>
>>     fully  SQL3.  I just had an idea that might easily turn it to
>>     do the right thing.
>
>    ON  <event>  RESTRICT  triggers  are  now  executed after the
>    statement allways, ignoring any explicitly set deferred mode.
>    This  is  pretty close to Date's SQL3 interpretation, or IMHO
>    better. Date says that they are checked BEFORE each ROW,  but
>    that  would  ignore  the SET character of a statement.

Perhaps that's actually the point of RESTRICT?  Sacrifice the 
set character of a statement in this special case in order to
return an error quickly?

Since RESTRICT wasn't in SQL92, and since it's very close to
NO ACTION, it reeks of being an efficiency hack.  

I dread digging into that part of the standard again...this is
a case where the various proposals and justifications that were
before the committee at the time would be useful since the 
actual words that made it to the standard are opaque.



- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
  Service and other goodies at http://donb.photo.net.


Re: [HACKERS] Re: ALTER TABLE DROP COLUMN

From
Don Baccus
Date:
At 11:22 AM 2/29/00 +0100, Jan Wieck wrote:
>Don Baccus wrote:
>
>> At 03:24 AM 2/29/00 +0100, Jan Wieck wrote:
>>
>> >    Actually,  a  RESTRICT  violation  can   potentially   bypass
>> >    thousands  of  subsequent  queries  until COMMIT. Meaningless
>> >    from  the  transactional  PoV,  but  from   the   application
>> >    programmers  one  (looking at the return code of a particular
>> >    statement) it isn't!
>>
>> No, strictly speaking it isn't correct.  But without a stopwatch,
>> it will be hard to tell.
>
>    It is easy to tell:
>
>        CREATE TABLE t1 (a integer PRIMARY KEY);
>        CREATE TABLE t2 (a integer REFERENCES t1
>                                   ON DELETE RESTRICT
>                                   DEFERRABLE);
>
>        INSERT INTO t1 VALUES (1);
>        INSERT INTO t1 VALUES (2);
>        INSERT INTO t1 VALUES (3);
>
>        INSERT INTO t2 VALUES (1);
>        INSERT INTO t2 VALUES (2);
>
>        BEGIN TRANSACTION;
>        SET CONSTRAINTS ALL DEFERRED;
>        DELETE FROM t1 WHERE a = 2;
>        DELETE FROM t1 WHERE a = 3;
>        COMMIT TRANSACTION;
>
>    In  this case, the first DELETE from t1 must already bomb the
>    exception, setting the transaction block into error state and
>    reject  all  further  queries  until COMMIT/ROLLBACK.

Ahhh...but the point you're missing, which was brought up a few
days ago, is that this PG-ism of rejecting all further queries
until COMMIT/ROLLBACK is in itself NONSTANDARD.

As far as the effect of DEFERRED on RESTRICT with STANDARD, not
PG, transaction semantics I've not investigated it.  Neither one
of us has a particularly great record at correctly interpreting
the SQL3 standard regarding the subtleties of foreign key semantics,
since we both had differing interpretations of RESTRICT/NO ACTION
and (harumph) we were BOTH wrong :)  Date implies that there's
no difference other than RESTRICT's returning an error more quickly,
but he doesn't talk about the DEFERRED case.

Anyway, it's moot at the moment since neither RESTRICT nor standard
SQL92 transaction semantics are implemented.

>    The  end  result  will  be  the same,

Which is what I mean when I say you pretty much need a stopwatch
to tell the difference - OK, in PG you can look at the non-standard
error messages due to the non-standard rejection of subsequent
queries, but I was thinking in terms of standard transaction
semantics.

> both DELETEs get rolled
>    back. But the application will see it at COMMIT, not  at  the
>    first  DELETE.  So  the  system  behaves  exactly like for NO
>    ACTION.

Yes.



- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
  Service and other goodies at http://donb.photo.net.


Re: ALTER TABLE DROP COLUMN

From
Bruce Momjian
Date:
Seems we have 4 DROP COLUMN ideas:
    Method                                                  Advantage
    -----------------------------------------------------------------
1   invisible column marked by negative attnum              fast
2   invisible column marked by is_dropped column            fast
3   make copy of table without column                       col removed
4   make new tuples in existing table without column        col removed

Folks, we had better choose one and get started.  

Number 1 Hiroshi has ifdef'ed out in the code.  Items 1 and 2 have
problems with backend code and 3rd party code not seeing the dropped
columns, or having gaps in the attno numbering. Number 3 has problems
with making it an atomic operation, and number 4 is described below. 

---------------------------------------------------------------------------

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > You can exclusively lock the table, then do a heap_getnext() scan over
> > the entire table, remove the dropped column, do a heap_insert(), then a
> > heap_delete() on the current tuple, making sure to skip over the tuples
> > inserted by the current transaction.  When completed, remove the column
> > from pg_attribute, mark the transaction as committed (if desired), and
> > run vacuum over the table to remove the deleted rows.
> 
> Hmm, that would work --- the new tuples commit at the same instant that
> the schema updates commit, so it should be correct.  You have the 2x
> disk usage problem, but there's no way around that without losing
> rollback ability.
> 
> A potentially tricky bit will be persuading the tuple-reading and tuple-
> writing subroutines to pay attention to different versions of the tuple
> structure for the same table.  I haven't looked to see if this will be
> difficult or not.  If you can pass the TupleDesc explicitly then it
> shouldn't be a problem.
> 
> I'd suggest that the cleanup vacuum *not* be an automatic part of
> the operation; just recommend that people do it ASAP after dropping
> a column.  Consider needing to drop several columns...
> 
>             regards, tom lane
> 
> ************
> 


--
Bruce Momjian                        |  http://www.op.net/~candle
pgman@candle.pha.pa.us               |  (610) 853-3000
+  If your life is a hard drive,     |  830 Blythe Avenue
+  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


RE: ALTER TABLE DROP COLUMN

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: pgsql-hackers-owner@hub.org 
> [mailto:pgsql-hackers-owner@hub.org]On Behalf Of Bruce Momjian
> 
> Seems we have 4 DROP COLUMN ideas:
> 
>     Method                                                  Advantage
>     -----------------------------------------------------------------
> 1    invisible column marked by negative attnum        fast
> 2    invisible column marked by is_dropped column        fast
> 3    make copy of table without column            col removed
> 4    make new tuples in existing table without column    col removed
> 
> Folks, we had better choose one and get started.  
> 
> Number 1 Hiroshi has ifdef'ed out in the code.  Items 1 and 2 have
> problems with backend code and 3rd party code not seeing the dropped
> columns,

Hmm, doesn't *not seeing* mean the column is dropped?

> or having gaps in the attno numbering. Number 3 has problems
> with making it an atomic operation, and number 4 is described below. 
>

Don't forget another important point.

Currently even DROP TABLE doesn't remove related objects completely.
And I don't think I could remove objects related to the dropping column
completely using 1)2) in ALTER TABLE DROP COLUMN implementation.

Using 3)4) we should not only remove objects as 1)2) but also
change attnum-s in all objects related to the relation. Otherwise
PostgreSQL would do the wrong thing silently.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp 


RE: ALTER TABLE DROP COLUMN

From
Don Baccus
Date:
At 01:43 PM 6/10/00 +0900, Hiroshi Inoue wrote:
>> -----Original Message-----
>> From: pgsql-hackers-owner@hub.org 
>> [mailto:pgsql-hackers-owner@hub.org]On Behalf Of Bruce Momjian
>> 
>> Seems we have 4 DROP COLUMN ideas:
>> 
>>     Method                                                  Advantage
>>     -----------------------------------------------------------------
>> 1    invisible column marked by negative attnum        fast
>> 2    invisible column marked by is_dropped column        fast
>> 3    make copy of table without column            col removed
>> 4    make new tuples in existing table without column    col removed
>> 
>> Folks, we had better choose one and get started.  

Oracle gives you the choice between the "cheating" fast method(s) and
the "real" slow (really slow?) real method.

So there's at least real world experience by virtue of example by
the world's most successful database supplier that user control
over "hide the column" and "really delete the column" is valuable.

It really makes a lot of sense to give such a choice.  If one
does so by "hiding", at a later date one would think the choice
of "really deleting" would be a possibility.  I don't know if
Oracle does this...

If not, they might not care.  In today's world, there are bazillions
of dollars for Oracle to scoop up from users who could just as easily
be PG users - all those "we'll fail if we don't IPO 'cause we'll never
have any customers" database-backed websites :)



- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
  Service and other goodies at http://donb.photo.net.


Re: ALTER TABLE DROP COLUMN

From
Hannu Krosing
Date:
Bruce Momjian wrote:
> 
> Seems we have 4 DROP COLUMN ideas:
> 
>         Method                                                  Advantage
>         -----------------------------------------------------------------
> 1       invisible column marked by negative attnum              fast
> 2       invisible column marked by is_dropped column            fast
> 3       make copy of table without column                       col removed
> 4       make new tuples in existing table without column        col removed

IIRC there was a fifth idea, a variation of 2 that would work better
with inheritance -

5   all columns have an is_real_column attribute that is true for all
columns present in that relation, so situations like

create table tab_a(a_i int);
create table tab_b(b_i int) inherits(tab_a);
alter table tab_a add column c_i int;

can be made to work.

It would also require clients to ignore all missing columns that the
backend can pass to them as nulls (which is usually quite cheap in
bandwidth usage) in case of "SELECT **" queries.

We could even rename attno to attid to make folks aware that it is not
to be assumed to be continuous.

> Folks, we had better choose one and get started.
> 
> Number 1 Hiroshi has ifdef'ed out in the code.  Items 1 and 2 have
> problems with backend code and 3rd party code not seeing the dropped
> columns, or having gaps in the attno numbering.

If we want to make ADD COLUMN work with inheritance without having to
rewrite every single tuple in both parent and inherited tables, we will
have to accept the fact that there are gaps in attno numbering.

> Number 3 has problems
> with making it an atomic operation, and number 4 is described below.

Nr 4 still has problems with attno numbering _changing_ during drop,
which could either be better or worse for client software than having
gaps - in both cases the client must be prepared to deal with runtime
changes in attribute definitions.

--------------
Hannu


Re: ALTER TABLE DROP COLUMN

From
Tom Lane
Date:
Don Baccus <dhogaza@pacifier.com> writes:
> Oracle gives you the choice between the "cheating" fast method(s) and
> the "real" slow (really slow?) real method.

> So there's at least real world experience by virtue of example by
> the world's most successful database supplier that user control
> over "hide the column" and "really delete the column" is valuable.

Sure, but you don't need any help from the database to do "really delete
the column".  SELECT INTO... is enough, and it's not even any slower
than the implementations under discussion.

So I'm satisfied if we offer the "hide the column" approach.
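The do-it-yourself "really delete" route Tom mentions can be sketched
like this, using SQLite's CREATE TABLE ... AS SELECT as a stand-in for
PostgreSQL's SELECT INTO; the table and column names are made up for
the sketch.

```python
import sqlite3

# Rebuild the table without the doomed column, then swap names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payroll (name TEXT, salary INTEGER, doomed TEXT)")
conn.executemany("INSERT INTO payroll VALUES (?, ?, ?)",
                 [("ann", 10, "x"), ("bob", 20, "y")])

# copy every column except the one being dropped
conn.execute("CREATE TABLE payroll_new AS SELECT name, salary FROM payroll")
conn.execute("DROP TABLE payroll")
conn.execute("ALTER TABLE payroll_new RENAME TO payroll")

cols = [row[1] for row in conn.execute("PRAGMA table_info(payroll)")]
print(cols)  # ['name', 'salary'] - the column is really gone
```

Note that this rebuild silently loses indexes, constraints, and
triggers defined on the old table, which is part of why it is not a
drop-in DROP COLUMN and why Tom's questions below matter.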

Has anyone thought about what happens to table constraints that use the
doomed column?  Triggers, RI rules, yadda yadda?

Has anyone thought about undoing a DELETE COLUMN?  The data is still
there, at least in tuples that have not been updated, so it's not
totally unreasonable.
        regards, tom lane


Re: ALTER TABLE DROP COLUMN

From
Don Baccus
Date:
At 01:14 AM 6/10/00 -0400, Tom Lane wrote:
>Don Baccus <dhogaza@pacifier.com> writes:
>> Oracle gives you the choice between the "cheating" fast method(s) and
>> the "real" slow (really slow?) real method.
>
>> So there's at least real world experience by virtue of example by
>> the world's most successful database supplier that user control
>> over "hide the column" and "really delete the column" is valuable.
>
>Sure, but you don't need any help from the database to do "really delete
>the column".  SELECT INTO... is enough, and it's not even any slower
>than the implementations under discussion.
>
>So I'm satisfied if we offer the "hide the column" approach.

<shrug> I wouldn't put a "real" drop column at the top of my list
of priorities, but there is something to be said for user convenience.



- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
  Service and other goodies at http://donb.photo.net.


Re: ALTER TABLE DROP COLUMN

From
Bruce Momjian
Date:
> > -----Original Message-----
> > From: pgsql-hackers-owner@hub.org 
> > [mailto:pgsql-hackers-owner@hub.org]On Behalf Of Bruce Momjian
> > 
> > Seems we have 4 DROP COLUMN ideas:
> > 
> >     Method                                                  Advantage
> >     -----------------------------------------------------------------
> > 1    invisible column marked by negative attnum        fast
> > 2    invisible column marked by is_dropped column        fast
> > 3    make copy of table without column            col removed
> > 4    make new tuples in existing table without column    col removed
> > 
> > Folks, we had better choose one and get started.  
> > 
> > Number 1 Hiroshi has ifdef'ed out in the code.  Items 1 and 2 have
> > problems with backend code and 3rd party code not seeing the dropped
> > columns,
> 
> Hmm,doesn't  *not seeing*  mean the column is dropped ?

I meant problems of backend code and 3rd party code _seeing_ the dropped
column in pg_attribute.

--
Bruce Momjian                        |  http://www.op.net/~candle
pgman@candle.pha.pa.us               |  (610) 853-3000
+  If your life is a hard drive,     |  830 Blythe Avenue
+  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: ALTER TABLE DROP COLUMN

From
Tom Lane
Date:
>> Seems we have 4 DROP COLUMN ideas:
>> Method                                                  Advantage
>> -----------------------------------------------------------------
>> 1    invisible column marked by negative attnum        fast
>> 2    invisible column marked by is_dropped column        fast
>> 3    make copy of table without column            col removed
>> 4    make new tuples in existing table without column    col removed

Bruce and I talked about this by phone yesterday, and we realized that
none of these are very satisfactory.  #1 and #2 both have the flaw that
applications that examine pg_attribute will probably break: they will
see a sequence of attnum values with gaps in it.  And what should the
rel's relnatts field be set to?  #3 and #4 are better on that point,
but they leave us with the problem of renumbering references to columns
after the dropped one in constraints, rules, PL functions, etc.

Furthermore, there is a closely related problem that none of these
approaches give us much help on: recursive ALTER TABLE ADD COLUMN.
Right now, ADD puts the new column at the end of each table it's added
to, which often means that it gets a different column number in child
tables than in parent tables.  That leads to havoc for pg_dump.

I think the only clean solution is to create a clear distinction between
physical and logical column numbers.  Each pg_attribute tuple would need
two attnum fields, and pg_class would need two relnatts fields as well.
A column once created would never change its physical column number, but
its logical column number might change as a consequence of adding or
dropping columns before it.  ADD COLUMN would ensure that a column added
to child tables receives the same logical column number as it has in the
parent table, thus solving the dump/reload problem.  DROP COLUMN would
assign an invalid logical column number to dropped columns.  They could
be numbered zero except that we'd probably still want a unique index on
attrelid+attnum, and the index would complain.  I'd suggest using
Hiroshi's idea: give a dropped column a logical attnum equal to
-(physical_attnum + offset).

With this approach, internal operations on tuples would all use
physical column numbers, but operations that interface to the outside
world would present a view of only the valid logical columns.  For
example, the parser would only allow logical columns to be referenced
by name; "SELECT *" would expand to valid logical columns in logical-
column-number order; COPY would send or receive valid logical columns
in logical-column-number order; etc.

Stored rules and so forth probably should store physical column numbers
so that they need not be modified during column add/drop.
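Tom's logical/physical split can be sketched in miniature.  Everything
here is illustrative: the class and field names, the DROPPED_OFFSET
value, and the renumbering policy are assumptions made for the sketch,
not the eventual pg_attribute layout.

```python
DROPPED_OFFSET = 1000  # hypothetical offset keeping dropped attnums unique

class Column:
    def __init__(self, name, attpnum):
        self.name = name
        self.attpnum = attpnum  # physical: never changes after creation
        self.attlnum = attpnum  # logical: may change on add/drop

class Table:
    def __init__(self, names):
        self.cols = [Column(n, i + 1) for i, n in enumerate(names)]

    def drop_column(self, name):
        # survivors keep their old logical ordering
        live = [c for c in self.cols if c.attlnum > 0 and c.name != name]
        live.sort(key=lambda c: c.attlnum)
        for col in self.cols:
            if col.name == name and col.attlnum > 0:
                # invalid (negative) logical number, still unique
                col.attlnum = -(col.attpnum + DROPPED_OFFSET)
        # renumber the surviving logical columns consecutively
        for lnum, col in enumerate(live, start=1):
            col.attlnum = lnum

    def select_star(self):
        # the outside world sees only valid logical columns,
        # in logical-column-number order
        live = [c for c in self.cols if c.attlnum > 0]
        return [c.name for c in sorted(live, key=lambda c: c.attlnum)]

t = Table(["a", "b", "c"])
t.drop_column("b")
print(t.select_star())               # ['a', 'c']
print([c.attpnum for c in t.cols])   # [1, 2, 3] - physical numbers unchanged
```

Because attpnum never moves, stored rules keyed on physical numbers
survive the drop untouched, which is the point of the proposal.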

This would require looking at all the places in the backend to determine
whether they should be working with logical or physical column numbers,
but the design is such that most all places would want to be using
physical numbers, so I don't think it'd be too painful.

Although I'd prefer to give the replacement columns two new names
(eg, "attlnum" and "attpnum") to ensure we find all uses, this would
surely break applications that examine pg_attribute.  For compatibility
we'd have to recycle "attnum" and "relnatts" to indicate logical column
number and logical column count, while adding new fields (say "attpnum"
and "relnpatts") for the physical number and count.

Comments?
        regards, tom lane


Re: ALTER TABLE DROP COLUMN

From
Bruce Momjian
Date:
> >> Seems we have 4 DROP COLUMN ideas:
> >> Method                                                  Advantage
> >> -----------------------------------------------------------------
> >> 1    invisible column marked by negative attnum        fast
> >> 2    invisible column marked by is_dropped column        fast
> >> 3    make copy of table without column            col removed
> >> 4    make new tuples in existing table without column    col removed
> 
> Bruce and I talked about this by phone yesterday, and we realized that
> none of these are very satisfactory.  #1 and #2 both have the flaw that
> applications that examine pg_attribute will probably break: they will
> see a sequence of attnum values with gaps in it.  And what should the
> rel's relnatts field be set to?  #3 and #4 are better on that point,
> but they leave us with the problem of renumbering references to columns
> after the dropped one in constraints, rules, PL functions, etc.

Yes, glad you summarized.

> 
> Furthermore, there is a closely related problem that none of these
> approaches give us much help on: recursive ALTER TABLE ADD COLUMN.
> Right now, ADD puts the new column at the end of each table it's added
> to, which often means that it gets a different column number in child
> tables than in parent tables.  That leads to havoc for pg_dump.

Also good point.

> 
> I think the only clean solution is to create a clear distinction between
> physical and logical column numbers.  Each pg_attribute tuple would need
> two attnum fields, and pg_class would need two relnatts fields as well.

Excellent idea.

> A column once created would never change its physical column number, but
> its logical column number might change as a consequence of adding or
> dropping columns before it.  ADD COLUMN would ensure that a column added
> to child tables receives the same logical column number as it has in the
> parent table, thus solving the dump/reload problem.  DROP COLUMN would
> assign an invalid logical column number to dropped columns.  They could
> be numbered zero except that we'd probably still want a unique index on
> attrelid+attnum, and the index would complain.  I'd suggest using
> Hiroshi's idea: give a dropped column a logical attnum equal to
> -(physical_attnum + offset).

My guess is that we would need a unique index on the physical attno, not
the logical one.  Multiple zero attno's may be fine.

> 
> With this approach, internal operations on tuples would all use
> physical column numbers, but operations that interface to the outside
> world would present a view of only the valid logical columns.  For
> example, the parser would only allow logical columns to be referenced
> by name; "SELECT *" would expand to valid logical columns in logical-
> column-number order; COPY would send or receive valid logical columns
> in logical-column-number order; etc.

Yes, the only hard part will be taking values supplied in logical order
and moving them into physical order.  Not too hard with dropped columns,
because they only leave gaps, but inheritance would require re-ordering
some of the values supplied by the user.  Not hard, just something
additional that is needed.

> 
> Stored rules and so forth probably should store physical column numbers
> so that they need not be modified during column add/drop.

Yes!

> 
> This would require looking at all the places in the backend to determine
> whether they should be working with logical or physical column numbers,
> but the design is such that most all places would want to be using
> physical numbers, so I don't think it'd be too painful.

Agreed.  Most are physical.

> 
> Although I'd prefer to give the replacement columns two new names
> (eg, "attlnum" and "attpnum") to ensure we find all uses, this would
> surely break applications that examine pg_attribute.  For compatibility
> we'd have to recycle "attnum" and "relnatts" to indicate logical column
> number and logical column count, while adding new fields (say "attpnum"
> and "relnpatts") for the physical number and count.

Can I recommend keeping attnum and relnatts as logical, and adding
attheapnum and relheapatts so that it clearly shows these are the heap
values, not the user values.

Great idea.  I was seeing things blocked in every option until your
idea.

Also, my guess is that Hiroshi's #ifdef's mark the places we need to
start looking at.

--
Bruce Momjian                        |  http://www.op.net/~candle
pgman@candle.pha.pa.us               |  (610) 853-3000
+  If your life is a hard drive,     |  830 Blythe Avenue
+  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


RE: ALTER TABLE DROP COLUMN

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> 
> >> Seems we have 4 DROP COLUMN ideas:
> >> Method                                                  Advantage
> >> -----------------------------------------------------------------
> >> 1    invisible column marked by negative attnum        fast
> >> 2    invisible column marked by is_dropped column        fast
> >> 3    make copy of table without column            col removed
> >> 4    make new tuples in existing table without column    col removed
>

Hmm, I've received no pg-ML mails for more than 1 day.
What's happened with the pgsql ML?
> Bruce and I talked about this by phone yesterday, and we realized that
> none of these are very satisfactory.  #1 and #2 both have the flaw that
> applications that examine pg_attribute will probably break: they will
> see a sequence of attnum values with gaps in it.  And what should the
> rel's relnatts field be set to?  #3 and #4 are better on that point,
> but they leave us with the problem of renumbering references to columns
> after the dropped one in constraints, rules, PL functions, etc.
> 
> Furthermore, there is a closely related problem that none of these
> approaches give us much help on: recursive ALTER TABLE ADD COLUMN.
> Right now, ADD puts the new column at the end of each table it's added
> to, which often means that it gets a different column number in child
> tables than in parent tables.  That leads to havoc for pg_dump.
>

Inheritance is one of the reasons why I didn't take #2.  I'm not sure
whether marking is_dropped will still be needed once pg_attribute is
overhauled for inheritance.
I myself have never wanted to use the current inheritance functionality,
mainly because of this big flaw.  Judging from the recent discussion
about oo (though I don't understand the details), the change seems to be
needed in order to make the inheritance functionality really useful.
> I think the only clean solution is to create a clear distinction between
> physical and logical column numbers.  Each pg_attribute tuple would need
> two attnum fields, and pg_class would need two relnatts fields as well.
> A column once created would never change its physical column number, but

I don't understand inheritance well.  In the near future wouldn't the
implementation require e.g. an attid which is common to all children
of a parent and is never changed?  If so, we would need a third
attid field which is irrelevant to the physical/logical position.  If
not, the physical column number would be sufficient.
> its logical column number might change as a consequence of adding or
> dropping columns before it.  ADD COLUMN would ensure that a column added
> to child tables receives the same logical column number as it has in the
> parent table, thus solving the dump/reload problem.  DROP COLUMN would
> assign an invalid logical column number to dropped columns.  They could
> be numbered zero except that we'd probably still want a unique index on
> attrelid+attnum, and the index would complain.  I'd suggest using
> Hiroshi's idea: give a dropped column a logical attnum equal to
> -(physical_attnum + offset).
> 
> With this approach, internal operations on tuples would all use
> physical column numbers, but operations that interface to the outside
> world would present a view of only the valid logical columns.  For
> example, the parser would only allow logical columns to be referenced
> by name; "SELECT *" would expand to valid logical columns in logical-
> column-number order; COPY would send or receive valid logical columns
> in logical-column-number order; etc.
> 
> Stored rules and so forth probably should store physical column numbers
> so that they need not be modified during column add/drop.
> 
> This would require looking at all the places in the backend to determine
> whether they should be working with logical or physical column numbers,
> but the design is such that most all places would want to be using
> physical numbers, so I don't think it'd be too painful.
> 
> Although I'd prefer to give the replacement columns two new names
> (eg, "attlnum" and "attpnum") to ensure we find all uses, this would
> surely break applications that examine pg_attribute.  For compatibility
> we'd have to recycle "attnum" and "relnatts" to indicate logical column
> number and logical column count, while adding new fields (say "attpnum"
> and "relnpatts") for the physical number and count.
>

I agree with you that we would add attpnum and change the meaning of
attnum to the logical column number for backward compatibility.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp


Re: ALTER TABLE DROP COLUMN

From
Chris Bitmead
Date:
I don't know if this is one of the 5, but my idea of a good
implementation is to do the fast invisible approach, and then update
individual tuples to the new format the next time they happen to be
UPDATEd.

Therefore, ALTER TABLE DROP COLUMN, followed by UPDATE foo SET bar=bar;
would cause the equiv of (4).

-- 
Chris Bitmead
mailto:chris@bitmead.com
Hannu Krosing wrote:
> 
> Bruce Momjian wrote:
> >
> > Seems we have 4 DROP COLUMN ideas:
> >
> >         Method                                                  Advantage
> >         -----------------------------------------------------------------
> > 1       invisible column marked by negative attnum              fast
> > 2       invisible column marked by is_dropped column            fast
> > 3       make copy of table without column                       col removed
> > 4       make new tuples in existing table without column        col removed
> 
> IIRC there was a fifth idea, a variation of 2 that would work better
> with
> inheritance -
> 
> 5   all columns have an is_real_column attribute that is true for all
> columns present in that relation, so situations like
> 
> create table tab_a(a_i int);
> create table tab_b(b_i int) inherits(tab_a);
> alter table tab_a add column c_i int;
> 
> can be made to work.
> 
> It would also require clients to ignore all missing columns that the
> backend can pass to them as nulls (which is usually quite cheap in
> bandwidth usage) in case of "SELECT **" queries.
> 
> We could even rename attno to attid to make folks aware that it is not
> to be assumed to be continuous.
> 
> > Folks, we had better choose one and get started.
> >
> > Number 1 Hiroshi has ifdef'ed out in the code.  Items 1 and 2 have
> > problems with backend code and 3rd party code not seeing the dropped
> > columns, or having gaps in the attno numbering.
> 
> If we want to make ADD COLUMN work with inheritance without having to
> rewrite every single tuple in both parent and inherited tables, we will
> have to accept the fact that there are gaps in attno numbering.
> 
> > Number 3 has problems
> > with making it an atomic operation, and number 4 is described below.
> 
> Nr 4 still has problems with attno numbering _changing_ during drop,
> which could either be better or worse for client software than having
> gaps - in both cases the client must be prepared to deal with runtime
> changes in attribute definition.
> 
> --------------
> Hannu


Re: ALTER TABLE DROP COLUMN

From
Tom Lane
Date:
Chris Bitmead <chris@bitmead.com> writes:
> I don't know if this is one of the 5, but my idea of a good
> implementation is to do the fast invisible approach, and then update
> individual tuples to the new format the next time they happen to be
> UPDATEd.

How would you tell whether a particular tuple has been updated or not?

Furthermore, how would you remember the old tuple format (or formats)
so that you'd know how to make the conversion?

Seems to me this approach would require adding some sort of table
version number to every tuple header, plus storing a complete set of
system catalog entries for every past version of each table's schema.
That's a heck of a high price, in both storage and complexity, for a
feature of dubious value...
        regards, tom lane


Re: ALTER TABLE DROP COLUMN

From
Chris Bitmead
Date:
Yes, it would need to work as you describe below. Such a scheme is used
in several object databases I know of (Versant being one), where it
works great. It's not just useful for drop column, but also things like
add column with default value. It means that you can add and drop
columns to your heart's content in the blink of an eye, and yet
ultimately not pay the price in terms of storage costs.

But yep, it's a lot more work, and understandable if there isn't
enthusiasm for doing it.

Tom Lane wrote:
> 
> Chris Bitmead <chris@bitmead.com> writes:
> > I don't know if this is one of the 5, but my idea of a good
> > implementation is to do the fast invisible approach, and then update
> > individual tuples to the new format the next time they happen to be
> > UPDATEd.
> 
> How would you tell whether a particular tuple has been updated or not?
> 
> Furthermore, how would you remember the old tuple format (or formats)
> so that you'd know how to make the conversion?
> 
> Seems to me this approach would require adding some sort of table
> version number to every tuple header, plus storing a complete set of
> system catalog entries for every past version of each table's schema.
> That's a heck of a high price, in both storage and complexity, for a
> feature of dubious value...


Re: ALTER TABLE DROP COLUMN

From
Chris Bitmead
Date:
Hiroshi Inoue wrote:

> I don't understand inheritance well. In the near future wouldn't the
> implementation require e.g. an attid which is common to all children
> of a parent and is never changed? If so, we would need a third
> attid field which is irrelevant to physical/logical position. If not,
> the physical column number would be sufficient.

We only need something like a unique attid of course if we support
column renaming in child tables. Otherwise the attname is sufficient to
match up child-parent columns.

If/when we support renaming, probably a parent_column_oid in
pg_attribute might be one way to go.

Your idea seems fine Tom.


RE: ALTER TABLE DROP COLUMN

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: Chris Bitmead
> 
> Hiroshi Inoue wrote:
> 
> > I don't understand inheritance well. In the near future wouldn't the
> > implementation require e.g. attid which is common to all children
> > of a parent and is never changed ? If so,we would need the third
> > attid field which is irrevalent to physical/logical position. If not,
> > physical column number would be sufficient .
> 
> We only need something like a unique attid of course if we support
> column renaming in child tables. Otherwise the attname is sufficient to
> match up child-parent columns.
>

There are some objects which keep plans etc as compiled
state.

create table t1 (i1 int4);
create table t2 (i2 int4) inherits (t1);
create table t3 (i3 int4) inherits (t2);
alter table t1 add column i4 int4;

For each table, the list of (column, logical number, physical number)
would be as follows.

t1   (i1, 1, 1) (i4, 2, 2)
t2   (i1, 1, 1) (i4, 2, 3) (i2, 3, 2)
t3   (i1, 1, 1) (i4, 2, 4) (i2, 3, 2) (i3, 4, 3)

At this point the compilation of 'select * from t1(*?)' would mean
        select (physical #1),(physical #2) from t1  +
        select (physical #1),(physical #3) from t2  +
        select (physical #1),(physical #4) from t3

Note that the physical #s aren't common for column i4.
I've wanted to confirm that the above compilation would be OK for
the (near) future enhancement of inheritance functionality.
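
The numbering above can be derived mechanically. Here is a small illustrative sketch (invented code, not the catalog logic) that reproduces the (column, logical, physical) lists for t1/t2/t3:

```python
# Sketch of logical vs physical numbering under inheritance: a child's
# physical order is append-only storage order, while its logical order
# is the parent's logical columns followed by its own.

class Table:
    def __init__(self, name, cols, parent=None):
        self.name, self.parent, self.children = name, parent, []
        inherited = parent.logical_cols() if parent else []
        self.physical = inherited + cols     # storage order, append-only
        self.local = list(cols)
        if parent:
            parent.children.append(self)

    def logical_cols(self):
        inherited = self.parent.logical_cols() if self.parent else []
        return inherited + self.local

    def add_column(self, col):
        self.local.append(col)
        for t in [self] + self._descendants():
            t.physical.append(col)           # physically always at the end

    def _descendants(self):
        out = []
        for c in self.children:
            out.append(c)
            out.extend(c._descendants())
        return out

    def numbering(self):
        return [(c, l + 1, self.physical.index(c) + 1)
                for l, c in enumerate(self.logical_cols())]

t1 = Table("t1", ["i1"])
t2 = Table("t2", ["i2"], parent=t1)
t3 = Table("t3", ["i3"], parent=t2)
t1.add_column("i4")
for t in (t1, t2, t3):
    print(t.name, t.numbering())
# t1 [('i1', 1, 1), ('i4', 2, 2)]
# t2 [('i1', 1, 1), ('i4', 2, 3), ('i2', 3, 2)]
# t3 [('i1', 1, 1), ('i4', 2, 4), ('i2', 3, 2), ('i3', 4, 3)]
```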

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp



Re: ALTER TABLE DROP COLUMN

From
Chris Bitmead
Date:
On thinking about it I can definitely see your point about wanting an
attrid that is common across the hierarchy, regardless of compiled
plans. There would be some merit in splitting up pg_attribute into two
parts. One part is common across all classes in the hierarchy, the other
part is specific to one class. Then the oid of the common part is the
attrid you refer to.

However, I'm not sure this directly affects Tom's proposal. Selects from
hierarchies are implemented in terms of a union of all the classes in
the hierarchy. Wouldn't the compiled plan refer to physical ids? In any
case, if UNION can be made to work, I would think select hierarchies
automatically would work too.

Hiroshi Inoue wrote:
> 
> > -----Original Message-----
> > From: Chris Bitmead
> >
> > Hiroshi Inoue wrote:
> >
> > > I don't understand inheritance well. In the near future wouldn't the
> > > implementation require e.g. attid which is common to all children
> > > of a parent and is never changed ? If so,we would need the third
> > > attid field which is irrevalent to physical/logical position. If not,
> > > physical column number would be sufficient .
> >
> > We only need something like a unique attid of course if we support
> > column renaming in child tables. Otherwise the attname is sufficient to
> > match up child-parent columns.
> >
> 
> There are some objects which keep plans etc as compiled
> state.
> 
> create table t1 (i1 int4);
> create table t2 (i2 int4) inherits t1;
> create table t3 (i3 int4) inherits t2;
> alter table t1 add column i4 int4;
> 
> For each table,the list of (column, logical number, physical number)
> would be as follows.
> 
> t1   (i1, 1, 1) (i4, 2, 2)
> t2   (i1, 1, 1) (i4, 2, 3) (i2, 3, 2)
> t3   (i1, 1, 1) (i4, 2, 4) (i2, 3, 2) (i3, 4, 3)
> 
> At this point the compilation of 'select * from t1(*?)' would mean
>         select (physical #1),(physical #2) from t1  +
>         select (physical #1),(physical #3) from t2  +
>         select (physical #1),(physical #4) from t3
> 
> Note that physical # aren't common for column i4.
> I've wanted to confirm that above compilation would be OK for
> the (near) future enhancement of inheritance functionality.
> 
> Regards.
> 
> Hiroshi Inoue
> Inoue@tpf.co.jp


Re: ALTER TABLE DROP COLUMN

From
Tom Lane
Date:
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> create table t1 (i1 int4);
> create table t2 (i2 int4) inherits t1;
> create table t3 (i3 int4) inherits t2;
> alter table t1 add column i4 int4;

> For each table,the list of (column, logical number, physical number)
> would be as follows.

> t1   (i1, 1, 1) (i4, 2, 2)
> t2   (i1, 1, 1) (i4, 2, 3) (i2, 3, 2)
> t3   (i1, 1, 1) (i4, 2, 4) (i2, 3, 2) (i3, 4, 3)

> At this point the compilation of 'select * from t1(*?)' would mean
>     select (physical #1),(physical #2) from t1  +
>     select (physical #1),(physical #3) from t2  +
>     select (physical #1),(physical #4) from t3 

> Note that physical # aren't common for column i4.

That's no different from the current situation: the planner already must
(and does) adjust column numbers for each derived table while expanding
an inherited query.  It's kind of a pain but hardly an insurmountable
problem.

Currently the matching is done by column name.  We could possibly match
on logical column position instead --- not sure if that's better or
worse.
        regards, tom lane


RE: ALTER TABLE DROP COLUMN

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> 
> "Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> > create table t1 (i1 int4);
> > create table t2 (i2 int4) inherits t1;
> > create table t3 (i3 int4) inherits t2;
> > alter table t1 add column i4 int4;
> 
> > For each table,the list of (column, logical number, physical number)
> > would be as follows.
> 
> > t1   (i1, 1, 1) (i4, 2, 2)
> > t2   (i1, 1, 1) (i4, 2, 3) (i2, 3, 2)
> > t3   (i1, 1, 1) (i4, 2, 4) (i2, 3, 2) (i3, 4, 3)
> 
> > At this point the compilation of 'select * from t1(*?)' would mean
> >     select (physical #1),(physical #2) from t1  +
> >     select (physical #1),(physical #3) from t2  +
> >     select (physical #1),(physical #4) from t3 
> 
> > Note that physical # aren't common for column i4.
> 
> That's no different from the current situation:

Yes, your proposal has no problem currently. I'm only
anxious about the OO feature. Recently there has been a
discussion around OO, and we can expect progress
in the near future. If the OO people don't mind,
your proposal would be OK.
Regards.

Hiroshi Inoue
Inoue@tpf.co.jp



Re: ALTER TABLE DROP COLUMN

From
Bruce Momjian
Date:
OK, I am opening this can of worms again.  I personally would like to
see this code activated, even if it does take 2x the disk space to alter
a column.  Hiroshi had other ideas.  Where did we leave this?  We have
one month to decide on a plan.


> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > You can exclusively lock the table, then do a heap_getnext() scan over
> > the entire table, remove the dropped column, do a heap_insert(), then a
> > heap_delete() on the current tuple, making sure to skip over the tuples
> > inserted by the current transaction.  When completed, remove the column
> > from pg_attribute, mark the transaction as committed (if desired), and
> > run vacuum over the table to remove the deleted rows.
> 
> Hmm, that would work --- the new tuples commit at the same instant that
> the schema updates commit, so it should be correct.  You have the 2x
> disk usage problem, but there's no way around that without losing
> rollback ability.
> 
> A potentially tricky bit will be persuading the tuple-reading and tuple-
> writing subroutines to pay attention to different versions of the tuple
> structure for the same table.  I haven't looked to see if this will be
> difficult or not.  If you can pass the TupleDesc explicitly then it
> shouldn't be a problem.
> 
> I'd suggest that the cleanup vacuum *not* be an automatic part of
> the operation; just recommend that people do it ASAP after dropping
> a column.  Consider needing to drop several columns...
> 
>             regards, tom lane
> 
> ************
> 
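
A toy model of the quoted rewrite scheme (illustrative Python only; the heap_* comments merely echo the real backend routine names): new tuples are appended while the old ones are still on disk, which is exactly where the 2x disk usage comes from, and the deletions and the new schema take effect together at commit.

```python
# Toy model of the copy-in-place DROP COLUMN.  New tuples are appended
# while the old ones are still present (the 2x disk space), old tuples
# are marked deleted, and at "commit" only the new versions survive.

def drop_column(heap, columns, victim):
    keep = [i for i, c in enumerate(columns) if c != victim]
    ours = set()                          # tuples this transaction inserted
    n = 0
    while n < len(heap):                  # heap_getnext()-style scan
        tup = heap[n]
        n += 1
        if id(tup) in ours:
            continue                      # skip our own insertions
        new = [tup[i] for i in keep]
        heap.append(new)                  # heap_insert(): old + new on disk
        ours.add(id(new))
        tup.append("DELETED")             # heap_delete() marker
    # commit + vacuum: deleted versions go away, new schema is in effect
    heap[:] = [t for t in heap if id(t) in ours]
    return [c for c in columns if c != victim]

heap = [[1, "x", True], [2, "y", False]]
print(drop_column(heap, ["id", "tag", "flag"], "tag"))   # ['id', 'flag']
print(heap)                                              # [[1, True], [2, False]]
```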


-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: ALTER TABLE DROP COLUMN

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> OK, I am opening this can of worms again.  I personally would like to
> see this code activated, even if it does take 2x the disk space to alter
> a column.  Hiroshi had other ideas.  Where did we leave this?  We have
> one month to decide on a plan.

I think the plan should be to do nothing for 7.1.  ALTER DROP COLUMN
isn't an especially pressing feature, and so I don't feel that we
should be hustling to squeeze it in just before beta.  We're already
overdue for beta.
        regards, tom lane


RE: ALTER TABLE DROP COLUMN

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: Tom Lane
>
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > OK, I am opening this can of worms again.  I personally would like to
> > see this code activated, even if it does take 2x the disk space to alter
> > a column.  Hiroshi had other ideas.  Where did we leave this?  We have
> > one month to decide on a plan.
>
> I think the plan should be to do nothing for 7.1.  ALTER DROP COLUMN
> isn't an especially pressing feature, and so I don't feel that we
> should be hustling to squeeze it in just before beta.  We're already
> overdue for beta.
>

Seems some people expect the implementation in 7.1.
(recent [GENERAL] drop column?)
I could commit my local branch if people don't mind
backward incompatibility.
I've maintained the branch for more than 1 month
and it implements the following TODOs.

* Add ALTER TABLE DROP COLUMN feature
* ALTER TABLE ADD COLUMN to inherited table put column in wrong place
* Prevent column dropping if column is used by foreign key

Comments ?

Hiroshi Inoue

P.S. I've noticed that get_rte_attribute_name() seems to
break my implementation. I'm not sure if I could solve it.



Re: ALTER TABLE DROP COLUMN

From
Tom Lane
Date:
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> Seems some people expect the implementation in 7.1.
> (recent [GENERAL} drop column?)
> I could commit my local branch if people don't mind
> backward incompatibility.

I've lost track --- is this different from the _DROP_COLUMN_HACK__
code that's already in CVS?  I really really didn't like that
implementation :-(, but I forget what other methods were being
discussed.

> P.S. I've noticed that get_rte_attribute_name() seems to
> break my implementation. I'm not sure if I could solve it.

That would be a problem --- rule dumping depends on that code to
produce correct aliases, so making it work is not optional.
        regards, tom lane


Re: ALTER TABLE DROP COLUMN

From
Hiroshi Inoue
Date:

Tom Lane wrote:

> "Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> > Seems some people expect the implementation in 7.1.
> > (recent [GENERAL} drop column?)
> > I could commit my local branch if people don't mind
> > backward incompatibility.
>
> I've lost track --- is this different from the _DROP_COLUMN_HACK__
> code that's already in CVS?  I really really didn't like that
> implementation :-(, but I forget what other methods were being
> discussed.
>

My current local trial implementation follows your idea (logical/
physical attribute numbers).


> > P.S. I've noticed that get_rte_attribute_name() seems to
> > break my implementation. I'm not sure if I could solve it.
>
> That would be a problem --- rule dumping depends on that code to
> produce correct aliases, so making it work is not optional.
>

Your change has no problem if logical==physical attribute
numbers.

Regards.

Hiroshi Inoue



Re: ALTER TABLE DROP COLUMN

From
Tom Lane
Date:
Hiroshi Inoue <Inoue@tpf.co.jp> writes:
>>>> P.S. I've noticed that get_rte_attribute_name() seems to
>>>> break my implementation. I'm not sure if I could solve it.
>> 
>> That would be a problem --- rule dumping depends on that code to
>> produce correct aliases, so making it work is not optional.

> Your change has no problem if logical==physical attribute
> numbers.

But if they're not, what do we do?  Can we define the order of the
alias-name lists as being one or the other numbering?  (Offhand I'd
say it should be logical numbering, but I haven't chased the details.)
If neither of those work, we'll need some more complex datastructure
than a simple list.
        regards, tom lane


Re: ALTER TABLE DROP COLUMN

From
Hiroshi Inoue
Date:

Tom Lane wrote:

> Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> >>>> P.S. I've noticed that get_rte_attribute_name() seems to
> >>>> break my implementation. I'm not sure if I could solve it.
> >>
> >> That would be a problem --- rule dumping depends on that code to
> >> produce correct aliases, so making it work is not optional.
>
> > Your change has no problem if logical==physical attribute
> > numbers.
>
> But if they're not, what do we do?  Can we define the order of the
> alias-name lists as being one or the other numbering?  (Offhand I'd
> say it should be logical numbering, but I haven't chased the details.)
> If neither of those work, we'll need some more complex datastructure
> than a simple list.
>

I'm not sure if we could keep invariant attribute numbers.
Though I've used physical attribute numbers as much as possible
in my trial implementation, there's already an exception:
I had to use logical attribute numbers for the FieldSelect node.

Regards.

Hiroshi Inoue




Re: ALTER TABLE DROP COLUMN

From
Philip Warner
Date:
At 12:05 6/10/00 +0900, Hiroshi Inoue wrote:
>
>Tom Lane wrote:
>
>> Hiroshi Inoue <Inoue@tpf.co.jp> writes:
>> >>>> P.S. I've noticed that get_rte_attribute_name() seems to
>> >>>> break my implementation. I'm not sure if I could solve it.
>> >>
>> >> That would be a problem --- rule dumping depends on that code to
>> >> produce correct aliases, so making it work is not optional.
>>
>> > Your change has no problem if logical==physical attribute
>> > numbers.
>>
>> But if they're not, what do we do?  Can we define the order of the
>> alias-name lists as being one or the other numbering?  (Offhand I'd
>> say it should be logical numbering, but I haven't chased the details.)
>> If neither of those work, we'll need some more complex datastructure
>> than a simple list.
>>
>
>I'm not sure if we could keep invariant attribute numbers.
>Though I've used physical attribute numbers as many as possible
>in my trial implementation,there's already an exception.
>I had to use logical attribute numbers for FieldSelect node.
>

Not really a useful suggestion at this stage, but it seems to me that
storing plans and/or parse trees is possibly a false economy. Would it be
worth considering storing the relevant SQL (or a parse tree with field &
table names) and compiling the rule in each backend the first time it is
used (and keeping it for the life of the backend)?

This would allow underlying view tables to be deleted/added as well as make
the above problem go away. The 'parse tree with names' would also enable
easy construction of dependency information when and if that is implemented...


----------------------------------------------------------------
Philip Warner
Albatross Consulting Pty. Ltd.  (A.B.N. 75 008 659 498)
Tel: (+61) 0500 83 82 81
Fax: (+61) 0500 83 82 82
Http://www.rhyme.com.au
PGP key available upon request, and from pgp5.ai.mit.edu:11371


Re: ALTER TABLE DROP COLUMN

From
The Hermit Hacker
Date:
seconded ...

On Fri, 29 Sep 2000, Tom Lane wrote:

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > OK, I am opening this can of worms again.  I personally would like to
> > see this code activated, even if it does take 2x the disk space to alter
> > a column.  Hiroshi had other ideas.  Where did we leave this?  We have
> > one month to decide on a plan.
> 
> I think the plan should be to do nothing for 7.1.  ALTER DROP COLUMN
> isn't an especially pressing feature, and so I don't feel that we
> should be hustling to squeeze it in just before beta.  We're already
> overdue for beta.
> 
>             regards, tom lane
> 

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 



Re: ALTER TABLE DROP COLUMN

From
The Hermit Hacker
Date:
On Thu, 5 Oct 2000, Tom Lane wrote:

> "Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> > Seems some people expect the implementation in 7.1.
> > (recent [GENERAL} drop column?)
> > I could commit my local branch if people don't mind
> > backward incompatibility.

there have been several ideas thrown back and forth ... the best one that
I saw, forgetting who suggested it, had to do with the idea of locking the
table and doing an effective vacuum on that table with a 'row re-write'
happening ...

Basically, move the first 100 rows to the end of the table file, then take
100 and write it to position 0, 101 to position 1, etc ... that way, at
max, you are using ( tuple * 100 ) bytes of disk space, vs 2x the table
size ... either method is going to lock the file for a period of time, but
one is much more friendly as far as disk space is concerned *plus*, if RAM
is available for this, it might even be something that the backend could
use up to -S blocks of RAM to do it off disk?  If I set -S to 64meg, and
the table is 24Meg in size, it could do it all in memory?
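
The chunked shuffle can be modeled on a plain list (illustrative only; a chunk of 2 stands in for the 100 rows): stage the first chunk at the end of the file, then overwrite each freed slot with the projected version of the row one chunk ahead.

```python
# Toy model of the chunked in-place rewrite: only chunk * tuple extra
# space is used instead of a full 2x copy.  Rows come back rotated by
# one chunk, which a heap table would not care about.

def chunked_rewrite(rows, keep, chunk=2):
    n = len(rows)
    rows.extend(list(r) for r in rows[:chunk])   # stage first chunk at end
    for dst in range(n):                         # overwrite freed slots
        src = rows[dst + chunk]
        rows[dst] = [src[i] for i in keep]       # drop columns not in keep
    del rows[n:]                                 # truncate staged copies
    return rows

data = [[1, "a"], [2, "b"], [3, "c"], [4, "d"]]
print(chunked_rewrite(data, keep=[0]))
# [[3], [4], [1], [2]]
```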




Re: ALTER TABLE DROP COLUMN

From
Bruce Momjian
Date:
> Basically, move the first 100 rows to the end of the table file, then take
> 100 and write it to position 0, 101 to position 1, etc ... that way, at
> max, you are using ( tuple * 100 ) bytes of disk space, vs 2x the table
> size ... either method is going to lock the file for a period of time, but
> one is much more friendly as far as disk space is concerned *plus*, if RAM
> is available for this, it might even be something that the backend could
> use up to -S blocks of RAM to do it off disk?  If I set -S to 64meg, and
> the table is 24Meg in size, it could do it all in memory?

Yes, I liked that too.



Re: ALTER TABLE DROP COLUMN

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> Basically, move the first 100 rows to the end of the table file, then take
>> 100 and write it to position 0, 101 to position 1, etc ... that way, at
>> max, you are using ( tuple * 100 ) bytes of disk space, vs 2x the table
>> size ... either method is going to lock the file for a period of time, but
>> one is much more friendly as far as disk space is concerned *plus*, if RAM
>> is available for this, it might even be something that the backend could
>> use up to -S blocks of RAM to do it off disk?  If I set -S to 64meg, and
>> the table is 24Meg in size, it could do it all in memory?

> Yes, I liked that too.

What happens if you crash partway through?

I don't think it's possible to build a crash-robust rewriting ALTER
process that doesn't use 2X disk space: you must have all the old tuples
AND all the new tuples down on disk simultaneously just before you
commit.  The only way around 2X disk space is to adopt some logical
renumbering approach to the columns, so that you can pretend the dropped
column isn't there anymore when it really still is.
        regards, tom lane


Re: ALTER TABLE DROP COLUMN

From
Bruce Momjian
Date:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> Basically, move the first 100 rows to the end of the table file, then take
> >> 100 and write it to position 0, 101 to position 1, etc ... that way, at
> >> max, you are using ( tuple * 100 ) bytes of disk space, vs 2x the table
> >> size ... either method is going to lock the file for a period of time, but
> >> one is much more friendly as far as disk space is concerned *plus*, if RAM
> >> is available for this, it might even be something that the backend could
> >> use up to -S blocks of RAM to do it off disk?  If I set -S to 64meg, and
> >> the table is 24Meg in size, it could do it all in memory?
> 
> > Yes, I liked that too.
> 
> What happens if you crash partway through?
> 
> I don't think it's possible to build a crash-robust rewriting ALTER
> process that doesn't use 2X disk space: you must have all the old tuples
> AND all the new tuples down on disk simultaneously just before you
> commit.  The only way around 2X disk space is to adopt some logical
> renumbering approach to the columns, so that you can pretend the dropped
> column isn't there anymore when it really still is.

Yes, I liked the 2X disk space, and making the new tuples visible all at
once at the end.



Re: ALTER TABLE DROP COLUMN

From
The Hermit Hacker
Date:
On Mon, 9 Oct 2000, Tom Lane wrote:

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> Basically, move the first 100 rows to the end of the table file, then take
> >> 100 and write it to position 0, 101 to position 1, etc ... that way, at
> >> max, you are using ( tuple * 100 ) bytes of disk space, vs 2x the table
> >> size ... either method is going to lock the file for a period of time, but
> >> one is much more friendly as far as disk space is concerned *plus*, if RAM
> >> is available for this, it might even be something that the backend could
> >> use up to -S blocks of RAM to do it off disk?  If I set -S to 64meg, and
> >> the table is 24Meg in size, it could do it all in memory?
> 
> > Yes, I liked that too.
> 
> What happens if you crash partway through?

what happens if you crash partway through a vacuum?

> I don't think it's possible to build a crash-robust rewriting ALTER
> process that doesn't use 2X disk space: you must have all the old
> tuples AND all the new tuples down on disk simultaneously just before
> you commit.  The only way around 2X disk space is to adopt some
> logical renumbering approach to the columns, so that you can pretend
> the dropped column isn't there anymore when it really still is.

how about a combination of the two?  basically, we're gonna want a vacuum
of the table after the alter to clean out those extra columns that we've
marked as 'dead' ... anything that avoids that whole 2x disk
space option is cool ...




Re: ALTER TABLE DROP COLUMN

From
The Hermit Hacker
Date:
On Mon, 9 Oct 2000, Bruce Momjian wrote:

> > Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > >> Basically, move the first 100 rows to the end of the table file, then take
> > >> 100 and write it to position 0, 101 to position 1, etc ... that way, at
> > >> max, you are using ( tuple * 100 ) bytes of disk space, vs 2x the table
> > >> size ... either method is going to lock the file for a period of time, but
> > >> one is much more friendly as far as disk space is concerned *plus*, if RAM
> > >> is available for this, it might even be something that the backend could
> > >> use up to -S blocks of RAM to do it off disk?  If I set -S to 64meg, and
> > >> the table is 24Meg in size, it could do it all in memory?
> > 
> > > Yes, I liked that too.
> > 
> > What happens if you crash partway through?
> > 
> > I don't think it's possible to build a crash-robust rewriting ALTER
> > process that doesn't use 2X disk space: you must have all the old tuples
> > AND all the new tuples down on disk simultaneously just before you
> > commit.  The only way around 2X disk space is to adopt some logical
> > renumbering approach to the columns, so that you can pretend the dropped
> > column isn't there anymore when it really still is.
> 
> Yes, I liked the 2X disk space, and making the new tuples visible all at
> once at the end.

man, are you ever wishy-washy on this issue, aren't you? :)  you like not
using 2x, you like using 2x ... :)




Re: ALTER TABLE DROP COLUMN

From
Tom Lane
Date:
The Hermit Hacker <scrappy@hub.org> writes:
>> What happens if you crash partway through?

> what happens if you crash partway through a vacuum?

Nothing.  Vacuum is crash-safe.  ALTER TABLE should be too.
        regards, tom lane


Re: ALTER TABLE DROP COLUMN

From
The Hermit Hacker
Date:
On Mon, 9 Oct 2000, Tom Lane wrote:

> The Hermit Hacker <scrappy@hub.org> writes:
> >> What happens if you crash partway through?
> 
> > what happens if you crash partway through a vacuum?
> 
> Nothing.  Vacuum is crash-safe.  ALTER TABLE should be too.

Sorry, that's what I meant ... why should marking a column as 'deleted'
and running a 'vacuum' to clean up the physical table be any less
crash-safe?  



Re: ALTER TABLE DROP COLUMN

From
Bruce Momjian
Date:
> On Mon, 9 Oct 2000, Tom Lane wrote:
> 
> > The Hermit Hacker <scrappy@hub.org> writes:
> > >> What happens if you crash partway through?
> > 
> > > what happens if you crash partway through a vacuum?
> > 
> > Nothing.  Vacuum is crash-safe.  ALTER TABLE should be too.
> 
> Sorry, that's what I meant ... why should marking a column as 'deleted'
> and running a 'vacuum' to clean up the physical table be any less
> crash-safe?  

It is not.  The only downside is 2x disk space to make new versions of
the tuple.



Re: ALTER TABLE DROP COLUMN

From
The Hermit Hacker
Date:
On Mon, 9 Oct 2000, Bruce Momjian wrote:

> > On Mon, 9 Oct 2000, Tom Lane wrote:
> > 
> > > The Hermit Hacker <scrappy@hub.org> writes:
> > > >> What happens if you crash partway through?
> > > 
> > > > what happens if you crash partway through a vacuum?
> > > 
> > > Nothing.  Vacuum is crash-safe.  ALTER TABLE should be too.
> > 
> > Sorry, that's what I meant ... why should marking a column as 'deleted'
> > and running a 'vacuum' to clean up the physical table be any less
> > crash-safe?  
> 
> It is not.  The only downside is 2x disk space to make new versions of
> the tuple.

huh?  vacuum moves/cleans up tuples, as well as compresses them, so that
the end result is a smaller table than what it started with, with very
little increase in the total size/space needed to perform the vacuum ...

if we reduced vacuum such that it compressed at the field level vs tuple,
we could move a few tuples to the end of the table (crash safe) and then
move N+1 to position 1 minus that extra field.  If we mark the column as
being deleted, then if the system crashes part way through, it should be
possible to continue after the system is brought up, no?




Re: ALTER TABLE DROP COLUMN

From
Bruce Momjian
Date:
> > > Sorry, that's what I meant ... why should marking a column as 'deleted'
> > > and running a 'vacuum' to clean up the physical table be any less
> > > crash-safe?  
> > 
> > It is not.  The only downside is 2x disk space to make new versions of
> > the tuple.
> 
> huh?  vacuum moves/cleans up tuples, as well as compresses them, so that
> the end result is a smaller table then what it started with, at/with very
> little increase in the total size/space needed to perform the vacuum ...
> 
> if we reduced vacuum such that it compressed at the field level vs tuple,
> we could move a few tuples to the end of the table (crash safe) and then
> move N+1 to position 1 minus that extra field.  If we mark the column as
> being deleted, then if the system crashes part way through, it should be
> possible to continue after the system is brought up, no?

If it crashes in the middle, some rows have the column removed, and some
do not.



Re: ALTER TABLE DROP COLUMN

From
The Hermit Hacker
Date:
On Mon, 9 Oct 2000, Bruce Momjian wrote:

> > > > Sorry, that's what I meant ... why should marking a column as 'deleted'
> > > > and running a 'vacuum' to clean up the physical table be any less
> > > > crash-safe?  
> > > 
> > > It is not.  The only downside is 2x disk space to make new versions of
> > > the tuple.
> > 
> > huh?  vacuum moves/cleans up tuples, as well as compresses them, so that
> > the end result is a smaller table then what it started with, at/with very
> > little increase in the total size/space needed to perform the vacuum ...
> > 
> > if we reduced vacuum such that it compressed at the field level vs tuple,
> > we could move a few tuples to the end of the table (crash safe) and then
> > move N+1 to position 1 minus that extra field.  If we mark the column as
> > being deleted, then if the system crashes part way through, it should be
> > possible to continue after the system is brought up, no?
> 
> If it crashes in the middle, some rows have the column removed, and some
> do not.

hrmm .. MVCC uses a timestamp, no?  is there no way of using that
timestamp to determine which columns have/haven't been cleaned up
following a crash?  maybe some way of marking a table as being in a 'drop
column' mode, so that when it gets brought back up again, it is scanned for
any tuples older than that date?



Re: ALTER TABLE DROP COLUMN

From
Tom Lane
Date:
The Hermit Hacker <scrappy@hub.org> writes:
>> It is not.  The only downside is 2x disk space to make new versions of
>> the tuple.

> huh?  vacuum moves/cleans up tuples, as well as compresses them, so that
> the end result is a smaller table then what it started with, at/with very
> little increase in the total size/space needed to perform the vacuum ...

Huh? right back at you ;-).  Vacuum is very careful to make sure that
it always has two copies of any tuple it moves.  The reason it's not 2x
disk space is that it only moves tuples to fill free space in existing
pages of the file.  So the moved tuples take up space-that-was-free as
well as the space they were originally in.  But this has nothing
whatever to do with the requirements of ALTER DROP COLUMN --- to be
safe, that must have two copies of every tuple, free space or no free
space.
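The two-copy invariant Tom describes can be sketched in miniature.  This is
an illustrative model only, not PostgreSQL source: the "page" is just a list
of slots, and the function and names are invented for the example.  The key
point is the ordering: the second copy is written into a free slot *before*
the original is freed, so a crash between the two steps leaves two copies of
a tuple, never zero.

```python
# Toy model of vacuum's crash-safe tuple move: a tuple is first copied
# into an existing free slot, and only then is its original slot freed.
# Moved tuples therefore consume space-that-was-free, not new space.

def compact(pages):
    """pages: list of slots, each a tuple value or None (free slot)."""
    free = [i for i, slot in enumerate(pages) if slot is None]
    # Walk from the end of the "file", moving live tuples into earlier
    # free slots, just as vacuum fills holes in existing pages.
    for src in range(len(pages) - 1, -1, -1):
        if pages[src] is None:
            continue                  # nothing to move
        if not free or free[0] >= src:
            break                     # no free slot earlier than src
        dst = free.pop(0)
        pages[dst] = pages[src]       # step 1: second copy now exists
        pages[src] = None             # step 2: original freed (safe point)
    return pages
```

A crash simulated between step 1 and step 2 would leave the tuple present
twice; recovery can discard either copy, so no data is ever lost.
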
        regards, tom lane


Re: ALTER TABLE DROP COLUMN

From
Tom Lane
Date:
The Hermit Hacker <scrappy@hub.org> writes:
> hrmm .. mvcc uses a timestamp, no?  is there no way of using that
> timestamp to determine which columns have/haven't been cleaned up
> following a crash?  maybe some way of marking a table as being in a 'drop
> column' mode, so that when it gets brought back up again, it is scan'd for
> any tuples older then that date?  

WAL would provide the framework to do something like that, but I still
say it'd be a bad idea.  What you're describing is
irrevocable-once-it-starts DROP COLUMN; there is no way to roll it back.
We're trying to get rid of statements that act that way, not add more.

I am not convinced that a 2x penalty for DROP COLUMN is such a huge
problem that we should give up all the normal safety features of SQL
in order to avoid it.  Seems to me that DROP COLUMN is only a big issue
during DB development, when you're usually working with relatively small
amounts of test data anyway.
        regards, tom lane


Re: ALTER TABLE DROP COLUMN

From
The Hermit Hacker
Date:
On Mon, 9 Oct 2000, Tom Lane wrote:

> The Hermit Hacker <scrappy@hub.org> writes:
> > hrmm .. mvcc uses a timestamp, no?  is there no way of using that
> > timestamp to determine which columns have/haven't been cleaned up
> > following a crash?  maybe some way of marking a table as being in a 'drop
> > column' mode, so that when it gets brought back up again, it is scan'd for
> > any tuples older then that date?  
> 
> WAL would provide the framework to do something like that, but I still
> say it'd be a bad idea.  What you're describing is
> irrevocable-once-it-starts DROP COLUMN; there is no way to roll it back.
> We're trying to get rid of statements that act that way, not add more.

Hrmmmm ... this one I can't really argue with, or, at least, can't think of
anything right now :(

> I am not convinced that a 2x penalty for DROP COLUMN is such a huge
> problem that we should give up all the normal safety features of SQL
> in order to avoid it.  Seems to me that DROP COLUMN is only a big
> issue during DB development, when you're usually working with
> relatively small amounts of test data anyway.

Actually, I could see DROP COLUMN being useful in a few other places
... recently, I spent several hours re-structuring a client's database that
had been built by someone else who didn't know what 'relational' means in
RDBMS ... or how about an application developer that decides to
restructure their schemas in a new release and provides an 'upgrade.sql'
script that is designed to do this?

A good example might be the UDMSearch stuff, where you have tables that
are quite large, but they decide that they want to remove the 'base URL'
component of one table and put it into another table.  A nice update
script could go something like (pseudo like):

ALTER TABLE old_table ADD COLUMN base_url int;
INSERT INTO new_table SELECT base_url_text FROM old_table;
ALTER TABLE old_table DROP COLUMN base_url_text;

That would make for a very painful upgrade process if I have to go through
the trouble of upgrading my hardware to add more space ...



Re: ALTER TABLE DROP COLUMN

From
Bruce Momjian
Date:
> The Hermit Hacker <scrappy@hub.org> writes:
> > hrmm .. mvcc uses a timestamp, no?  is there no way of using that
> > timestamp to determine which columns have/haven't been cleaned up
> > following a crash?  maybe some way of marking a table as being in a 'drop
> > column' mode, so that when it gets brought back up again, it is scan'd for
> > any tuples older then that date?  
> 
> WAL would provide the framework to do something like that, but I still
> say it'd be a bad idea.  What you're describing is
> irrevocable-once-it-starts DROP COLUMN; there is no way to roll it back.
> We're trying to get rid of statements that act that way, not add more.
> 
> I am not convinced that a 2x penalty for DROP COLUMN is such a huge
> problem that we should give up all the normal safety features of SQL
> in order to avoid it.  Seems to me that DROP COLUMN is only a big issue
> during DB development, when you're usually working with relatively small
> amounts of test data anyway.
> 

Bingo!



Re: ALTER TABLE DROP COLUMN

From
The Hermit Hacker
Date:
On Mon, 9 Oct 2000, Bruce Momjian wrote:

> > The Hermit Hacker <scrappy@hub.org> writes:
> > > hrmm .. mvcc uses a timestamp, no?  is there no way of using that
> > > timestamp to determine which columns have/haven't been cleaned up
> > > following a crash?  maybe some way of marking a table as being in a 'drop
> > > column' mode, so that when it gets brought back up again, it is scan'd for
> > > any tuples older then that date?  
> > 
> > WAL would provide the framework to do something like that, but I still
> > say it'd be a bad idea.  What you're describing is
> > irrevocable-once-it-starts DROP COLUMN; there is no way to roll it back.
> > We're trying to get rid of statements that act that way, not add more.
> > 
> > I am not convinced that a 2x penalty for DROP COLUMN is such a huge
> > problem that we should give up all the normal safety features of SQL
> > in order to avoid it.  Seems to me that DROP COLUMN is only a big issue
> > during DB development, when you're usually working with relatively small
> > amounts of test data anyway.
> > 
> 
> Bingo!

you are jumping on your 'I agree/Bingo' much much too fast :)  




Re: ALTER TABLE DROP COLUMN

From
Bruce Momjian
Date:
> On Mon, 9 Oct 2000, Bruce Momjian wrote:
> 
> > > The Hermit Hacker <scrappy@hub.org> writes:
> > > > hrmm .. mvcc uses a timestamp, no?  is there no way of using that
> > > > timestamp to determine which columns have/haven't been cleaned up
> > > > following a crash?  maybe some way of marking a table as being in a 'drop
> > > > column' mode, so that when it gets brought back up again, it is scan'd for
> > > > any tuples older then that date?  
> > > 
> > > WAL would provide the framework to do something like that, but I still
> > > say it'd be a bad idea.  What you're describing is
> > > irrevocable-once-it-starts DROP COLUMN; there is no way to roll it back.
> > > We're trying to get rid of statements that act that way, not add more.
> > > 
> > > I am not convinced that a 2x penalty for DROP COLUMN is such a huge
> > > problem that we should give up all the normal safety features of SQL
> > > in order to avoid it.  Seems to me that DROP COLUMN is only a big issue
> > > during DB development, when you're usually working with relatively small
> > > amounts of test data anyway.
> > > 
> > 
> > Bingo!
> 
> you are jumping on your 'I agree/Bingo' much much too fast :)  

You know this DROP COLUMN is a hot button for me.



Re: ALTER TABLE DROP COLUMN

From
The Hermit Hacker
Date:
On Mon, 9 Oct 2000, Bruce Momjian wrote:

> > On Mon, 9 Oct 2000, Bruce Momjian wrote:
> > 
> > > > The Hermit Hacker <scrappy@hub.org> writes:
> > > > > hrmm .. mvcc uses a timestamp, no?  is there no way of using that
> > > > > timestamp to determine which columns have/haven't been cleaned up
> > > > > following a crash?  maybe some way of marking a table as being in a 'drop
> > > > > column' mode, so that when it gets brought back up again, it is scan'd for
> > > > > any tuples older then that date?  
> > > > 
> > > > WAL would provide the framework to do something like that, but I still
> > > > say it'd be a bad idea.  What you're describing is
> > > > irrevocable-once-it-starts DROP COLUMN; there is no way to roll it back.
> > > > We're trying to get rid of statements that act that way, not add more.
> > > > 
> > > > I am not convinced that a 2x penalty for DROP COLUMN is such a huge
> > > > problem that we should give up all the normal safety features of SQL
> > > > in order to avoid it.  Seems to me that DROP COLUMN is only a big issue
> > > > during DB development, when you're usually working with relatively small
> > > > amounts of test data anyway.
> > > > 
> > > 
> > > Bingo!
> > 
> > you are jumping on your 'I agree/Bingo' much much too fast :)  
> 
> You know this DROP COLUMN is a hot button for me.

Ya, but in one email, you appear to agree with me ... then Tom posts a
good point and you jump over to that side ... at least pick a side? :)  I
too wish to see it implemented, I just don't want to have to double my
disk space if at some point I decide to upgrade an application and find
out that they decided to change their schema(s) :(




Re: ALTER TABLE DROP COLUMN

From
Bruce Momjian
Date:
> > > > > I am not convinced that a 2x penalty for DROP COLUMN is such a huge
> > > > > problem that we should give up all the normal safety features of SQL
> > > > > in order to avoid it.  Seems to me that DROP COLUMN is only a big issue
> > > > > during DB development, when you're usually working with relatively small
> > > > > amounts of test data anyway.
> > > > > 
> > > > 
> > > > Bingo!
> > > 
> > > you are jumping on your 'I agree/Bingo' much much too fast :)  
> > 
> > You know this DROP COLUMN is a hot button for me.
> 
> Ya, but in one email, you appear to agree with me ... then Tom posts a
> good point and you jump over to that side ... at least pick a side? :)  I
> too wish to see it implemented, I just don't want to have to double my
> disk space if at some point I decide to upgrade an application and find
> out that they decided to change their schema(s) :(

Sorry, I liked the vacuum idea, but 2x disk usage, not 100 at a time.



Re: ALTER TABLE DROP COLUMN

From
Don Baccus
Date:
At 07:55 PM 10/9/00 -0300, The Hermit Hacker wrote:

>> I am not convinced that a 2x penalty for DROP COLUMN is such a huge
>> problem that we should give up all the normal safety features of SQL
>> in order to avoid it.  Seems to me that DROP COLUMN is only a big
>> issue during DB development, when you're usually working with
>> relatively small amounts of test data anyway.
>
>Actually, I could see DROP COLUMN being useful in a few other places
>... recently, I spent several hours re-structuring a clients database that
>had been built by someone else who didn't know what 'relational' means in
>RDBMS ... or how about an application developer that decides to
>restructure their schema's in a new release and provides an 'upgrade.sql'
>script that is designed to do this?  

This last example is one reason DROP COLUMN would be a great help to
the OpenACS development effort.

However, upgrades (new releases) are fairly infrequent, and many users of
current versions won't bother unless they've run into toolkit bugs
(same goes for updating PG).  Those who do know that doing an upgrade
will require planning, testing on a system that's not running their
"live" website, and some amount of downtime.

So I don't think a 2x penalty is a huge problem.

>That would make for a very painful upgrade process if I have to go through
>the trouble of upgrading my hardware to add more space ...

For many folks, if eating 2x the size of a single table runs their system out
of disk space, clearly they should've upgraded long, long ago.  An OpenACS
site has hundreds of tables; I can't imagine running my disk space so tight
that I couldn't double the size of one of them long enough to do a DROP
COLUMN.

Obviously, some folks doing other things will have single tables that are
huge, but after all they can always do what they do now - not drop columns.



- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Service
  and other goodies at http://donb.photo.net.


Re: ALTER TABLE DROP COLUMN

From
Hannu Krosing
Date:
The Hermit Hacker wrote:
> 
> On Mon, 9 Oct 2000, Bruce Momjian wrote:
> 
> Ya, but in one email, you appear to agree with me ... then Tom posts a
> good point and you jump over to that side ... at least pick a side? :)  I
> too wish to see it implemented, I just don't want to have to double my
> disk space if at some point I decide to upgrade an application and find
> out that they decided to change their schema(s) :(

As Don already pointed out, if you don't have enough room to double your
table size you must be running a one-table, append-only application where
you can only do a very limited set of queries.

select * from that_table order by some_column_without_an_index; is definitely
out, as it takes many times the space of that_table anyway.

There _may_ be some cases where 2x is unacceptable, but without radically 
changing tables on-disk structure there is no way to avoid it and still be 
able to rollback or even crash cleanly ;)

-----------------
Hannu