Thread: RFC: Restructuring pg_aggregate

RFC: Restructuring pg_aggregate

From
Tom Lane
Date:
I was originally planning to revise pg_aggregate along the same lines
as pg_proc and so forth: add an aggnamespace column and fix the search
code to be namespace-aware.  But it seemed a tad annoying that standard
function lookups would thereby incur *two* namespace-aware searches:
one in pg_aggregate and one in pg_proc.  Thinking about that, it
occurred to me that there is another way.  Why shouldn't aggregate
functions have entries in pg_proc?  Then one search would cover both
possibilities, and we could eliminate some duplicate code in the parser.

Doing things this way would mean that one could not create an aggregate
function with the same name and input arguments as a regular function
(at least, not in the same namespace).  However, doing so has always
been a bad idea, and it seems like it'd be a step forward not back for
the system to reject it as a naming conflict.

A more serious objection is that this will break client applications
that know about the pg_aggregate table.  However, 7.3 is already going
to force a lot of reprogramming of clients that inspect system tables,
because of the addition of namespaces.  Restructuring pg_aggregate
doesn't seem like it makes life all that much worse.

I would envision this working like so:

In pg_proc: add a boolean column "proisagg" to mark function entries
that are aggregates.  A row for an aggregate function would contain
a pointer to a dummy C function that would just raise an error if
called (which shouldn't ever happen, but just in case some bit of
code doesn't get updated, this would be a good safety check).

In pg_aggregate: remove the columns aggname, aggowner, aggbasetype,
aggfinaltype, and add a column aggfnoid containing the OID of the
aggregate's pg_proc row.  (pg_aggregate itself doesn't need OIDs
anymore, and its only index will be on aggfnoid.)  Essentially this
reduces pg_aggregate to an auxiliary extension of pg_proc, carrying
the fields aggtransfn, aggfinalfn, aggtranstype, agginitval for those
pg_proc rows that need them.

An interesting aspect of this is that the catalog structure would now
be prepared to support aggregate functions with more than one input,
which is a feature we've been asked for occasionally.  I am *not*
volunteering to make that happen right now ... but the catalog
structures would be ready for it.

Comments, objections, better ideas?
        regards, tom lane


Re: RFC: Restructuring pg_aggregate

From
Christopher Kings-Lynne
Date:
> A more serious objection is that this will break client applications
> that know about the pg_aggregate table.  However, 7.3 is already going
> to force a lot of reprogramming of clients that inspect system tables,
> because of the addition of namespaces.  Restructuring pg_aggregate
> doesn't seem like it makes life all that much worse.

How about putting a note in the 7.3 release that tells them not to rely on
sequential attnums in tn pg_attribute anymore.  That should make it easier
to implement column dropping in the future?

Chris




Re: RFC: Restructuring pg_aggregate

From
Tom Lane
Date:
Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
> How about putting a note in the 7.3 release that tells them not to rely on
> sequential attnums in tn pg_attribute anymore.  That should make it easier
> to implement column dropping in the future?

That seems like pure speculation to me, if not outright wrong.  If we
can't renumber the attnums it'll be because the table's tuples still
have data at a particular column position.  Since we'll need to know
the datatype of that data (if only to skip over it correctly), there
will still have to be a pg_attribute entry for the dropped column.
Thus, what people will more likely have to watch out for is pg_attribute
rows marked "deleted" in some fashion.

We are actually not that far away from being able to do DROP COLUMN,
if people don't mind being slow to recover the space used by a dropped
column.  It'd work like this:

1. Add an "attisdropped" boolean to pg_attribute.

2. DROP COLUMN sets this flag and changes attname to something like
"***deleted_NNN".  (Changing attname is only necessary to allow the
same column name to be reused without drawing a unique-index error.)
That's it --- it's done.

3. Column lookup, expansion of *, etc have to be taught to ignore
columns marked attisdropped.

The idea is that the extant data sits there but is invisible.  Inserts
of new rows in the table would always insert a NULL in the dropped
column (which'd fall out more or less for free, there being no way
to tell the system to insert anything else).  Over time, UPDATEs of
extant rows would also replace the dropped data with NULLs.

I suspect there are only about half a dozen key places that would have
to explicitly check attisdropped.  None of the low-level executor
machinery would care at all, since it's dealing with "real" tuples where
the attribute is still there, at least as a NULL.

Hiroshi's "DROP_COLUMN_HACK" was essentially along this line, but
I think he made a representational mistake by trying to change the
attnums of dropped columns to be negative values.  That means that
a lot of low-level places *do* need to know about the dropped-column
convention, else they can't make any sense of tuple layouts.  The
negative-attnum idea might have been a little easier for clients
inspecting pg_attribute to cope with, but in practice I think they'd
need to be taught about dropped columns anyway --- as evidenced by
your remark suggesting that gaps in the sequence of positive attnums
would break clients.
        regards, tom lane

PS: Once you have that, try this on for size: ALTER COLUMN is
ALTER DROP COLUMN;ALTER ADD COLUMN newtype;UPDATE foo SET newcol = coercion_fn(oldcol);

That last couldn't be expressed as an SQL statement because the parser
wouldn't allow access to oldcol, but there's nothing stopping it at the
implementation level.

This approach changes the user-visible column ordering, which'd be
a tad annoying, so probably something based on building a new version of
the table would be better.  But as a quick hack this would be doable.

Actually, given the DROP feature a user could do it for himself:
ALTER ADD COLUMN tempcol newtype;UPDATE foo SET tempcol = coercion_fn(oldcol);ALTER DROP COLUMN oldcol;ALTER RENAME
COLUMNtempcol to oldcol;
 

which seems like an okay approach, especially since it'd allow the
UPDATE computing the new column values to be of arbitrary complexity,
not just a simple coercion of one existing column.


Re: RFC: Restructuring pg_aggregate

From
Peter Eisentraut
Date:
Tom Lane writes:

> Why shouldn't aggregate functions have entries in pg_proc?  Then one
> search would cover both possibilities, and we could eliminate some
> duplicate code in the parser.

Furthermore, we could make the new function privileges apply to aggregates
as well.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: RFC: Restructuring pg_aggregate

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> Tom Lane writes:
>> Why shouldn't aggregate functions have entries in pg_proc?

> Furthermore, we could make the new function privileges apply to aggregates
> as well.

Good point.  Another thing that would fall out for free is that the
aggregate type-coercion rules would become exactly like the function
type-coercion rules; right now they are a tad stupider.
        regards, tom lane


Re: RFC: Restructuring pg_aggregate

From
"Christopher Kings-Lynne"
Date:
> That seems like pure speculation to me, if not outright wrong.  If we
> can't renumber the attnums it'll be because the table's tuples still
> have data at a particular column position.  Since we'll need to know
> the datatype of that data (if only to skip over it correctly), there
> will still have to be a pg_attribute entry for the dropped column.
> Thus, what people will more likely have to watch out for is pg_attribute
> rows marked "deleted" in some fashion.

You know there is a way to do this and not break client compatibility.
Rename the current pg_attribute relation to pg_baseatt or something.  Fix
all references to it in the code.  Create a system view called pg_attribute
which is SELECT * (except attisdropped) FROM pg_baseattr WHERE NOT
attisdropped.

More work though, of course.

> We are actually not that far away from being able to do DROP COLUMN,
> if people don't mind being slow to recover the space used by a dropped
> column.  It'd work like this:

Logical vs. physical column numbers would still be quite handy tho.  If
you're going to break compatibility, may as well do all breaks at once?

> 1. Add an "attisdropped" boolean to pg_attribute.
>
> 2. DROP COLUMN sets this flag and changes attname to something like
> "***deleted_NNN".  (Changing attname is only necessary to allow the
> same column name to be reused without drawing a unique-index error.)
> That's it --- it's done.
>
> 3. Column lookup, expansion of *, etc have to be taught to ignore
> columns marked attisdropped.
>
> The idea is that the extant data sits there but is invisible.  Inserts
> of new rows in the table would always insert a NULL in the dropped
> column (which'd fall out more or less for free, there being no way
> to tell the system to insert anything else).  Over time, UPDATEs of
> extant rows would also replace the dropped data with NULLs.

Would it be possible to modify VACUUM FULL in some way so as to permanently
remove these tuples?  Surely people would like an actual space-saving column
drop?

Chris



Re: RFC: Restructuring pg_aggregate

From
Tom Lane
Date:
"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:
> You know there is a way to do this and not break client compatibility.
> Rename the current pg_attribute relation to pg_baseatt or something.  Fix
> all references to it in the code.  Create a system view called pg_attribute
> which is SELECT * (except attisdropped) FROM pg_baseattr WHERE NOT
> attisdropped.

Wasn't your original concern that the attnums wouldn't be consecutive?
How is this view going to hide that?

> Logical vs. physical column numbers would still be quite handy tho.

But confusing as all hell, at *all* levels of the code ... I've thought
about that quite a bit, and I can't see that we could expect to make it
work without a lot of hard-to-find bugs.  Too many places where it's
not instantly obvious which set of numbers you should be using.
        regards, tom lane


Re: RFC: Restructuring pg_aggregate

From
Hiroshi Inoue
Date:
Tom Lane wrote:
> Hiroshi's "DROP_COLUMN_HACK" was essentially along this line, but
> I think he made a representational mistake by trying to change the
> attnums of dropped columns to be negative values. 

Negative attnums had 2 advantages then. It had a big
advantage that initdb isn't needed. Note that it was
only a trial hack and there was no consensus on the way.
It was very easy to change the implementation to use
attisdropped. OTOH physical/logical attnums approach
needed the change on pg_class, pg_attribute and so
I've never had a chance to open the patch to public. 
It was also more sensitive about oversights of needed 
changes than the attisdropped flag approach. 

> That means that
> a lot of low-level places *do* need to know about the dropped-column
> convention, else they can't make any sense of tuple layouts.

Why ? As you already mentioned, there were not that many places
to be changed.

Well what's changed since then ?

regards,
Hiroshi Inoue


Re: RFC: Restructuring pg_aggregate

From
Tom Lane
Date:
Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> Tom Lane wrote:
>> That means that
>> a lot of low-level places *do* need to know about the dropped-column
>> convention, else they can't make any sense of tuple layouts.

> Why ? As you already mentioned, there were not that many places
> to be changed.

There are not many places to change if the implementation uses
attisdropped, because we *only* have to hide the existence of the column
at the parser level.  The guts of the system don't know anything funny
is going on; a dropped column looks the same as an undropped one
throughout the executor.  But with negative attnums, even such basic
routines as heap_formtuple have to know about it, no?
        regards, tom lane


Re: RFC: Restructuring pg_aggregate

From
Bruce Momjian
Date:
Hiroshi Inoue wrote:
> Tom Lane wrote:
>
> > Hiroshi's "DROP_COLUMN_HACK" was essentially along this line, but
> > I think he made a representational mistake by trying to change the
> > attnums of dropped columns to be negative values.
>
> Negative attnums had 2 advantages then. It had a big
> advantage that initdb isn't needed. Note that it was
> only a trial hack and there was no consensus on the way.
> It was very easy to change the implementation to use
> attisdropped. OTOH physical/logical attnums approach
> needed the change on pg_class, pg_attribute and so
> I've never had a chance to open the patch to public.
> It was also more sensitive about oversights of needed
> changes than the attisdropped flag approach.
>
> > That means that
> > a lot of low-level places *do* need to know about the dropped-column
> > convention, else they can't make any sense of tuple layouts.
>
> Why ? As you already mentioned, there were not that many places
> to be changed.
>
> Well what's changed since then ?

Here is an old email from me that outlines the idea of having a
physical/logical attribute numbering system, and the advantages.  For
implementation, I thought we could do most of the work by filtering what
the client saw, and let the server just worry about physical numbering,
except for 'SELECT *' expansion.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: RFC: Restructuring pg_aggregate

From
Hiroshi Inoue
Date:
Tom Lane wrote:
> 
> Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> > Tom Lane wrote:
> >> That means that
> >> a lot of low-level places *do* need to know about the dropped-column
> >> convention, else they can't make any sense of tuple layouts.
> 
> > Why ? As you already mentioned, there were not that many places
> > to be changed.
> 
> There are not many places to change if the implementation uses
> attisdropped, because we *only* have to hide the existence of the column
> at the parser level.  The guts of the system don't know anything funny
> is going on; a dropped column looks the same as an undropped one
> throughout the executor.  But with negative attnums, even such basic
> routines as heap_formtuple have to know about it, no?

When a tuple descriptor is made, the info of
dropped columns are placed at (their physical
position - 1) index in the same way as ordinary
columns. There are few places where conversions
between negative attnums and the physical positions
are needed.

The following is my posting more than 2 years ago.
What's changed since then.

regards,
Hiroshi Inoue
 I don't want a final implementation this time. What I want is to provide a quick hack for both others and me to judge
whetherthis direction is good or not.
 
 My idea is essentially an invisible column implementation. DROP COLUMN would change the target pg_attribute tuple as
follows..attnum ->  an offset - attnum;atttypid ->  0
 
 We would be able to see where to change by tracking error/ crashes caused by this change.
 I would also change attname to '*already dropped %d' for examle to avoid duplicate attname.


Re: RFC: Restructuring pg_aggregate

From
Hiroshi Inoue
Date:
Bruce Momjian wrote:
> 
> Hiroshi Inoue wrote:
> > Tom Lane wrote:
> >
> > > Hiroshi's "DROP_COLUMN_HACK" was essentially along this line, but
> > > I think he made a representational mistake by trying to change the
> > > attnums of dropped columns to be negative values.
> >
> > Negative attnums had 2 advantages then. It had a big
> > advantage that initdb isn't needed. Note that it was
> > only a trial hack and there was no consensus on the way.
> > It was very easy to change the implementation to use
> > attisdropped. OTOH physical/logical attnums approach
> > needed the change on pg_class, pg_attribute and so
> > I've never had a chance to open the patch to public.
> > It was also more sensitive about oversights of needed
> > changes than the attisdropped flag approach.
> >
> > > That means that
> > > a lot of low-level places *do* need to know about the dropped-column
> > > convention, else they can't make any sense of tuple layouts.
> >
> > Why ? As you already mentioned, there were not that many places
> > to be changed.
> >
> > Well what's changed since then ?
> 
> Here is an old email from me that outlines the idea of having a
> physical/logical attribute numbering system, and the advantages. 

I already tried physical/logical attribute implementation 
pretty long ago. Where are new ideas to solve the problems
that the approach has ?

regards,
Hiroshi Inoue


Re: RFC: Restructuring pg_aggregate

From
Bruce Momjian
Date:
Hiroshi Inoue wrote:
> > > Why ? As you already mentioned, there were not that many places
> > > to be changed.
> > >
> > > Well what's changed since then ?
> > 
> > Here is an old email from me that outlines the idea of having a
> > physical/logical attribute numbering system, and the advantages. 
> 
> I already tried physical/logical attribute implementation 
> pretty long ago. Where are new ideas to solve the problems
> that the approach has ?

Good question.  I am suggesting more than just the drop column fix.  It
could be used for smaller data files to reduce padding, fix for
inheritance problems with ADD COLUMN, and performance of moving
varlena's to the end of the row.

Also, my idea was to have the physical/logical mapping happen closer to
the client, so the backend mostly only deals with physical.  I was
thinking of having the libpq backend communication layer actually do the
reordering of the return results.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: RFC: Restructuring pg_aggregate

From
Hiroshi Inoue
Date:
Bruce Momjian wrote:
> 
> Hiroshi Inoue wrote:
> > > > Why ? As you already mentioned, there were not that many places
> > > > to be changed.
> > > >
> > > > Well what's changed since then ?
> > >
> > > Here is an old email from me that outlines the idea of having a
> > > physical/logical attribute numbering system, and the advantages.
> >
> > I already tried physical/logical attribute implementation
> > pretty long ago. Where are new ideas to solve the problems
> > that the approach has ?
> 
> Good question.  I am suggesting more than just the drop column fix.  It
> could be used for smaller data files to reduce padding, fix for
> inheritance problems with ADD COLUMN, and performance of moving
> varlena's to the end of the row.
> 
> Also, my idea was to have the physical/logical mapping happen closer to
> the client, so the backend mostly only deals with physical.

If the client has to bear the some part, isn't the invisible
column approach much simpler ?

I've put a pretty much time into DROP COLUMN feature but
I am really disappointed to see the comments in this thread.
What DROP COLUMN has brought me seems only a waste of time.

Possibly I must have introduced either implementation forcibly.

regards,
Hiroshi Inoue


Re: RFC: Restructuring pg_aggregate

From
"Christopher Kings-Lynne"
Date:
> If the client has to bear the some part, isn't the invisible
> column approach much simpler ?
>
> I've put a pretty much time into DROP COLUMN feature but
> I am really disappointed to see the comments in this thread.
> What DROP COLUMN has brought me seems only a waste of time.

I kind of agree with Hiroshi here.  All I want to be able to do is drop
columns from my tables, and reclaim the space.  I've got all sorts of
production tables with columns just sitting there doing nothing, awaiting
the time that I can happily drop them.  It seems to me that whatever we do
will require some kind of client breakage.

Chris



Re: RFC: Restructuring pg_aggregate

From
Bruce Momjian
Date:
Hiroshi Inoue wrote:
> If the client has to bear the some part, isn't the invisible
> column approach much simpler ?
> 
> I've put a pretty much time into DROP COLUMN feature but
> I am really disappointed to see the comments in this thread.
> What DROP COLUMN has brought me seems only a waste of time.
> 
> Possibly I must have introduced either implementation forcibly.

I understand.  I personally think maybe we have been a little to picky
about patches being accepted.  Sometimes when something is not 100%
perfect, we do nothing rather than accept the patch, and replace or
improve it later.  The DROP COLUMN approach you had clearly is one of
them.

Personally, now that we have relfilenode, I think we should implement
drop of columns by just recreating the table without the column.

The big problem with DROP COLUMN was that we couldn't decide on what to
do, so we did nothing, which is probably worse than just choosing one
and doing it.

Our code is littered with my 80% solutions for LIKE optimization,
optimizer statistics, BETWEEN, and lots of other solutions that have met
a need and are now being replaced with better code.  My code was not
great, but I hadn't dont them, PostgreSQL would have had even more
missing features than we do now.  DROP COLUMN is clearly one where we
missed getting something that works and would keep people happy.

As far as my proposal, my idea was not to do it in the client, but
rather to do it just before the data is sent from/to the client.  Maybe
that is a stupid idea.  I never really researched it.  My idea was more
to make the physical/logical column numbers distinct so certain tricks
could be performed.  It wasn't for DROP COLUMN specifically, and in fact
to do DROP COLUMN with my code, there would have to be more code similar
to what you had where clients would see a column and have to skip it.  I
was focusing more on physical/logical to enable other features.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: RFC: Restructuring pg_aggregate

From
Bruce Momjian
Date:
Christopher Kings-Lynne wrote:
> > If the client has to bear the some part, isn't the invisible
> > column approach much simpler ?
> >
> > I've put a pretty much time into DROP COLUMN feature but
> > I am really disappointed to see the comments in this thread.
> > What DROP COLUMN has brought me seems only a waste of time.
> 
> I kind of agree with Hiroshi here.  All I want to be able to do is drop
> columns from my tables, and reclaim the space.  I've got all sorts of
> production tables with columns just sitting there doing nothing, awaiting
> the time that I can happily drop them.  It seems to me that whatever we do
> will require some kind of client breakage.

Actually, what we need to do to reclaim space is to enable table
recreation without the column, now that we have relfilenode for file
renaming.  It isn't hard to do, but no one has focused on it.  I want to
focus on it, but have not had the time, obviously, and would be very
excited to assist someone else.

Hiroshi's fine idea of marking certain columns as unused would not have
reclaimed the missing space, just as my idea of physical/logical column
distinction would not reclaim the space either.  Again, my
physical/logical idea is more for fixing other problems and
optimization, not DROP COLUMN.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: RFC: Restructuring pg_aggregate

From
"Christopher Kings-Lynne"
Date:
> Actually, what we need to do to reclaim space is to enable table
> recreation without the column, now that we have relfilenode for file
> renaming.  It isn't hard to do, but no one has focused on it.  I want to
> focus on it, but have not had the time, obviously, and would be very
> excited to assist someone else.

I'm happy to help - depends if it's within my skill level or not tho.  Most
of the time the problem I have is finding where to make the changes, not
actually making the changes themselves.  So, count me in.

> Hiroshi's fine idea of marking certain columns as unused would not have
> reclaimed the missing space, just as my idea of physical/logical column
> distinction would not reclaim the space either.  Again, my
> physical/logical idea is more for fixing other problems and
> optimization, not DROP COLUMN.

Question: Is it _possible_ to reclaim the space during a VACUUM FULL?  I do
not know enough about the file format to know this.  What happens if the
VACUUM is stopped halfway thru reclaiming a column in a table?

Bruce: WRT modifying libpq to do the translation - won't this cause probs
for JDBC and ODBC people?

Chris



Re: RFC: Restructuring pg_aggregate

From
Bruce Momjian
Date:
Christopher Kings-Lynne wrote:
> > Actually, what we need to do to reclaim space is to enable table
> > recreation without the column, now that we have relfilenode for file
> > renaming.  It isn't hard to do, but no one has focused on it.  I want to
> > focus on it, but have not had the time, obviously, and would be very
> > excited to assist someone else.
> 
> I'm happy to help - depends if it's within my skill level or not tho.  Most
> of the time the problem I have is finding where to make the changes, not
> actually making the changes themselves.  So, count me in.

OK, let me mention that I have had great success with chat sessions with
PostgreSQL developers.  They can code and ask questions and I can answer
quickly.  Seems to be speeding things along for some people.  I am:AIM    bmomjianICQ    151255111Yahoo    bmomjianMSN
 root@candle.pha.pa.us
 

I am also on the PostgreSQL IRC channel.  As far as where to start, I
think the CLUSTER command would be a good start because it just reorders
the existing table.  Then DROP COLUMN can come out of that by removing
the column during the copy, and removing mention of the column from
pg_attribute, and of course renumbering the gap.

> > Hiroshi's fine idea of marking certain columns as unused would not have
> > reclaimed the missing space, just as my idea of physical/logical column
> > distinction would not reclaim the space either.  Again, my
> > physical/logical idea is more for fixing other problems and
> > optimization, not DROP COLUMN.
> 
> Question: Is it _possible_ to reclaim the space during a VACUUM FULL?  I do
> not know enough about the file format to know this.  What happens if the
> VACUUM is stopped halfway thru reclaiming a column in a table?

Not really.  I moves only whole tuples, and only certain ones.

> Bruce: WRT modifying libpq to do the translation - won't this cause probs
> for JDBC and ODBC people?

No, not in libpq, but rather in backend/libpq, the backend part of the
connection.  My idea is for the user to think things are in a different
order in the row than they actually appear on disk.  I haven't really
researched it enough to understand its validity.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: RFC: Restructuring pg_aggregate

From
Hiroshi Inoue
Date:
Christopher Kings-Lynne wrote:
> 
> > If the client has to bear the some part, isn't the invisible
> > column approach much simpler ?
> >
> > I've put a pretty much time into DROP COLUMN feature but
> > I am really disappointed to see the comments in this thread.
> > What DROP COLUMN has brought me seems only a waste of time.
> 
> I kind of agree with Hiroshi here.  All I want to be able to do is drop
> columns from my tables, and reclaim the space.  I've got all sorts of
> production tables with columns just sitting there doing nothing, awaiting
> the time that I can happily drop them.

> It seems to me that whatever we do
> will require some kind of client breakage.

Physical/logical attnum approach was mainly to not break
clients. I implemented it on trial but the implementation
was hard to maintain unfortunately. It's pretty difficult
to decide whether the number is physical or logical in
many cases.

regards,
Hiroshi Inoue


Re: RFC: Restructuring pg_aggregate

From
Bruce Momjian
Date:
Hiroshi Inoue wrote:
> > It seems to me that whatever we do
> > will require some kind of client breakage.
> 
> Physical/logical attnum approach was mainly to not break
> clients. I implemented it on trial but the implementation
> was hard to maintain unfortunately. It's pretty difficult
> to decide whether the number is physical or logical in
> many cases.

How many cases do we have that use logical numering?  Hiroshi, I know
you are the expert on this.  I know 'SELECT *' uses it, but are their
other places that need to know about the logical ordering of the
columns?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: RFC: Restructuring pg_aggregate

From
Hiroshi Inoue
Date:
Bruce Momjian wrote:
> 
> Hiroshi Inoue wrote:
> > If the client has to bear the some part, isn't the invisible
> > column approach much simpler ?
> >
> > I've put a pretty much time into DROP COLUMN feature but
> > I am really disappointed to see the comments in this thread.
> > What DROP COLUMN has brought me seems only a waste of time.
> >
> > Possibly I must have introduced either implementation forcibly.
> 
> I understand.  I personally think maybe we have been a little to picky
> about patches being accepted.  Sometimes when something is not 100%
> perfect, we do nothing rather than accept the patch, and replace or
> improve it later.  The DROP COLUMN approach you had clearly is one of
> them.

I don't complain about the rejection of my patch.
If it has an essential flaw we had better reject it.
What I'm complaining is why it is OK now whereas
there's nothing new.

regards,
Hiroshi Inoue


Re: RFC: Restructuring pg_aggregate

From
Bruce Momjian
Date:
Hiroshi Inoue wrote:
> Bruce Momjian wrote:
> > 
> > Hiroshi Inoue wrote:
> > > If the client has to bear the some part, isn't the invisible
> > > column approach much simpler ?
> > >
> > > I've put a pretty much time into DROP COLUMN feature but
> > > I am really disappointed to see the comments in this thread.
> > > What DROP COLUMN has brought me seems only a waste of time.
> > >
> > > Possibly I must have introduced either implementation forcibly.
> > 
> > I understand.  I personally think maybe we have been a little to picky
> > about patches being accepted.  Sometimes when something is not 100%
> > perfect, we do nothing rather than accept the patch, and replace or
> > improve it later.  The DROP COLUMN approach you had clearly is one of
> > them.
> 
> I don't complain about the rejection of my patch.
> If it has an essential flaw we had better reject it.
> What I'm complaining is why it is OK now whereas
> there's nothing new.

Sure, I understand.

My physical/logical idea may have the same problems as your DROP COLUMN
idea, and may be as rapidly rejected.  I am just throwing it out for
discussion.

I am not sure I like it.  :-)

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: RFC: Restructuring pg_aggregate

From
Christopher Kings-Lynne
Date:
> Actually, what we need to do to reclaim space is to enable table
> recreation without the column, now that we have relfilenode for file
> renaming.  It isn't hard to do, but no one has focused on it.  I want to
> focus on it, but have not had the time, obviously, and would be very
> excited to assist someone else.
>
> Hiroshi's fine idea of marking certain columns as unused would not have
> reclaimed the missing space, just as my idea of physical/logical column
> distinction would not reclaim the space either.  Again, my
> physical/logical idea is more for fixing other problems and
> optimization, not DROP COLUMN.

Hmmm.  Personally, I think that a DROP COLUMN that cannot reclaim space is
kinda useless - you may as well just use a view!!!

So how would this occur?:

1. Lock target table for writing (allow reads)
2. Begin a table scan on target table, writing  a new file with a particular filenode
3. Delete the attribute row from pg_attribute
4. Point the table in the catalog to the new filenode
5. Release locks
6. Commit transaction
7. Delete orhpan filenode

i. Upon postmaster startup, remove any orphaned filenodes

The real problem here is the fact that there are now missing attnos in
pg_attribute.  Either that's handled or we renumber the attnos - which is
also quite hard?

This, of course, suffers from the double size data problem - but I believe
that it does not matter - we just need to document it.

Interestingly enough, Oracle support

ALTER TABLE foo SET UNUSED col;

Which invalidates the attribute entry, and:

ALTER TABLE foo DROP col CHECKPOINT 1000;

Which actually reclaims the space.  The optional CHECKPOINT [n] clause
tells Oracle to do a checkpoint every [n] rows.

"Checkpointing cuts down the amount of undo logs accumulated during the
drop column operation to avoid running out of rollback segment space.
However, if this statement is interrupted after a checkpoint has been
applied, the table remains in an unusable state. While the table is
unusable, the only operations allowed on it are DROP TABLE, TRUNCATE
TABLE, and ALTER TABLE DROP COLUMNS CONTINUE (described below). "

Chris




Re: RFC: Restructuring pg_aggregate

From
Tom Lane
Date:
Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
> The real problem here is the fact that there are now missing attnos in
> pg_attribute.  Either that's handled or we renumber the attnos - which is
> also quite hard?

Updating pg_attribute per se is not so hard --- just store new copies of
all the rows for the table.  However, propagating the changes into other
places could be quite painful (I'm thinking of column numbers in stored
constraints, rules, etc).

It seems to me that reducing the column to NULLs already gets you the
majority of the space savings.  I don't think there is a case to be made
that getting back that last bit is worth the pain involved, either in
implementation effort or direct runtime costs (do you really want a DROP
COLUMN to force an immediate rewrite of the whole table?)
        regards, tom lane


Re: RFC: Restructuring pg_aggregate

From
Bruce Momjian
Date:
Christopher Kings-Lynne wrote:
> > Actually, what we need to do to reclaim space is to enable table
> > recreation without the column, now that we have relfilenode for file
> > renaming.  It isn't hard to do, but no one has focused on it.  I want to
> > focus on it, but have not had the time, obviously, and would be very
> > excited to assist someone else.
> >
> > Hiroshi's fine idea of marking certain columns as unused would not have
> > reclaimed the missing space, just as my idea of physical/logical column
> > distinction would not reclaim the space either.  Again, my
> > physical/logical idea is more for fixing other problems and
> > optimization, not DROP COLUMN.
> 
> Hmmm.  Personally, I think that a DROP COLUMN that cannot reclaim space is
> kinda useless - you may as well just use a view!!!

Yep, kind of a problem.  It is a tradeoff between double diskspace/speed
and removing column from disk.  I guess that's why Oracle has both.

> 
> So how would this occur?:
> 
> 1. Lock target table for writing (allow reads)
> 2. Begin a table scan on target table, writing
>    a new file with a particular filenode
> 3. Delete the attribute row from pg_attribute
> 4. Point the table in the catalog to the new filenode
> 5. Release locks
> 6. Commit transaction
> 7. Delete orhpan filenode

Yep, something like that.  CLUSTER is a good start.  DROP COLUMN just
deals with the attno too.  You would have to renumber them to fill the
gap.

> i. Upon postmaster startup, remove any orphaned filenodes

Actually, we don't have a good solution for finding orphaned filenodes
right now.  I do have some code that tries to do this as part of VACUUM
but it was not 100% perfect, so it was rejected.  I am willing to open
the discussion to see if a perfect solution can be found.


--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: RFC: Restructuring pg_aggregate

From
Bruce Momjian
Date:
Tom Lane wrote:
> Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
> > The real problem here is the fact that there are now missing attnos in
> > pg_attribute.  Either that's handled or we renumber the attnos - which is
> > also quite hard?
> 
> Updating pg_attribute per se is not so hard --- just store new copies of
> all the rows for the table.  However, propagating the changes into other
> places could be quite painful (I'm thinking of column numbers in stored
> constraints, rules, etc).
> 
> It seems to me that reducing the column to NULLs already gets you the
> majority of the space savings.  I don't think there is a case to be made
> that getting back that last bit is worth the pain involved, either in
> implementation effort or direct runtime costs (do you really want a DROP
> COLUMN to force an immediate rewrite of the whole table?)

That is an excellent point about having to fix all the places that refer
to attno.  In fact, we have been moving away from attname references to
attno references for a while, so it only gets worse.  Tom is also
correct that setting it to NULL removes the problem of disk space usage
quite easily.

That only leaves the problem of having gaps in the pg_attribute for that
relation, and as I remember, that was the problem for Hiroshi's DROP
COLUMN change, but at this point, after years of delay with no great
solution on the horizon, we may as well just get this working.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: RFC: Restructuring pg_aggregate

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> Tom Lane writes:
>> Why shouldn't aggregate functions have entries in pg_proc?  Then one
>> search would cover both possibilities, and we could eliminate some
>> duplicate code in the parser.

> Furthermore, we could make the new function privileges apply to aggregates
> as well.

GRANT/REVOKE FUNCTION will now work on aggregate functions too (is there
any value in making a variant syntax for aggregates?).  However, I
didn't implement enforcement of the EXECUTE privilege yet.  I was
slightly bemused to notice that your implementation of it for regular
functions tests the privilege at plan startup but doesn't actually throw
the error until the function is called.  What's the point of that?
Seems like we might as well throw the error in init_fcache and not
bother with storing a boolean.
        regards, tom lane


Re: RFC: Restructuring pg_aggregate

From
Peter Eisentraut
Date:
Tom Lane writes:

> I was slightly bemused to notice that your implementation of it for
> regular functions tests the privilege at plan startup but doesn't
> actually throw the error until the function is called.  What's the
> point of that? Seems like we might as well throw the error in
> init_fcache and not bother with storing a boolean.

Yeah, it's a bit funny.  I wanted to keep the fcache code from doing
anything not to do with caching, and I wanted to keep the permission check
in the executor, like it is for tables.

There were a couple of cases, which I have not fully explored yet, for
which this seemed like a good idea, such as some functions being in the
plan but not being executed, or the permission check being avoided for
some functions (e.g., cast functions).

-- 
Peter Eisentraut   peter_e@gmx.net



Re: RFC: Restructuring pg_aggregate

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> Tom Lane writes:
>> I was slightly bemused to notice that your implementation of it for
>> regular functions tests the privilege at plan startup but doesn't
>> actually throw the error until the function is called.  What's the
>> point of that? Seems like we might as well throw the error in
>> init_fcache and not bother with storing a boolean.

> Yeah, it's a bit funny.  I wanted to keep the fcache code from doing
> anything not to do with caching, and I wanted to keep the permission check
> in the executor, like it is for tables.

Well, init_fcache is part of executor startup, so that seems reasonably
parallel to the table case.  I do not buy the idea that we shouldn't
throw an error if the function happens not to be called --- that makes
the behavior dependent on both planner choices and data conditions,
which seems like a bad idea.  For comparison, we *will* throw a
permission error on tables even if the actual execution of the plan
turns out never to read a single row from that table.

(After a bit of code reading --- actually, at present init_fcache
doesn't get called until first use of the function, so it's a
distinction without a difference.  But I have been thinking about
revising this so that function caches are set up during plan
initialization, which is why I'm questioning the point now.)

> There were a couple of cases, which I have not fully explored yet, for
> which this seemed like a good idea, such as some functions being in the
> plan but not being executed, or the permission check being avoided for
> some functions (e.g., cast functions).

Cast functions?  Not sure that I see the point of excluding them.

What I'm inclined to do for aggregates is to check and throw the error
(if any) during ExecInitAgg, ie, plan startup for the Agg plan node.
I was just a tad startled to notice that the implementation wasn't
parallel for plain functions ...
        regards, tom lane


Re: RFC: Restructuring pg_aggregate

From
Hiroshi Inoue
Date:
Christopher Kings-Lynne wrote:
> 
> Hmmm.  Personally, I think that a DROP COLUMN that cannot reclaim space is
> kinda useless - you may as well just use a view!!!
> 
> So how would this occur?:
> 
> 1. Lock target table for writing (allow reads)
> 2. Begin a table scan on target table, writing
>    a new file with a particular filenode
> 3. Delete the attribute row from pg_attribute
> 4. Point the table in the catalog to the new filenode
> 5. Release locks
> 6. Commit transaction
> 7. Delete orhpan filenode
> 
> i. Upon postmaster startup, remove any orphaned filenodes
> 
> The real problem here is the fact that there are now missing attnos in
> pg_attribute.  Either that's handled or we renumber the attnos - which is
> also quite hard?

The attnos should be renumbered and it's easy.
But the above seems only 20% of the total implementation.
If the attnos are renumbered, all objects which refer to 
the numbers must be invalidated or re-compiled ...
For example the parameters of foreign key constraints
triggers are consist of relname and colnames currently.
There has been a proposal that change to use relid or
column numbers instead. Certainly it makes RENAME happy
but DROP COLUMN unhappy. If there's a foreign key a_rel/1/3
and the second column of the relation is dropped the
parameter must be changed to be a_rel/1/2. If neither
foreign key stuff nor DROP COLUMN take the other into
account, the consistency is easily broken. 

regards,
Hiroshi Inoue


Re: RFC: Restructuring pg_aggregate

From
Bruce Momjian
Date:
Hiroshi Inoue wrote:
> Christopher Kings-Lynne wrote:
> > 
> > Hmmm.  Personally, I think that a DROP COLUMN that cannot reclaim space is
> > kinda useless - you may as well just use a view!!!
> > 
> > So how would this occur?:
> > 
> > 1. Lock target table for writing (allow reads)
> > 2. Begin a table scan on target table, writing
> >    a new file with a particular filenode
> > 3. Delete the attribute row from pg_attribute
> > 4. Point the table in the catalog to the new filenode
> > 5. Release locks
> > 6. Commit transaction
> > 7. Delete orhpan filenode
> > 
> > i. Upon postmaster startup, remove any orphaned filenodes
> > 
> > The real problem here is the fact that there are now missing attnos in
> > pg_attribute.  Either that's handled or we renumber the attnos - which is
> > also quite hard?
> 
> The attnos should be renumbered and it's easy.
> But the above seems only 20% of the total implementation.
> If the attnos are renumbered, all objects which refer to 
> the numbers must be invalidated or re-compiled ...
> For example the parameters of foreign key constraints
> triggers are consist of relname and colnames currently.
> There has been a proposal that change to use relid or
> column numbers instead. Certainly it makes RENAME happy
> but DROP COLUMN unhappy. If there's a foreign key a_rel/1/3
> and the second column of the relation is dropped the
> parameter must be changed to be a_rel/1/2. If neither
> foreign key stuff nor DROP COLUMN take the other into
> account, the consistency is easily broken. 

I think that is why Tom was suggesting making all the column values NULL
and removing the pg_attribute row for the column.  With a NULL value, it
doesn't take up any room in the tuple, and with the pg_attribute column
gone, no one will see that row.  The only problem is the gap in attno
numbering.  How big a problem is that?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: RFC: Restructuring pg_aggregate

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I think that is why Tom was suggesting making all the column values NULL
> and removing the pg_attribute row for the column.

That was not my suggestion.

> With a NULL value, it
> doesn't take up any room in the tuple, and with the pg_attribute column
> gone, no one will see that row.  The only problem is the gap in attno
> numbering.  How big a problem is that?

You can't do it that way unless you're intending to rewrite all rows of
the relation before committing the ALTER; which would be the worst of
both worlds.  The pg_attribute row *must* be retained to show the
datatype of the former column, so that we can correctly skip over it
in tuples where the column isn't yet nulled out.

Hiroshi did this by renumbering the attnum; I propose leaving attnum
alone and instead adding an attisdropped flag.  That would avoid
creating a gap in the column numbers, but either way is likely to affect
some applications that inspect pg_attribute.
        regards, tom lane


Re: RFC: Restructuring pg_aggregate

From
Hiroshi Inoue
Date:
Bruce Momjian wrote:
> 
> Hiroshi Inoue wrote:
> > Christopher Kings-Lynne wrote:
> > >
> I think that is why Tom was suggesting making all the column values NULL
> and removing the pg_attribute row for the column.  With a NULL value, it
> doesn't take up any room in the tuple, and with the pg_attribute column
> gone, no one will see that row.  The only problem is the gap in attno
> numbering.  How big a problem is that?

There's no problem with applications which don't inquire
of system catalogs(pg_attribute). Unfortunately we have 
been very tolerant of users' access on system tables and
there would be pretty many applications which inquire of
pg_attribute.

regards,
Hiroshi Inoue


Re: RFC: Restructuring pg_aggregate

From
Dave Page
Date:

> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us] 
> Sent: 12 April 2002 03:54
> To: Bruce Momjian
> Cc: Hiroshi Inoue; Christopher Kings-Lynne; 
> pgsql-hackers@postgresql.org
> Subject: Re: RFC: Restructuring pg_aggregate 
> 
> 
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > I think that is why Tom was suggesting making all the column values 
> > NULL and removing the pg_attribute row for the column.
> 
> That was not my suggestion.
> 
> > With a NULL value, it
> > doesn't take up any room in the tuple, and with the pg_attribute 
> > column gone, no one will see that row.  The only problem is 
> the gap in 
> > attno numbering.  How big a problem is that?
> 
> You can't do it that way unless you're intending to rewrite 
> all rows of the relation before committing the ALTER; which 
> would be the worst of both worlds.  The pg_attribute row 
> *must* be retained to show the datatype of the former column, 
> so that we can correctly skip over it in tuples where the 
> column isn't yet nulled out.
> 
> Hiroshi did this by renumbering the attnum; I propose leaving 
> attnum alone and instead adding an attisdropped flag.  That 
> would avoid creating a gap in the column numbers, but either 
> way is likely to affect some applications that inspect pg_attribute.

Applications like pgAdmin that inspect pg_attribute are being seriously
hacked to incorporate schema support anyway for 7.3. Personnally I'd be glad
to spend some time re-coding to allow for this, just to not have to answer
the numerous 'how do I drop a column' emails I get reguarly.

Regards, Dave.


Re: RFC: Restructuring pg_aggregate

From
"Christopher Kings-Lynne"
Date:
> Updating pg_attribute per se is not so hard --- just store new copies of
> all the rows for the table.  However, propagating the changes into other
> places could be quite painful (I'm thinking of column numbers in stored
> constraints, rules, etc).
>
> It seems to me that reducing the column to NULLs already gets you the
> majority of the space savings.  I don't think there is a case to be made
> that getting back that last bit is worth the pain involved, either in
> implementation effort or direct runtime costs (do you really want a DROP
> COLUMN to force an immediate rewrite of the whole table?)

OK, sounds fair.  However, is there a more aggressive way of reclaiming the
space?  The problem with updating all the rows to null for that column is
that the on-disk size is doubled anyway, right?  So, could a VACUUM FULL
process do the nulling for us?  Vacuum works outside of normal transaction
constraints anyway...?

Also, it seems to me that at some point we are forced to break client
compatibility.  Either we add attisdropped field to pg_attribute, or we use
Hiroshi's (-1 * attnum - offset) idea.  Both Tom and Hiroshi have good
reasons for each of these - would it be possible for you guys to post with
your reasons for and against both the techniques.  I just want to get to an
implementation specification we all agree on that can be written up and then
the coding can proceed.  Maybe we should add it to the 'Postgres Major
Projects' page - and remove those old ones that have already been
implemented.

Chris




Re: DROP COLUMN (was RFC: Restructuring pg_aggregate)

From
Tom Lane
Date:
[ way past time to change the title of this thread ]

"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:
> OK, sounds fair.  However, is there a more aggressive way of reclaiming the
> space?  The problem with updating all the rows to null for that column is
> that the on-disk size is doubled anyway, right?  So, could a VACUUM FULL
> process do the nulling for us?  Vacuum works outside of normal transaction
> constraints anyway...?

No, VACUUM has the same transactional constraints as everyone else
(unless you'd like a crash during VACUUM to trash your table...)

I do not think that we necessarily need to provide a special mechanism
for this at all.  The docs for DROP COLUMN could simply explain that
the DROP itself doesn't reclaim the space, but that the space will be
reclaimed over time as extant rows are updated or deleted.  If you want
to hurry the process along you could doUPDATE table SET othercol = othercolVACUUM FULL
to force all the rows to be updated and then reclaim space.  But given
the peak-space-is-twice-as-much behavior, this is not obviously a win.
I'd sure object to an implementation that *forced* that approach on me,
whether during DROP itself or the next VACUUM.

> Also, it seems to me that at some point we are forced to break client
> compatibility.  Either we add attisdropped field to pg_attribute, or we use
> Hiroshi's (-1 * attnum - offset) idea.  Both Tom and Hiroshi have good
> reasons for each of these - would it be possible for you guys to post with
> your reasons for and against both the techniques.

Er, didn't we do that already?
        regards, tom lane


Re: DROP COLUMN (was RFC: Restructuring pg_aggregate)

From
Hannu Krosing
Date:
On Sat, 2002-04-13 at 17:29, Tom Lane wrote:
> [ way past time to change the title of this thread ]
> 
> "Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:
> > OK, sounds fair.  However, is there a more aggressive way of reclaiming the
> > space?  The problem with updating all the rows to null for that column is
> > that the on-disk size is doubled anyway, right?  So, could a VACUUM FULL
> > process do the nulling for us?  Vacuum works outside of normal transaction
> > constraints anyway...?
> 
> No, VACUUM has the same transactional constraints as everyone else
> (unless you'd like a crash during VACUUM to trash your table...)

But can't it do the SET TO NULL thing if it knows that the transaction
that dropped the column has committed. 

This could probably even be done in the light version of vacuum with a
special flag (VACUUM RECLAIM). 

Of course running this this makes sense only if the dropped column had
some significant amount of data .

> I do not think that we necessarily need to provide a special mechanism
> for this at all.  The docs for DROP COLUMN could simply explain that
> the DROP itself doesn't reclaim the space, but that the space will be
> reclaimed over time as extant rows are updated or deleted.  If you want
> to hurry the process along you could do
>     UPDATE table SET othercol = othercol
>     VACUUM FULL

If only we could do it in namageable chunks:

FOR i IN 0 TO (size(table)/chunk) DO   UPDATE table SET othercol = othercol OFFSET i*chunk LIMIT chunk   VACUUM FULL;
END FOR;

or even better - "VACUUM FULL OFFSET i*chunk LIMIT chunk" and then make
chunk == 1 :)

--------------
Hannu



Re: DROP COLUMN (was RFC: Restructuring pg_aggregate)

From
Tom Lane
Date:
Hannu Krosing <hannu@tm.ee> writes:
>> No, VACUUM has the same transactional constraints as everyone else
>> (unless you'd like a crash during VACUUM to trash your table...)

> But can't it do the SET TO NULL thing if it knows that the transaction
> that dropped the column has committed. 

Hmm, you're thinking of allowing VACUUM to overwrite tuples in-place?
Strikes me as unsafe, but I'm not really sure.

In any case it's not that easy.  If the column is wide enough
that reclaiming its space is actually worth doing, then presumably
most of its entries are just TOAST links, and what has to be done is
not just rewrite the main tuple but mark the TOAST rows deleted.
This is not something that VACUUM does now; I'd be rather concerned
about the locking implications (especially for lightweight VACUUM).
        regards, tom lane


Re: DROP COLUMN (was RFC: Restructuring pg_aggregate)

From
"Christopher Kings-Lynne"
Date:
> No, VACUUM has the same transactional constraints as everyone else
> (unless you'd like a crash during VACUUM to trash your table...)

Seriously, you can run VACUUM in a transaction and rollback the movement of
a tuple on disk?  What do you mean by same transactional constraints?

Chris



Re: DROP COLUMN (was RFC: Restructuring pg_aggregate)

From
Tom Lane
Date:
"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:
>> No, VACUUM has the same transactional constraints as everyone else
>> (unless you'd like a crash during VACUUM to trash your table...)

> Seriously, you can run VACUUM in a transaction and rollback the movement of
> a tuple on disk?  What do you mean by same transactional constraints?

In VACUUM FULL, tuples moved to compact the table aren't good until you
commit.  In this hypothetical column-drop-implementing VACUUM, I think
there'd need to be some similar rule --- otherwise it's not clear what
happens to TOASTED data if you crash partway through.  (In particular,
if we tried overwriting main tuples in place as Hannu was suggesting,
we'd need a way of being certain the deletion of the corresponding TOAST
rows occurs *before* we overwrite the only reference to them.)
        regards, tom lane


Re: RFC: Restructuring pg_aggregate

From
Hiroshi Inoue
Date:
Christopher Kings-Lynne wrote:
> 
> Also, it seems to me that at some point we are forced to break client
> compatibility.

It's not a users' consensus at all. I'm suspicious if
DROP COLUMN is such a significant feature to break
client compatibility at our ease.

> Either we add attisdropped field to pg_attribute, or we use
> Hiroshi's (-1 * attnum - offset) idea.  Both Tom and Hiroshi have good
> reasons for each of these - would it be possible for you guys to post with
> your reasons for and against both the techniques. 

I don't object to adding attisdropped field. What
I meant to say is that the differene is very small.

regards,
Hiroshi Inoue


Re: RFC: Restructuring pg_aggregate

From
Bruce Momjian
Date:
Thread added to TODO.detail/drop.

---------------------------------------------------------------------------

Christopher Kings-Lynne wrote:
> > Actually, what we need to do to reclaim space is to enable table
> > recreation without the column, now that we have relfilenode for file
> > renaming.  It isn't hard to do, but no one has focused on it.  I want to
> > focus on it, but have not had the time, obviously, and would be very
> > excited to assist someone else.
> >
> > Hiroshi's fine idea of marking certain columns as unused would not have
> > reclaimed the missing space, just as my idea of physical/logical column
> > distinction would not reclaim the space either.  Again, my
> > physical/logical idea is more for fixing other problems and
> > optimization, not DROP COLUMN.
> 
> Hmmm.  Personally, I think that a DROP COLUMN that cannot reclaim space is
> kinda useless - you may as well just use a view!!!
> 
> So how would this occur?:
> 
> 1. Lock target table for writing (allow reads)
> 2. Begin a table scan on target table, writing
>    a new file with a particular filenode
> 3. Delete the attribute row from pg_attribute
> 4. Point the table in the catalog to the new filenode
> 5. Release locks
> 6. Commit transaction
> 7. Delete orhpan filenode
> 
> i. Upon postmaster startup, remove any orphaned filenodes
> 
> The real problem here is the fact that there are now missing attnos in
> pg_attribute.  Either that's handled or we renumber the attnos - which is
> also quite hard?
> 
> This, of course, suffers from the double size data problem - but I believe
> that it does not matter - we just need to document it.
> 
> Interestingly enough, Oracle support
> 
> ALTER TABLE foo SET UNUSED col;
> 
> Which invalidates the attribute entry, and:
> 
> ALTER TABLE foo DROP col CHECKPOINT 1000;
> 
> Which actually reclaims the space.  The optional CHECKPOINT [n] clause
> tells Oracle to do a checkpoint every [n] rows.
> 
> "Checkpointing cuts down the amount of undo logs accumulated during the
> drop column operation to avoid running out of rollback segment space.
> However, if this statement is interrupted after a checkpoint has been
> applied, the table remains in an unusable state. While the table is
> unusable, the only operations allowed on it are DROP TABLE, TRUNCATE
> TABLE, and ALTER TABLE DROP COLUMNS CONTINUE (described below). "
> 
> Chris
> 
> 
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026