Thread: Call for objections: put back OIDs in CREATE TABLE AS/SELECT INTO

Call for objections: put back OIDs in CREATE TABLE AS/SELECT INTO

From

Tom Lane

Date:

21 January 2003, 15:11:53

We've gotten a couple of complaints now about the fact that 7.3 doesn't
include an OID column in a table created via CREATE TABLE AS or SELECT
INTO.  Unless I hear objections, I'm going to revert it to including an
OID, and back-patch the fix for 7.3.2 as well.  See discussion a couple
days ago on pgsql-general, starting at
http://archives.postgresql.org/pgsql-general/2003-01/msg00669.php
        regards, tom lane

Re: Call for objections: put back OIDs in CREATE TABLE AS/SELECT INTO

From

"Christopher Kings-Lynne"

Date:

22 January 2003, 00:32:46

Why don't you just include them by default, otherwise if WITHOUT OIDS
appears in the CREATE TABLE command, then don't include them ?

Chris

> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org]On Behalf Of Tom Lane
> Sent: Wednesday, 22 January 2003 4:12 AM
> To: pgsql-hackers@postgresql.org
> Subject: [HACKERS] Call for objections: put back OIDs in CREATE TABLE
> AS/SELECT INTO
>
>
> We've gotten a couple of complaints now about the fact that 7.3 doesn't
> include an OID column in a table created via CREATE TABLE AS or SELECT
> INTO.  Unless I hear objections, I'm going to revert it to including an
> OID, and back-patch the fix for 7.3.2 as well.  See discussion a couple
> days ago on pgsql-general, starting at
> http://archives.postgresql.org/pgsql-general/2003-01/msg00669.php
>
>             regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org
>

Re: Call for objections: put back OIDs in CREATE TABLE AS/SELECT INTO

From

Tom Lane

Date:

22 January 2003, 09:48:09

"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:
> Why don't you just include them by default, otherwise if WITHOUT OIDS
> appears in the CREATE TABLE command, then don't include them ?

Well, adding a WITHOUT OIDS option to CREATE TABLE AS would be a new
feature, which I don't have the time/interest for.  I won't do anything
to preclude someone else implementing that, though ;-)
        regards, tom lane

Re: Call for objections: put back OIDs in CREATE TABLE AS/SELECT INTO

From

"Christopher Kings-Lynne"

Date:

22 January 2003, 20:15:21

> "Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:
> > Why don't you just include them by default, otherwise if WITHOUT OIDS
> > appears in the CREATE TABLE command, then don't include them ?
>
> Well, adding a WITHOUT OIDS option to CREATE TABLE AS would be a new
> feature, which I don't have the time/interest for.  I won't do anything
> to preclude someone else implementing that, though ;-)

Oh, I thought it already had it from the CREATE TABLE bit...  Does sound
like it would be a good ultimate solution tho.

Chris

Re: Call for objections: put back OIDs in CREATE TABLE

From

Curt Sampson

Date:

23 January 2003, 06:39:52

On Tue, 21 Jan 2003, Tom Lane wrote:

> We've gotten a couple of complaints now about the fact that 7.3 doesn't
> include an OID column in a table created via CREATE TABLE AS or SELECT
> INTO.  Unless I hear objections, I'm going to revert it to including an
> OID, and back-patch the fix for 7.3.2 as well.

I object. I personally think we should be moving towards not using OIDs
as the default behaviour, inasmuch as we can, for several reasons:

1. It's not a relational concept.

2. The OID wraparound problem can get you.

3. Other SQL databases don't do this.

4. It's hidden, rather than exposed, and hidden things are generally a
bad idea.

5. We should default to what gives us better performance, rather than
worse.

> See discussion a couple days ago on pgsql-general, starting at
> http://archives.postgresql.org/pgsql-general/2003-01/msg00669.php

There didn't seem to be many people clamouring to have it back.

The ideal sitaution for me would be to have WITHOUT OIDS be the default
for all table creations, and but of course allow WITH OIDS for backward
compatability. But yeah, I know that this can introduce problems with
old dumps, and may not be entirely easy to implement.

cjs
-- 
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org   Don't you know, in this new Dark Age, we're
alllight.  --XTC

Re: Call for objections: put back OIDs in CREATE TABLE

From

"D'Arcy J.M. Cain"

Date:

23 January 2003, 06:54:48

On Thursday 23 January 2003 06:34, Curt Sampson wrote:
> The ideal sitaution for me would be to have WITHOUT OIDS be the default
> for all table creations, and but of course allow WITH OIDS for backward

Why not make it a configuration option?  I can actually think of a third 
behaviour that would make sense for me.  Have it so that OIDs are created by 
default if there is no primary key defined and not if there is.

-- 
D'Arcy J.M. Cain <darcy@{druid|vex}.net>   |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 425 1212     (DoD#0082)    (eNTP)   |  what's for dinner.

Re: Call for objections: put back OIDs in CREATE TABLE AS/SELECT INTO

From

Tom Lane

Date:

23 January 2003, 10:03:58

Curt Sampson <cjs@cynic.net> writes:
> I object. I personally think we should be moving towards not using OIDs
> as the default behaviour, inasmuch as we can, for several reasons:

All these objections are global in nature, not specific to CREATE TABLE
AS.  The argument that persuaded me to do something here is that CREATE
TABLE AS should not be different from CREATE TABLE's default behavior.

I have no problem with moving towards lack-of-OIDs as the default
behavior for both statements, in the long run, if we can get past the
compatibility issues.  But I don't think OIDs in user tables are costing
us anything much, so I'm not prepared to take any big compatibility hit
to change the default ...
        regards, tom lane

Re: Call for objections: put back OIDs in CREATE TABLE AS/SELECT INTO

From

"Ross J. Reedstrom"

Date:

23 January 2003, 12:45:52

On Thu, Jan 23, 2003 at 10:03:28AM -0500, Tom Lane wrote:
> Curt Sampson <cjs@cynic.net> writes:
> > I object. I personally think we should be moving towards not using OIDs
> > as the default behaviour, inasmuch as we can, for several reasons:
> 
> All these objections are global in nature, not specific to CREATE TABLE
> AS.  The argument that persuaded me to do something here is that CREATE
> TABLE AS should not be different from CREATE TABLE's default behavior.
> 
> I have no problem with moving towards lack-of-OIDs as the default
> behavior for both statements, in the long run, if we can get past the
> compatibility issues.  But I don't think OIDs in user tables are costing
> us anything much, so I'm not prepared to take any big compatibility hit
> to change the default ...

Agreed as to taking the compatability hit in the 7.3 branch (you _were_
talking about changing 7.3, weren't you?) But I think Curt and D'Arcy
have a point: what OIDs are costing the DBAs and PostgreSQL developing
community is the pain of having an 'almost' solution in place. OIDs have
always been the unwanted child in PostgreSQL: the 'pure relational' people
don't want them, and the Object people are misled into thinking we've got
a _real_ object id. On the relational side, they've stood in for proper
use of primary keys (as D'Arcy points out), partly because it's so _easy_
to misuse them that way: the wire protocol returns the OID for free in
some cases, and the interface libraries make it easy to get at.

So the immediate case, changing the default (in 7.3) to match the CREATE
TABLE case makes sense. However, we need to wean developers off using
OIDs.  I've been working with Diedrich Vorberg on a thin python object
relational mapping interface (his Object Relational Membrane - ORM)
and this was a central problem: you _need_ a unique id for an object,
and the oid seemed so natural ... 

So in the longer term, we need to provide a replacement. Arguably, the
primary key for a table is the right replacement, but we don't _require_
a pkey, so what to do in cases where this isn't one?  Also, the pkey
can be _any_ column(s), of _any_ type, which could be inconvenient for
returning as the result of an insert, for example (imagine a text field
as pkey, with a _huge_ block of text just written into it ...)

Ross

Re: Call for objections: put back OIDs in CREATE TABLE

From

Hannu Krosing

Date:

23 January 2003, 13:09:45

Curt Sampson kirjutas N, 23.01.2003 kell 13:34:
> On Tue, 21 Jan 2003, Tom Lane wrote:
> 
> > We've gotten a couple of complaints now about the fact that 7.3 doesn't
> > include an OID column in a table created via CREATE TABLE AS or SELECT
> > INTO.  Unless I hear objections, I'm going to revert it to including an
> > OID, and back-patch the fix for 7.3.2 as well.
> 
> I object. I personally think we should be moving towards not using OIDs
> as the default behaviour, inasmuch as we can, for several reasons:

I re-object

> 1. It's not a relational concept.

so are other system tuples (cid, tid, tableiod, ...).

It is an OO concept.

> 2. The OID wraparound problem can get you.

put an unique index on OID column.

> 3. Other SQL databases don't do this.

Ask Date, hell tell you that SQL is evil, i.e. not relational ;)

> 4. It's hidden, rather than exposed, and hidden things are generally a
> bad idea.

AFAIK carrying hidden weapons is forbidden in most of USA, in Europe you
usually are forbidden to carry hand-weapons _exposed_  ;)

> 5. We should default to what gives us better performance, rather than
> worse.

Not if it breaks anything ;)

> > See discussion a couple days ago on pgsql-general, starting at
> > http://archives.postgresql.org/pgsql-general/2003-01/msg00669.php
> 
> There didn't seem to be many people clamouring to have it back.
> 
> The ideal sitaution for me would be to have WITHOUT OIDS be the default
> for all table creations, and but of course allow WITH OIDS for backward
> compatability. But yeah, I know that this can introduce problems with
> old dumps, and may not be entirely easy to implement.

If you need a no-OID table, and INSERT INTO it.

> cjs
-- 
Hannu Krosing <hannu@tm.ee>

Re: Call for objections: put back OIDs in CREATE TABLE

From

Curt Sampson

Date:

23 January 2003, 13:20:08

On Thu, 23 Jan 2003, Ross J. Reedstrom wrote:

> So in the longer term, we need to provide a replacement. Arguably, the
> primary key for a table is the right replacement, but we don't _require_
> a pkey, so what to do in cases where this isn't one?

You're stuck. SQL breaks with relational theory in this way; tables
need not have candidate keys, and thus you can have duplicate rows in
a table. (I.e., mathamatically, a table is not a set, it's a bag.) The
implications of this start to go on and on, but let's not get into that
here.

> Also, the pkey can be _any_ column(s), of _any_ type, which could be
> inconvenient for returning as the result of an insert, for example
> (imagine a text field as pkey, with a _huge_ block of text just
> written into it ...)

Well, this could be worked around to some extent, with some hackery.
But in the end I reckon it's much easier just to have the object system
force you to declare specific a specific object-ID column, if that's
what it takes. So long as you've got a candidate key, even if it's not
the primary key, you're fine.

cjs
-- 
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org   Don't you know, in this new Dark Age, we're
alllight.  --XTC

Re: Call for objections: put back OIDs in CREATE TABLE

From

Curt Sampson

Date:

23 January 2003, 13:54:00

On Fri, 23 Jan 2003, Hannu Krosing wrote:

> > 1. [OIDs are] not a relational concept.
> so are other system tuples (cid, tid, tableiod, ...).

But there's a key difference here; nobody's advertising these others as
any sort of row identifier: i.e., a candidate key. And besides, I wouldn't
object at all to getting rid of these, except that they store essential
system information and I can't figure out how to get rid of them. :-)

> It is an OO concept.

Well, it's not, because we have an OID wrap-around problem, so an OID is
actually not an OID at all, but simply an arbitrary number tacked on to
a row. Other rows, in the same or other tables can have the same OID.

> > 2. The OID wraparound problem can get you.
> put an unique index on OID column.

That still doesn't make it a real OID, because you can't guarantee that
two rows in different tables won't have the same OID.

> > 3. Other SQL databases don't do this.
> Ask Date, hell tell you that SQL is evil, i.e. not relational ;)

I did, he said that, and I agreed with him. :-) So now we have something
that's evil because it's not relational and also evil because it's not
SQL. Double-yuck!

> > 5. We should default to what gives us better performance, rather than
> > worse.
> Not if it breaks anything ;)

I disagree. We have to weigh the cost of the breakage versus the
benefits in each individual circumstance. We've broken plenty of things
before because we felt it was better to do so than maintain backward
compatability.

Because of its history as a research tool, there's a lot of experimental
stuff in postgres that, in hindsight, we can say didn't work so well.
When we find something that's not working so well, we should at least
consider making some sort of move toward the "right thing," rather than
continuing to do the wrong thing forever just for the sake of backwards
compatability.

Summary: I don't want to hear absolutes like "we should never break
backwards compatibility." I want to hear arguments that the cost of
breaking backwards compatability is X, and the benefit of the new way of
doing things is Y, and here is why you think X > Y.

cjs
-- 
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org   Don't you know, in this new Dark Age, we're
alllight.  --XTC

Re: Call for objections: put back OIDs in CREATE TABLE AS/SELECT

From

Bruce Momjian

Date:

26 January 2003, 15:41:41

Ross, you make some powerful arguments here.  Probably the most
significant was the idea that you need a unique identifier for every
row, and it should be of a consistent type, which primary key is not.

We clearly need a GUC parameter to turn on/off oids.  But it seems we
will always need the ability to return something like OID to the user if
the user wants it. What it seems we need is a 64-bit oid someday.

As an aside, as Tom already said, the 7.3.X patch is just to make CREATE
TABLE and CREATE TABLE AS behave the same for OIDs.  It does not effect
our defaults for future releases, though this little change in 7.3.0 did
show use that some folks are using OID and did miss them.

---------------------------------------------------------------------------

Ross J. Reedstrom wrote:
> On Thu, Jan 23, 2003 at 10:03:28AM -0500, Tom Lane wrote:
> > Curt Sampson <cjs@cynic.net> writes:
> > > I object. I personally think we should be moving towards not using OIDs
> > > as the default behaviour, inasmuch as we can, for several reasons:
> > 
> > All these objections are global in nature, not specific to CREATE TABLE
> > AS.  The argument that persuaded me to do something here is that CREATE
> > TABLE AS should not be different from CREATE TABLE's default behavior.
> > 
> > I have no problem with moving towards lack-of-OIDs as the default
> > behavior for both statements, in the long run, if we can get past the
> > compatibility issues.  But I don't think OIDs in user tables are costing
> > us anything much, so I'm not prepared to take any big compatibility hit
> > to change the default ...
> 
> Agreed as to taking the compatability hit in the 7.3 branch (you _were_
> talking about changing 7.3, weren't you?) But I think Curt and D'Arcy
> have a point: what OIDs are costing the DBAs and PostgreSQL developing
> community is the pain of having an 'almost' solution in place. OIDs have
> always been the unwanted child in PostgreSQL: the 'pure relational' people
> don't want them, and the Object people are misled into thinking we've got
> a _real_ object id. On the relational side, they've stood in for proper
> use of primary keys (as D'Arcy points out), partly because it's so _easy_
> to misuse them that way: the wire protocol returns the OID for free in
> some cases, and the interface libraries make it easy to get at.
> 
> So the immediate case, changing the default (in 7.3) to match the CREATE
> TABLE case makes sense. However, we need to wean developers off using
> OIDs.  I've been working with Diedrich Vorberg on a thin python object
> relational mapping interface (his Object Relational Membrane - ORM)
> and this was a central problem: you _need_ a unique id for an object,
> and the oid seemed so natural ... 
> 
> So in the longer term, we need to provide a replacement. Arguably, the
> primary key for a table is the right replacement, but we don't _require_
> a pkey, so what to do in cases where this isn't one?  Also, the pkey
> can be _any_ column(s), of _any_ type, which could be inconvenient for
> returning as the result of an insert, for example (imagine a text field
> as pkey, with a _huge_ block of text just written into it ...)
> 
> Ross
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073

Re: Call for objections: put back OIDs in CREATE TABLE

From

Curt Sampson

Date:

26 January 2003, 17:48:48

On Sun, 26 Jan 2003, Bruce Momjian wrote:

> Ross, you make some powerful arguments here.  Probably the most
> significant was the idea that you need a unique identifier for every
> row, and it should be of a consistent type, which primary key is not.

I don't see why you need a unqiue identifier per row, nor do I see why,
if you are going to have one, it needs to be the same type across all
tables.

Having this may be very desirable, and even necessary, for many or
all object-to-relational mapping frameworks, but that is certainly not
the only thing that postgres is used for. And I still maintain that
if something does need something like of OIDs, it should be declared
explicitly in the database schema (as you have to do in other DBMSes)
and not use a "hidden" feature.

cjs
-- 
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org   Don't you know, in this new Dark Age, we're
alllight.  --XTC

Re: Call for objections: put back OIDs in CREATE TABLE

From

Antti Haapala

Date:

27 January 2003, 02:49:20

> > Ross, you make some powerful arguments here.  Probably the most
> > significant was the idea that you need a unique identifier for every
> > row, and it should be of a consistent type, which primary key is not.
>
> I don't see why you need a unqiue identifier per row, nor do I see why,
> if you are going to have one, it needs to be the same type across all
> tables.

If i had table with multi col primary key like...
create table devices (    major int4,    minor int4,    primary key (major, minor));

... and do this:
insert into devices (major, minor values (224, find_free_minor_for(224))

should the database report something like
INSERT '{<([\'224\', \'89\'])>}' 1

which I could then parse in my client program and try to recover my
fresh brand new primary key from it? No thanks...

Anyways, I've got an idea: what about having option that INSERTs return
"oid_status" in form
      major = '224' and minor = '10'
or      state = 'ca'
?

Then you could just throw this expression into a select query after where
;P And tables would never need row oids...

-- 
Antti Haapala

Re: Call for objections: put back OIDs in CREATE TABLE

From

Curt Sampson

Date:

27 January 2003, 03:03:23

On Mon, 27 Jan 2003, Antti Haapala wrote:

> > I don't see why you need a unqiue identifier per row, nor do I see why,
> > if you are going to have one, it needs to be the same type across all
> > tables.

(Note here: it may not have been quite clear, but I'm not asking for
specific instances of where you might want to do this; I'm asking why it
should be forced upon every single table in the world, unless people a)
know that postgresql does this, and b) use special SQL extensions that
are not compatable with any other DMBS in the world.)

> If i had table with multi col primary key like...
>
>     create table devices (
>         major int4,
>         minor int4,
>         primary key (major, minor)
>     );
>
> ... and do this:
>
>     insert into devices (major, minor values (224, find_free_minor_for(224))
>
> should the database report something like
>
>     INSERT '{<([\'224\', \'89\'])>}' 1
>
> which I could then parse in my client program and try to recover my
> fresh brand new primary key from it? No thanks...

It's up to you. It sounds like in this particular application, you want a
single integer as the primary key. So I have no objection to you changing
the table to be
   create table devices (id serial PRIMARY KEY,major int4,minor int4,CONSTRAINT major_minor_unique UNIQUE (major,
minor)  );

and then selecting currval('devices_id_seq') in order to find out what the
id of that record is.

But my first question here is, why do you want to do this with what is
effectively a hidden column, rather than explicitly showing that you need
this, as above? And why do you want to run the risk of OID wraparound when
you don't have to?

Next, other applications might not need to parse whatever the database
reports, or may know in advance what they've inserted. So why do you want
to, by default, impose the overhead of this special hidden column on these
other applications?

> Anyways, I've got an idea: what about having option that INSERTs return
> "oid_status" in form...

I don't understand exactly how an INSERT statement "returns" anything.
An INSERT statement is not a function, is it?

However, I have no objection to adding a function or other method to get
the primary key of the most recent insertion, assuming it exists, for
those folks with multi-column primary keys. Presumably it would generate
a result set just like a regular SELECT....

cjs
-- 
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org   Don't you know, in this new Dark Age, we're
alllight.  --XTC

Re: Call for objections: put back OIDs in CREATE TABLE

From

Antti Haapala

Date:

27 January 2003, 06:47:48

> > Anyways, I've got an idea: what about having option that INSERTs return
> > "oid_status" in form...
>
> I don't understand exactly how an INSERT statement "returns" anything.
> An INSERT statement is not a function, is it?

I mean the backend message CompletedResponse (and
s/oid_status/PQoidStatus/ (as it's written in libpq-fe.h)) (ok, it's
deprecated now in favor of PQoidValue).

From postgresql docs see section Frontend-Backend protocol:
   Byte1('C')
       Identifies the message as a completed response.
   String
       The  command  tag. This is usually a single word thatidentifies which SQL command was completed.
       For  an  INSERT  command,  the tag is INSERT oid rows,       where rows is the number of rows  inserted,  and
oidis the  object  ID  of  the inserted row if rows is 1,       otherwise oid is 0.
 


Wouldn't it be nice to add here
If table doesn't contain row oids, in place of oid       is the primary key of the newly inserted record (if any)as
column= 'value' [ and column = 'value [ and ... ]]
 


-- 
Antti Haapala

Re: Call for objections: put back OIDs in CREATE TABLE

From

Curt Sampson

Date:

27 January 2003, 17:49:05

On Mon, 27 Jan 2003, Antti Haapala wrote:

>        For an INSERT command, the tag is INSERT oid rows, where rows
>       is the number of rows inserted, and oid is the object ID of the
>       inserted row if rows is 1, otherwise oid is 0.
>
> Wouldn't it be nice to add here
>
>       If table doesn't contain row oids, in place of oid is the
>       primary key of the newly inserted record (if any) as column =
>       'value' [ and column = 'value [ and ... ]]

Well, as was pointed out, that's a lot of data to send back if your
primary key is a huge text column, and you've still got a problem if
you have a multi-column primary key.

Since this sort of functionality is not all that frequently needed, I'd
still say it would probably be cleaner to make a new query that selects
the most recently inserted primary key. That means that a) you don't
have to send back a potentially large amount of data unless the user
asks for it, and b) multi-column primary keys work just fine.

cjs
-- 
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org   Don't you know, in this new Dark Age, we're
alllight.  --XTC