Thread: Remove implicit unique index creation on SERIAL columns?

Remove implicit unique index creation on SERIAL columns?

From
Rod Taylor
Date:
I'd like to propose dropping the auto-creation of UNIQUE indexes on
serial columns for the following reasons:

1. Serials with indexes are quite difficult to handle in pg_dump.  It
means that the implicitly created unique index must be destroyed prior
to loading the data, then re-created afterward else risk a performance
hit.


2. SERIAL columns are usually used as the primary key of the table.  As
such one must specify PRIMARY KEY at creation time to override the
implicitly created UNIQUE index.

3. Consistency with other databases.

MySQL's AUTO_INCREMENT suggests quite heavily that an index be applied,
but it doesn't appear to do it for you.

Oracle has SEQUENCES, but it is up to the user to associate them with a
column as per my understanding.  At least, thats all I could find.

SAP -> SERIAL and SERIAL(n).  No index is created (creates sequence /
default)



A nice side effect is that analyze.c will become somewhat cleaner.


Any thoughts?

Does anyone create serial columns without manually specifying UNIQUE
when wanting an index?  Examples in our documentation do.  See section
5.1.4 in the current docset:
http://developer.postgresql.org/docs/postgres/datatype.html


Removal of implicit UNIQUE index creation would not affect structure of
current or prior db versions -- strictly new tables on 7.3.




Re: Remove implicit unique index creation on SERIAL columns?

From
Joe Conway
Date:
Rod Taylor wrote:
> I'd like to propose dropping the auto-creation of UNIQUE indexes on
> serial columns for the following reasons:
> 
> 1. Serials with indexes are quite difficult to handle in pg_dump.  It
> means that the implicitly created unique index must be destroyed prior
> to loading the data, then re-created afterward else risk a performance
> hit.
> 
> 
> 2. SERIAL columns are usually used as the primary key of the table.  As
> such one must specify PRIMARY KEY at creation time to override the
> implicitly created UNIQUE index.
> 
> 3. Consistency with other databases.
> 
> MySQL's AUTO_INCREMENT suggests quite heavily that an index be applied,
> but it doesn't appear to do it for you.
> 
> Oracle has SEQUENCES, but it is up to the user to associate them with a
> column as per my understanding.  At least, thats all I could find.
> 
> SAP -> SERIAL and SERIAL(n).  No index is created (creates sequence /
> default)

FWIW, also MSSQL.

I agree 100%. If you want an index, unique constraint, or primary key on 
a SERIAL, I think you should explicitly add it. SERIAL should give me a 
column that automatically increments -- no more, no less.

Joe



Re: Remove implicit unique index creation on SERIAL columns?

From
Tom Lane
Date:
Joe Conway <mail@joeconway.com> writes:
> I agree 100%. If you want an index, unique constraint, or primary key on 
> a SERIAL, I think you should explicitly add it. SERIAL should give me a 
> column that automatically increments -- no more, no less.

Hmm, do you also want to eliminate the implicit NOT NULL constraint?

I think that efficiency and orthogonality are adequate reasons for
dissociating UNIQUE from SERIAL.  The efficiency argument is pretty
weak in the case of the NOT NULL part, though, so maybe backwards
compatibility should win out there.

Another line of thought: as near as I can tell, SQL92 allows defaults
and CHECK constraints to be associated with domains.  Taking the
viewpoint that SERIAL is a domain would say that the DEFAULT nextval()
is okay, and CHECK NOT NULL is okay, but UNIQUE is not okay to be
implied by the domain type.  Perhaps the SQL authors knew what they
were doing when they made that restriction...

(Note that although I just dissuaded Rod from actually turning SERIAL
into a domain on compatibility grounds, I don't see any reason why
we shouldn't use the spec's rules about domains to reason about how
it should work.  In a slightly longer timeframe we may decide that
we do want to make it a domain.)
        regards, tom lane


Re: Remove implicit unique index creation on SERIAL columns?

From
Joe Conway
Date:
Tom Lane wrote:
> Joe Conway <mail@joeconway.com> writes:
>>I agree 100%. If you want an index, unique constraint, or primary key on 
>>a SERIAL, I think you should explicitly add it. SERIAL should give me a 
>>column that automatically increments -- no more, no less.
> 
> Hmm, do you also want to eliminate the implicit NOT NULL constraint?
> 
> I think that efficiency and orthogonality are adequate reasons for
> dissociating UNIQUE from SERIAL.  The efficiency argument is pretty
> weak in the case of the NOT NULL part, though, so maybe backwards
> compatibility should win out there.

To be honest I wasn't thinking about NOT NULL. I'd agree with leaving 
that in place.

Maybe I should restate my comment above: SERIAL should give me a column 
that automatically increments -- no more, no less -- and it should not 
allow me to override the value that it gives. Hence an implicit NOT 
NULL, but also an implicit rejection of a manual insert/update of that 
field (how hard would this be to do?).

I know this causes problems for dumped and reloaded data. In MSSQL this 
is gotten around by allowing the properties of the data type to be 
altered, e.g. in MSSQL you can turn the IDENTITY property on or off (in 
fact, I just checked and MSSQL won't allow you to turn on IDENTITY for a 
column unless you turn off ALLOW NULLS). You can also specify an 
exception to the rule when running BCP (the bulk loader command line 
program).

The reason I think this behavior is good, is that it helps prevent toe 
loss from stray bullets. E.g. you manually add a row where you've 
specified some value that has not yet been reached by the sequence -- 
then when someday the sequence reaches said value, your insert fails on 
a duplicate primary key insertion attempt.

If you really need to be able to insert or update a field with an 
explicit value *sometimes* (and you really know what you're doing), then 
use a plain sequence and a default, not a SERIAL.

Anyway, just my thoughts.

Joe




Re: Remove implicit unique index creation on SERIAL columns?

From
Þórhallur Hálfdánarson
Date:
Hi

-*- Joe Conway <mail@joeconway.com> [ 2002-08-18 06:36 ]:
> Maybe I should restate my comment above: SERIAL should give me a column 
> that automatically increments -- no more, no less -- and it should not 
> allow me to override the value that it gives. Hence an implicit NOT 
> NULL, but also an implicit rejection of a manual insert/update of that 
> field (how hard would this be to do?).
> 
> I know this causes problems for dumped and reloaded data. In MSSQL this 
> is gotten around by allowing the properties of the data type to be 
> altered, e.g. in MSSQL you can turn the IDENTITY property on or off (in 
> fact, I just checked and MSSQL won't allow you to turn on IDENTITY for a 
> column unless you turn off ALLOW NULLS). You can also specify an 
> exception to the rule when running BCP (the bulk loader command line 
> program).
> 
> The reason I think this behavior is good, is that it helps prevent toe 
> loss from stray bullets. E.g. you manually add a row where you've 
> specified some value that has not yet been reached by the sequence -- 
> then when someday the sequence reaches said value, your insert fails on 
> a duplicate primary key insertion attempt.
> 
> If you really need to be able to insert or update a field with an 
> explicit value *sometimes* (and you really know what you're doing), then 
> use a plain sequence and a default, not a SERIAL.


I like the functionality you are suggesting.  However, when I started thinking about what size this field should be,
theidea of keeping the current SERIAL "data type" and having a SERIAL flag (CREATE TABLE foo (id int serial))... so
it'spretty much the same as a seqence, but enforced so that you can't tamper with it.  The best part about this is
obviouslythat it doesn't change behaviour of the old SERIAL.
 

Am I right about this, or am I missing something here?

-- 
Regards,
Tolli
tolli@tol.li


Re: Remove implicit unique index creation on SERIAL

From
Rod Taylor
Date:
On Sun, 2002-08-18 at 02:35, Joe Conway wrote:
> Tom Lane wrote:
> > Joe Conway <mail@joeconway.com> writes:
> >>I agree 100%. If you want an index, unique constraint, or primary key on 
> >>a SERIAL, I think you should explicitly add it. SERIAL should give me a 
> >>column that automatically increments -- no more, no less.
> > 
> > Hmm, do you also want to eliminate the implicit NOT NULL constraint?
> > 
> > I think that efficiency and orthogonality are adequate reasons for
> > dissociating UNIQUE from SERIAL.  The efficiency argument is pretty
> > weak in the case of the NOT NULL part, though, so maybe backwards
> > compatibility should win out there.
> 
> To be honest I wasn't thinking about NOT NULL. I'd agree with leaving 
> that in place.
> 
> Maybe I should restate my comment above: SERIAL should give me a column 
> that automatically increments -- no more, no less -- and it should not 
> allow me to override the value that it gives. Hence an implicit NOT 
> NULL, but also an implicit rejection of a manual insert/update of that 
> field (how hard would this be to do?).

I don't like not overriding the value.  A dataload example is a perfect
reason why you would want to.  Anyway, this would require placing 2
triggers on the table in order to prevent changes of the value.

Personally I prefer the method that SAPdb uses.  They make the column a
fixed() type (integer) and set the default.  Nothing about NOT NULL or
UNIQUE.

Anyway, I think SERIAL is about assisting creation of a entry, not
enforcing it.  Enforcement is trivial for those who don't mind the
additional overhead.



Re: Remove implicit unique index creation on SERIAL columns?

From
Tom Lane
Date:
Joe Conway <mail@joeconway.com> writes:
> I know this causes problems for dumped and reloaded data.

Yup.

> In MSSQL this 
> is gotten around by allowing the properties of the data type to be 
> altered, e.g. in MSSQL you can turn the IDENTITY property on or off

Rather pointless if it can be turned off, wouldn't you say?

What I would do if I wanted such a guarantee is to make insertions
go through a view rule that prevents the serial column from being
supplied directly.  That's a general technique that applies to more than
just serial columns, and it is also easy to control who has privileges
to bypass the view.  AFAICT this "IDENTITY" thingie is an extremely
limited kluge.
        regards, tom lane


Re: Remove implicit unique index creation on SERIAL

From
Rod Taylor
Date:
> > In MSSQL this 
> > is gotten around by allowing the properties of the data type to be 
> > altered, e.g. in MSSQL you can turn the IDENTITY property on or off
> 
> Rather pointless if it can be turned off, wouldn't you say?
> 
> What I would do if I wanted such a guarantee is to make insertions
> go through a view rule that prevents the serial column from being

In the future if we have per column privileges, we could disable insert
and update on serial columns by default for everyone but the table owner
and super-users (to accommodate dataloads).

Till then a trigger or view will do the job for those who wish to add
them.



Re: Remove implicit unique index creation on SERIAL columns?

From
Joe Conway
Date:
Tom Lane wrote:
> Joe Conway <mail@joeconway.com> writes:
>>In MSSQL this 
>>is gotten around by allowing the properties of the data type to be 
>>altered, e.g. in MSSQL you can turn the IDENTITY property on or off
> 
> Rather pointless if it can be turned off, wouldn't you say?

Not really. Turning it off takes an explict act of the dba (ALTER TABLE) 
for a special circumstance such as importing data. In the normal case 
while it's turned on, it is the developers who need and get the foot 
teflon (INSERT/UPDATE).


> What I would do if I wanted such a guarantee is to make insertions
> go through a view rule that prevents the serial column from being
> supplied directly.  That's a general technique that applies to more than
> just serial columns, and it is also easy to control who has privileges
> to bypass the view.

I thought about the view rule myself. Nice, but beyond the grasp of 
inexperienced users, who I think SERIAL is target for.

> AFAICT this "IDENTITY" thingie is an extremely limited kluge.

It is extremely limited -- that's why sequences are so much better 
(IMHO). But SERIAL is for those who want something simple to use, don't 
want to think about the implementation details, and don't mind it being 
limited. Otherwise why have it at all -- just make everyone use sequences.

Joe



Re: Remove implicit unique index creation on SERIAL columns?

From
Joe Conway
Date:
Rod Taylor wrote:
>>>In MSSQL this 
>>>is gotten around by allowing the properties of the data type to be 
>>>altered, e.g. in MSSQL you can turn the IDENTITY property on or off
>>
>>Rather pointless if it can be turned off, wouldn't you say?
>>
>>What I would do if I wanted such a guarantee is to make insertions
>>go through a view rule that prevents the serial column from being
> 
> 
> In the future if we have per column privileges, we could disable insert
> and update on serial columns by default for everyone but the table owner
> and super-users (to accommodate dataloads).

That sounds like the best way-to-go.


> Till then a trigger or view will do the job for those who wish to add
> them.

Agreed.

Joe



Re: Remove implicit unique index creation on SERIAL

From
Tom Lane
Date:
Since there didn't seem to be anyone objecting to the notion of
decoupling UNIQUE from SERIAL, I'm going to go ahead with
reviewing/applying Rod's recent patch that does that (and fixes pg_dump
to dump 7.3 serials correctly).  We can continue to debate about
the merits of making additional changes in SERIAL behavior, of course,
but this is the one that pg_dump needs.
        regards, tom lane


Re: Remove implicit unique index creation on SERIAL columns?

From
Bruce Momjian
Date:
Joe Conway wrote:
> Tom Lane wrote:
> > Joe Conway <mail@joeconway.com> writes:
> >>I agree 100%. If you want an index, unique constraint, or primary key on 
> >>a SERIAL, I think you should explicitly add it. SERIAL should give me a 
> >>column that automatically increments -- no more, no less.
> > 
> > Hmm, do you also want to eliminate the implicit NOT NULL constraint?
> > 
> > I think that efficiency and orthogonality are adequate reasons for
> > dissociating UNIQUE from SERIAL.  The efficiency argument is pretty
> > weak in the case of the NOT NULL part, though, so maybe backwards
> > compatibility should win out there.
> 
> To be honest I wasn't thinking about NOT NULL. I'd agree with leaving 
> that in place.
> 
> Maybe I should restate my comment above: SERIAL should give me a column 
> that automatically increments -- no more, no less -- and it should not 
> allow me to override the value that it gives. Hence an implicit NOT 
> NULL, but also an implicit rejection of a manual insert/update of that 
> field (how hard would this be to do?).

If don't understand.  We already have a unique index on the SERIAL
column, so why bother rejecting an insert/update that supplies the
value?  We need the column to be unique, and that is forced, but why
prevent _any_ unique value from being used.

Clearly NULL is not a valid value and should be prevented with NOT NULL.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Remove implicit unique index creation on SERIAL columns?

From
Stephan Szabo
Date:
On Mon, 19 Aug 2002, Bruce Momjian wrote:

> Joe Conway wrote:
> > Tom Lane wrote:
> > > Joe Conway <mail@joeconway.com> writes:
> > >>I agree 100%. If you want an index, unique constraint, or primary key on
> > >>a SERIAL, I think you should explicitly add it. SERIAL should give me a
> > >>column that automatically increments -- no more, no less.
> > >
> > > Hmm, do you also want to eliminate the implicit NOT NULL constraint?
> > >
> > > I think that efficiency and orthogonality are adequate reasons for
> > > dissociating UNIQUE from SERIAL.  The efficiency argument is pretty
> > > weak in the case of the NOT NULL part, though, so maybe backwards
> > > compatibility should win out there.
> >
> > To be honest I wasn't thinking about NOT NULL. I'd agree with leaving
> > that in place.
> >
> > Maybe I should restate my comment above: SERIAL should give me a column
> > that automatically increments -- no more, no less -- and it should not
> > allow me to override the value that it gives. Hence an implicit NOT
> > NULL, but also an implicit rejection of a manual insert/update of that
> > field (how hard would this be to do?).
>
> If don't understand.  We already have a unique index on the SERIAL
> column, so why bother rejecting an insert/update that supplies the
> value?  We need the column to be unique, and that is forced, but why
> prevent _any_ unique value from being used.

One reason is that the sequence won't respect those inserted values and
you'll get uniqueness errors on statements that don't give a value for the
column where you'd expect to be getting a working autogenerated value.



Re: Remove implicit unique index creation on SERIAL columns?

From
Bruce Momjian
Date:
Stephan Szabo wrote:
> > If don't understand.  We already have a unique index on the SERIAL
> > column, so why bother rejecting an insert/update that supplies the
> > value?  We need the column to be unique, and that is forced, but why
> > prevent _any_ unique value from being used.
> 
> One reason is that the sequence won't respect those inserted values and
> you'll get uniqueness errors on statements that don't give a value for the
> column where you'd expect to be getting a working autogenerated value.

Oh, it causes problems later on.  Interesting.  However, preventing
INSERT/UPDATE seems quite extreme.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073