Thread: unique constraint instead of primary key? what disadvantage(ODBC usage)?

unique constraint instead of primary key? what disadvantage(ODBC usage)?

From
george young
Date:
[PostgreSQL 8.1.0 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.0.1]
I'm designing a completely new schema for my database.  A major
criterion is that it facilitate ad-hoc queries via MS-access, excel and
OpenOffice, presumably with ODBC.

My question regards the use of UNIQUE constraints instead of PRIMARY
KEY's on some tables.  Both result in an index on the specified tuple
of fields, so I presume query performance shouldn't be much different.

Using UNIQUE constraints seems to let me better match the natural
structure of my data.  A 'run' contains a sequence of 'opsets'.
Each opset contains a sequence of (a few) 'step's.
  run-foo     opset-1        step-1        step-2     opset-2        step-1

So the 'steps' table is logically indexed by (run, opset_num, step_num).
But some opsets are not in runs, and some steps are not in opsets, so
I would have step.run be null in some cases, likewise step.opset_num.

Null values mean I can't use these fields in a primary key, so I
propose to use UNIQUE constraints instead.

What am I losing by not using PRIMARY KEYS?  Will ODBC clients have
difficulty dealing nicely with the database?  Will the planner produce
lousy query plans?  Will Edgar Codd haunt my dreams?

-- George Young


-- 
"Are the gods not just?"  "Oh no, child.
What would become of us if they were?" (CSL)


Re: unique constraint instead of primary key? what

From
Ken Hill
Date:
On Wed, 2006-02-08 at 21:04 -0500, george young wrote: <blockquote type="CITE"><pre>
<font color="#000000">[PostgreSQL 8.1.0 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.0.1]</font>
<font color="#000000">I'm designing a completely new schema for my database.  A major</font>
<font color="#000000">criterion is that it facilitate ad-hoc queries via MS-access, excel and</font>
<font color="#000000">OpenOffice, presumably with ODBC.</font>

<font color="#000000">My question regards the use of UNIQUE constraints instead of PRIMARY</font>
<font color="#000000">KEY's on some tables.  Both result in an index on the specified tuple</font>
<font color="#000000">of fields, so I presume query performance shouldn't be much different.</font>

<font color="#000000">Using UNIQUE constraints seems to let me better match the natural</font>
<font color="#000000">structure of my data.  A 'run' contains a sequence of 'opsets'.</font>
<font color="#000000">Each opset contains a sequence of (a few) 'step's.</font>

<font color="#000000">   run-foo</font>
<font color="#000000">      opset-1</font>
<font color="#000000">         step-1</font>
<font color="#000000">         step-2</font>
<font color="#000000">      opset-2</font>
<font color="#000000">         step-1</font>

<font color="#000000">So the 'steps' table is logically indexed by (run, opset_num, step_num).</font>
<font color="#000000">But some opsets are not in runs, and some steps are not in opsets, so</font>
<font color="#000000">I would have step.run be null in some cases, likewise step.opset_num.</font>

<font color="#000000">Null values mean I can't use these fields in a primary key, so I</font>
<font color="#000000">propose to use UNIQUE constraints instead.</font>

<font color="#000000">What am I losing by not using PRIMARY KEYS?  Will ODBC clients have</font>
<font color="#000000">difficulty dealing nicely with the database?  Will the planner produce</font>
<font color="#000000">lousy query plans?  Will Edgar Codd haunt my dreams?</font>

<font color="#000000">-- George Young</font>


</pre></blockquote> I think I can give you some insights about MS Access to help you. In MS Access, you can specify a
columnas a "primary key"; which basically means the column is indexed and must contain unique values (also, nulls are
notallowed). I have run into problems depending on columns being "primary key" in MS Access in db apps that receive
datain batch file uploads from other sources (e.g., uploading 1,000+ records into a table). <br /><br /> Is sounds like
yourrequirement to use MS Access for ad-hoc queries means that you will have some users that want to access the
databasewith MS Access as a "front-end" client tool. If that is the situation, then you don't need to worry about the
structureof the table as MS Access relies on ODBC for this. You may also want to communicate to the end users that MS
Accessis not a client-server tool; in other words, all of the records are transferred from the server to the client's
boxand then the query is executed.<br /><br /> I hope that helps.  

Re: unique constraint instead of primary key? what

From
george young
Date:
On Wed, 08 Feb 2006 18:34:22 -0800
Ken Hill <ken@scottshill.com> threw this fish to the penguins:

> On Wed, 2006-02-08 at 21:04 -0500, george young wrote:
> 
> > [PostgreSQL 8.1.0 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.0.1]
> > I'm designing a completely new schema for my database.  A major
> > criterion is that it facilitate ad-hoc queries via MS-access, excel and
> > OpenOffice, presumably with ODBC.
> > 
> > My question regards the use of UNIQUE constraints instead of PRIMARY
> > KEY's on some tables.  Both result in an index on the specified tuple
> > of fields, so I presume query performance shouldn't be much different.
> > 
> > Using UNIQUE constraints seems to let me better match the natural
> > structure of my data.  A 'run' contains a sequence of 'opsets'.
> > Each opset contains a sequence of (a few) 'step's.
> > 
> >    run-foo
> >       opset-1
> >          step-1
> >          step-2
> >       opset-2
> >          step-1
> > 
> > So the 'steps' table is logically indexed by (run, opset_num, step_num).
> > But some opsets are not in runs, and some steps are not in opsets, so
> > I would have step.run be null in some cases, likewise step.opset_num.
> > 
> > Null values mean I can't use these fields in a primary key, so I
> > propose to use UNIQUE constraints instead.
> > 
> > What am I losing by not using PRIMARY KEYS?  Will ODBC clients have
> > difficulty dealing nicely with the database?  Will the planner produce
> > lousy query plans?  Will Edgar Codd haunt my dreams?
> > 
> > -- George Young
> > 
> > 
> 
> I think I can give you some insights about MS Access to help you. In MS
> Access, you can specify a column as a "primary key"; which basically
> means the column is indexed and must contain unique values (also, nulls
> are not allowed). I have run into problems depending on columns being
> "primary key" in MS Access in db apps that receive data in batch file
> uploads from other sources (e.g., uploading 1,000+ records into a
> table). 
> 
> Is sounds like your requirement to use MS Access for ad-hoc queries
> means that you will have some users that want to access the database
> with MS Access as a "front-end" client tool. If that is the situation,
> then you don't need to worry about the structure of the table as MS
> Access relies on ODBC for this. You may also want to communicate to the
> end users that MS Access is not a client-server tool; in other words,
> all of the records are transferred from the server to the client's box
> and then the query is executed.

Ouch!  A good portion of queries will access my 4M row parameter table
in joins with other tables.  It sounds like MS access is not workable.
Thanks for the info.

> 
> I hope that helps.
> 


-- 
"Are the gods not just?"  "Oh no, child.
What would become of us if they were?" (CSL)


Re: unique constraint instead of primary key? what

From
Richard Huxton
Date:
george young wrote:
> On Wed, 08 Feb 2006 18:34:22 -0800
> Ken Hill <ken@scottshill.com> threw this fish to the penguins:
> 
>> On Wed, 2006-02-08 at 21:04 -0500, george young wrote:
>>>
>>> So the 'steps' table is logically indexed by (run, opset_num, step_num).
>>> But some opsets are not in runs, and some steps are not in opsets, so
>>> I would have step.run be null in some cases, likewise step.opset_num.
>>>
>>> Null values mean I can't use these fields in a primary key, so I
>>> propose to use UNIQUE constraints instead.

NULL means "unknown", so a UNIQUE constraint on (run, opset_num, NULL) 
probably doesn't do what you want it to.
At its most basic, ('A',1,null) does NOT equal ('A',1,null). It can't do 
so - two unknowns can't be said to be the same.

>> Is sounds like your requirement to use MS Access for ad-hoc queries
>> means that you will have some users that want to access the database
>> with MS Access as a "front-end" client tool. If that is the situation,
>> then you don't need to worry about the structure of the table as MS
>> Access relies on ODBC for this. You may also want to communicate to the
>> end users that MS Access is not a client-server tool; in other words,
>> all of the records are transferred from the server to the client's box
>> and then the query is executed.
> 
> Ouch!  A good portion of queries will access my 4M row parameter table
> in joins with other tables.  It sounds like MS access is not workable.
> Thanks for the info.

You can have Access "pass through" queries though which does what you want.

--   Richard Huxton  Archonet Ltd


Re: unique constraint instead of primary key? what disadvantage(ODBC usage)?

From
"BigSmoke"
Date:
If my tables have one or more UNIQUE constraints/indices, I still add a
"id SERIAL PRIMARY KEY" field to most of my tables. This makes
referencing easier and faster. It also improves consistency, which is
never a bad thing in my opinion.

As far as I know, though, PRIMARY KEY does the same thing as UNIQUE NOT
NULL in PostgreSQL.

The reason that PRIMARY KEY can't be NULL and _has to be_ UNIQUE is
that it is the primary means of identifying a given record in a table.
If you don't have PRIMARY KEY that is UNIQUE and NOT NULL, how are you
going to identify (or reference) individual records? PostgreSQL won't
allow you to reference more than one row for what I came to believe are
very good reasons.
 - Rowan



"BigSmoke" <bigsmoke@gmail.com> writes:
> As far as I know, though, PRIMARY KEY does the same thing as UNIQUE NOT
> NULL in PostgreSQL.

They are 99.9% the same --- the *only* difference AFAIR is that PRIMARY
KEY establishes a default column-to-reference for FOREIGN KEY references
to the table.  UNIQUE doesn't; you can make a FOREIGN KEY reference to a
column that's only UNIQUE, but you'll always have to specify which column.
        regards, tom lane


Re: unique constraint instead of primary key? what

From
george young
Date:
On 9 Feb 2006 08:22:59 -0800
"BigSmoke" <bigsmoke@gmail.com> threw this fish to the penguins:

> If my tables have one or more UNIQUE constraints/indices, I still add a
> "id SERIAL PRIMARY KEY" field to most of my tables. This makes
> referencing easier and faster. It also improves consistency, which is
> never a bad thing in my opinion.

In this schema overhaul I'm trying to *eliminate* arbitrary "id" columns like
this.  They may sometimes improve performance, but certainly obscure the
meaning of the data -- naive [read-only] users accessing through excel are
confused by this sort of thing.

> As far as I know, though, PRIMARY KEY does the same thing as UNIQUE NOT
> NULL in PostgreSQL.
> 
> The reason that PRIMARY KEY can't be NULL and _has to be_ UNIQUE is
> that it is the primary means of identifying a given record in a table.
> If you don't have PRIMARY KEY that is UNIQUE and NOT NULL, how are you
> going to identify (or reference) individual records? PostgreSQL won't
> allow you to reference more than one row for what I came to believe are
> very good reasons.

Actually, I don't really see a problem here.  E.g. when I want data from
a step that has no parent run and no parent opset, I could say:

select * from steps where run is null and opset is null and step='mystep';

I don't understand what you mean by "PostgreSQL won't allow you to
reference more than one row".  If the above query returns 10 rows,
what's wrong with that?  Nothing *requires* a PRIMARY KEY at all, anyway.

-- puzzled...George


-- 
"Are the gods not just?"  "Oh no, child.
What would become of us if they were?" (CSL)


Re: unique constraint instead of primary key? what

From
"BigSmoke"
Date:
I mean that you can't easily base a foreign key constraint on a field
that is not NOT NULL UNIQUE.
 - Rowan



Re: unique constraint instead of primary key? what

From
Chris Browne
Date:
gry@ll.mit.edu (george young) writes:
> On 9 Feb 2006 08:22:59 -0800
> "BigSmoke" <bigsmoke@gmail.com> threw this fish to the penguins:
>
>> If my tables have one or more UNIQUE constraints/indices, I still add a
>> "id SERIAL PRIMARY KEY" field to most of my tables. This makes
>> referencing easier and faster. It also improves consistency, which is
>> never a bad thing in my opinion.
>
> In this schema overhaul I'm trying to *eliminate* arbitrary "id" columns like
> this.  They may sometimes improve performance, but certainly obscure the
> meaning of the data -- naive [read-only] users accessing through excel are
> confused by this sort of thing.

Actually, that's not the only people that get confused.

I'm not usually considered overly naive, and there is a database that
I have been using for reverse-engineering purposes of late which is
filled with both "id" and "v_id" columns which *very* much obscure the
meaning of the data.

I'm not adverse to having some such thing; I *am* adverse to giving
them such generic names.

It would be nice to be certain that the "would-be primary key
characteristics" that you'd probably rather use are sufficiently
permanent to be satisfactory.  Unfortunately, they often aren't, or
aren't acceptably usable.
-- 
output = reverse("gro.gultn" "@" "enworbbc")
http://cbbrowne.com/info/oses.html
"To conquer the enemy without  resorting to war is the most desirable.
The highest form of generalship  is to conquer the enemy by strategy."
-- Sun Tzu, "The Art of War"