Thread: are primary keys always 'needed'

are primary keys always 'needed'

From
Serge Fonville
Date:
Hi,

I was wondering if primary keys are always needed.
What I understand;
- The value of a primary key is unique per table
- Primary keys are unique and not null.

I have a couple of tables.
Some have a serial that is used as a foreign key in another table.
Some tables consist of a combination of two foreign keys (that are
unique together) and a field that is uniquely related to that
combination (but is not necessarily unique within the table)
Others have no unique field or combination at all

For the tables that are only used in the foreign part of the
relationship(s), is there an added value for user of a surrogate
primary key, which will never be referenced.

I did a lot of googling and found very varying opinions.
Mainly:
- Always use a primary key, no reason why, it's just 'better'
- Use primary keys when it makes sense.

How do I determine what 'sense' or 'better' mean?

Thanks for all the feedback

Regards,

Serge Fonville

--
http://www.sergefonville.nl

Convince Google!!
They need to support Adsense over SSL
https://www.google.com/adsense/support/bin/answer.py?hl=en&answer=10528
http://www.google.com/support/forum/p/AdSense/thread?tid=1884bc9310d9f923&hl=en

Re: are primary keys always 'needed'

From
Tom Lane
Date:
Serge Fonville <serge.fonville@gmail.com> writes:
> I was wondering if primary keys are always needed.

As you say, there are different schools of thought on that.  To my mind,
the principal argument for "always have a primary key" is that it
guarantees that no two rows in your table are exactly alike.  The reason
why that's a good idea is that SQL only allows you to reference rows by
content.  If you have two identical rows, and you wish to get rid of
just one of them, how you gonna do that?  If there is no difference you
can use in DELETE WHERE ..., you're stuck.

Now there is an "out" available in Postgres, because in PG there is
always a hidden system column called "ctid" which is the row's physical
location (to oversimplify a bit).  So when your back is to the wall you
can reference a specific row by ctid, even if you neglected to provide
a primary key.  But not all DBMSes have an equivalent concept.

Anyway, I'm not in agreement with the school that says you must always
have an explicitly named primary key.  Sometimes there are other
constraints that have the same effect of ensuring no duplicates, and
sometimes you just plain don't care if there are duplicates.  But it's
always a good idea to think about that carefully.  I believe that in
*most* cases you want a primary key.

            regards, tom lane

Re: are primary keys always 'needed'

From
Tom Lane
Date:
Serge Fonville <serge.fonville@gmail.com> writes:
> Some have a serial that is used as a foreign key in another table.
> Some tables consist of a combination of two foreign keys (that are
> unique together) and a field that is uniquely related to that
> combination (but is not necessarily unique within the table)

BTW, I forgot to mention that it's perfectly reasonable to have a
multi-column primary key, which is what seems to be indicated in
this type of example.  I wouldn't advocate making up a surrogate
primary key in a linking table, if the combination of its foreign
keys can do the job.

            regards, tom lane

Re: are primary keys always 'needed'

From
Serge Fonville
Date:
Thanks for the reply.

>> Some have a serial that is used as a foreign key in another table.
>> Some tables consist of a combination of two foreign keys (that are
>> unique together) and a field that is uniquely related to that
>> combination (but is not necessarily unique within the table)
>
> BTW, I forgot to mention that it's perfectly reasonable to have a
> multi-column primary key, which is what seems to be indicated in
> this type of example.  I wouldn't advocate making up a surrogate
> primary key in a linking table, if the combination of its foreign
> keys can do the job.

So when I have a table that exists only on the MANY-end of the
relation and in now way is ever to be used as a an entity in the
ONE-end of the relatonship.
There are no benefits to specifying a primary key if a combination of
two fields (that already have a unique not null constraint anyway) to
replace those with a primary key?

Are there any other benefits to a primary key other than unique not
null constraints.
For performance.
For example, if I create a primary key that is never used in any
query, but its just there'  to make the row unique.

Based on what you stated so far, I'd think:
A primary key is not necessary, but useful in uniquely identifying a record.

Thanks so far.

Regards,

Serge Fonville

--
http://www.sergefonville.nl

Convince Google!!
They need to support Adsense over SSL
https://www.google.com/adsense/support/bin/answer.py?hl=en&answer=10528
http://www.google.com/support/forum/p/AdSense/thread?tid=1884bc9310d9f923&hl=en

Re: are primary keys always 'needed'

From
Michael Glaesemann
Date:
On Feb 28, 2010, at 11:38 , Serge Fonville wrote:

> Hi,
>
> I was wondering if primary keys are always needed.
> What I understand;
> - The value of a primary key is unique per table
> - Primary keys are unique and not null.

Yes, in the sense that according to relational theory, a relation
(table) contains unique rows, each of which is (uniquely) identified
by its key. Note that a key is not necessarily a single attribute
(column): a key may be a "compound" key made up of a number of columns.

> I have a couple of tables.
> Some have a serial that is used as a foreign key in another table.
> Some tables consist of a combination of two foreign keys (that are
> unique together) and a field that is uniquely related to that
> combination (but is not necessarily unique within the table)
> Others have no unique field or combination at all

this "two foreign keys (that are unique together)" is an example of a
compound key.

>
> For the tables that are only used in the foreign part of the
> relationship(s), is there an added value for user of a surrogate
> primary key, which will never be referenced.

Most likely not. Note this question (is there added value of a
surrogate primary key?) is different from your initial question: (are
primary keys always needed?). Do not confuse primary keys with
surrogate keys (often generated by an auto-incrementing value such as
a serial). A surrogate is a meaningless attribute used only to
uniquely identify the row. If you *do* use a surrogate key, you should
also identify (and enforce) the "natural" (perhaps compound) key of
the table. Otherwise you may very well get duplicates of the "natural"
data even though the meaningless surrogate key is unique across the
table.

> I did a lot of googling and found very varying opinions.
> Mainly:
> - Always use a primary key, no reason why, it's just 'better'
> - Use primary keys when it makes sense.
>
> How do I determine what 'sense' or 'better' mean?

This depends on your application. Here are a few things to think about.

* Surrogate keys require joins or lookups to retrieve the actual data
values they represent.
* Surrogate keys require additional space in the table that holds the
actual data values and an extra index to enforce the uniqueness of
these values.
* Surrogate keys are often smaller in terms of size on disk than the
data they refer to. This counts for both tables including the
surrogate key and the indexes that include them.
* Integer surrogate keys used to represent text data are often faster
to look up, as integer comparisons are faster than string comparisons.

This is not an exhaustive list. Some people are vehemently opposed to
surrogate keys; others use them blindly. I think there are cases where
using surrogate keys does make sense, taking into account the trade-
offs when using them.

Hope this helps.

Michael Glaesemann
grzm seespotcode net




Re: are primary keys always 'needed'

From
"Eric Comeau"
Date:
"Michael Glaesemann" <grzm@seespotcode.net> wrote in message
news:E93B93F4-8616-4A23-8AC5-84749581E067@seespotcode.net...
>
> On Feb 28, 2010, at 11:38 , Serge Fonville wrote:

 snip

>> I did a lot of googling and found very varying opinions.
>> Mainly:
>> - Always use a primary key, no reason why, it's just 'better'
>> - Use primary keys when it makes sense.
>>
>> How do I determine what 'sense' or 'better' mean?
>
> This depends on your application. Here are a few things to think about.
>
> * Surrogate keys require joins or lookups to retrieve the actual data
> values they represent.
> * Surrogate keys require additional space in the table that holds the
> actual data values and an extra index to enforce the uniqueness of  these
> values.
> * Surrogate keys are often smaller in terms of size on disk than the  data
> they refer to. This counts for both tables including the  surrogate key
> and the indexes that include them.
> * Integer surrogate keys used to represent text data are often faster  to
> look up, as integer comparisons are faster than string comparisons.
>
> This is not an exhaustive list. Some people are vehemently opposed to
> surrogate keys; others use them blindly. I think there are cases where
> using surrogate keys does make sense, taking into account the trade- offs
> when using them.
>

If you are planning on using an ORM framework such as "Hibernate" on top of
this schema you will want to pay more attention to this PK issue (surrogate
vs. natural) as it was a pain for us to setup primary keys on multi-columns
initially. As well most ORM frameworks will want/require a PK on a table to
operate properly.

There was a post/link on tips to follow if using Hibernate/PostgreSQL which
I saved somewhere but can't easily find right now.

Regards, Eric



Re: are primary keys always 'needed'

From
Jasen Betts
Date:
On 2010-02-28, Serge Fonville <serge.fonville@gmail.com> wrote:
> Hi,
>
> I was wondering if primary keys are always needed.

The answer is no. but they are usually a very good idea.

> What I understand;
> - The value of a primary key is unique per table
> - Primary keys are unique and not null.
>
> I have a couple of tables.
> Some have a serial that is used as a foreign key in another table.
case (a)
> Some tables consist of a combination of two foreign keys (that are
> unique together) and a field that is uniquely related to that
> combination (but is not necessarily unique within the table)
case (b)
> Others have no unique field or combination at all
case (c)

> For the tables that are only used in the foreign part of the
> relationship(s), is there an added value for user of a surrogate
> primary key, which will never be referenced.

In case (b) you can use the pair of external key columns as the primary key.

> I did a lot of googling and found very varying opinions.

> - Always use a primary key, no reason why, it's just 'better'

Some software requires a single-column primary key.

> - Use primary keys when it makes sense.

> How do I determine what 'sense' or 'better' mean?

first you need to understand what the data and relationships represent.

If you never need to uniqely identify a record you don't need a
primary key.

consider the following table fragment

 mapref   | placename
 ---------+----------
 10,10      | New York
 10,10    | Noo Yawk
 10,10.01 | New York
 10,10    | New York

you may or may not want to allow data with repeats like that.
and if you want to delete or update only one of the ('10,10','New York')
rows you'll neet to take special care.