Thread: question on serial key

question on serial key

From
Brandon Metcalf
Date:
This may be better discussed in the pgsql-sql forum.  Please let me
know if so.

I am looking for criteria on deciding whether or not to use a serial
(auto-incrementing) key for rows in a table.  For example, if I have a
table of, say, the elements on the periodic table I could use the
atomic number as a unique key which would not be a serial type.
Another example is where I'm inserting lots of rows in a given period
of time and I need to go back and select the max id of one of these
rows; I would need a serial key in this case.

Intuitively, it's pretty clear to me when a serial index is called
for.  Is there a succinct set of guidelines that one could go by?

Thanks.

--
Brandon

Re: question on serial key

From
Grzegorz Jaśkiewicz
Date:
you should use it, whenever you need db to keep its own key internally.
Advantage of sequence is also the fact, that you can have the sequence
value used on different columns/tables .

My rule of thumb is , in that case: as long as it is a short type (not
of toastable, or/and variable length), and as long as it won't change,
and is unique - I can use it. Otherwise, I use sequence to connect
rows internally for database.
First rule, is because of index access, and the way btree works.
Second is, because update of value will update other rows too - and
HOT won't help you here, so that's not efficient. And also, forcing it
to be unique is harder than.

Hth.

Re: question on serial key

From
Brandon Metcalf
Date:
g == gryzman@gmail.com writes:

 g> you should use it, whenever you need db to keep its own key internally.
 g> Advantage of sequence is also the fact, that you can have the sequence
 g> value used on different columns/tables .

 g> My rule of thumb is , in that case: as long as it is a short type (not
 g> of toastable, or/and variable length), and as long as it won't change,
 g> and is unique - I can use it. Otherwise, I use sequence to connect
 g> rows internally for database.
 g> First rule, is because of index access, and the way btree works.
 g> Second is, because update of value will update other rows too - and
 g> HOT won't help you here, so that's not efficient. And also, forcing it
 g> to be unique is harder than.

 g> Hth.

That does help.  So, in my example of a table consisting of rows for
each periodic table element, the atomic number would suffice as a
unique key since, well, it's unique and not going to change.  Right?

--
Brandon

Re: question on serial key

From
Grzegorz Jaśkiewicz
Date:
2009/5/22 Brandon Metcalf <brandon@geronimoalloys.com>:

> That does help.  So, in my example of a table consisting of rows for
> each periodic table element, the atomic number would suffice as a
> unique key since, well, it's unique and not going to change.  Right?

Well, yes :)
At least that's my opinion and experience.

Now, having said that, you're good until physics doesn't question some
of the math there, and doesn't change that number for some reason ;)



--
GJ

Re: question on serial key

From
Sam Mason
Date:
On Fri, May 22, 2009 at 08:41:46AM -0500, Brandon Metcalf wrote:
> I am looking for criteria on deciding whether or not to use a serial
> (auto-incrementing) key for rows in a table.

Wow, that's the second time today someone asked that!

> Intuitively, it's pretty clear to me when a serial index is called
> for.  Is there a succinct set of guidelines that one could go by?

Not that I'm aware of; it's a fuzzy design choice with benefits and
costs for either option.  There are lots of people who arbitrarily
pick one side which tends to make things worse, using one or the other
*exclusively* will add complication.  General terms to search for are
Natural keys vs. Surrogate keys.

--
  Sam  http://samason.me.uk/

Re: question on serial key

From
"Roderick A. Anderson"
Date:
Brandon Metcalf wrote:
> g == gryzman@gmail.com writes:
>
>  g> you should use it, whenever you need db to keep its own key internally.
>  g> Advantage of sequence is also the fact, that you can have the sequence
>  g> value used on different columns/tables .
>
>  g> My rule of thumb is , in that case: as long as it is a short type (not
>  g> of toastable, or/and variable length), and as long as it won't change,
>  g> and is unique - I can use it. Otherwise, I use sequence to connect
>  g> rows internally for database.
>  g> First rule, is because of index access, and the way btree works.
>  g> Second is, because update of value will update other rows too - and
>  g> HOT won't help you here, so that's not efficient. And also, forcing it
>  g> to be unique is harder than.
>
>  g> Hth.
>
> That does help.  So, in my example of a table consisting of rows for
> each periodic table element, the atomic number would suffice as a
> unique key since, well, it's unique and not going to change.  Right?

Well you never know.  They took planet status away from Pluto.  :-)


\\||/
Rod
--



Re: question on serial key

From
Grzegorz Jaśkiewicz
Date:
On Fri, May 22, 2009 at 3:26 PM, Roderick A. Anderson
<raanders@cyber-office.net> wrote:
> Brandon Metcalf wrote:
>> That does help.  So, in my example of a table consisting of rows for
>> each periodic table element, the atomic number would suffice as a
>> unique key since, well, it's unique and not going to change.  Right?
>
> Well you never know.  They took planet status away from Pluto.  :-)

*g*


--
GJ

Re: question on serial key

From
Brandon Metcalf
Date:
s == sam@samason.me.uk writes:

 s> On Fri, May 22, 2009 at 08:41:46AM -0500, Brandon Metcalf wrote:
 s> > I am looking for criteria on deciding whether or not to use a serial
 s> > (auto-incrementing) key for rows in a table.

 s> Wow, that's the second time today someone asked that!

 s> > Intuitively, it's pretty clear to me when a serial index is called
 s> > for.  Is there a succinct set of guidelines that one could go by?

 s> Not that I'm aware of; it's a fuzzy design choice with benefits and
 s> costs for either option.  There are lots of people who arbitrarily
 s> pick one side which tends to make things worse, using one or the other
 s> *exclusively* will add complication.  General terms to search for are
 s> Natural keys vs. Surrogate keys.

The search terms help.  I wasn't searching for the right thing and
finding very little information.

--
Brandon

Re: question on serial key

From
Grzegorz Jaśkiewicz
Date:
On Fri, May 22, 2009 at 3:33 PM, Sam Mason <sam@samason.me.uk> wrote:
> Not that I'm aware of; it's a fuzzy design choice with benefits and
> costs for either option.  There are lots of people who arbitrarily
> pick one side which tends to make things worse, using one or the other
> *exclusively* will add complication.  General terms to search for are
> Natural keys vs. Surrogate keys.

Yes, it is always worth reading on subject. But that's pure theory,
now there are also database specific concerns. Hence my 'own
experience' suggestion. :)



--
GJ

Re: question on serial key

From
Scott Marlowe
Date:
On Fri, May 22, 2009 at 9:04 AM, Brandon Metcalf
<brandon@geronimoalloys.com> wrote:
> s == sam@samason.me.uk writes:
>
>  s> On Fri, May 22, 2009 at 08:41:46AM -0500, Brandon Metcalf wrote:
>  s> > I am looking for criteria on deciding whether or not to use a serial
>  s> > (auto-incrementing) key for rows in a table.
>
>  s> Wow, that's the second time today someone asked that!
>
>  s> > Intuitively, it's pretty clear to me when a serial index is called
>  s> > for.  Is there a succinct set of guidelines that one could go by?
>
>  s> Not that I'm aware of; it's a fuzzy design choice with benefits and
>  s> costs for either option.  There are lots of people who arbitrarily
>  s> pick one side which tends to make things worse, using one or the other
>  s> *exclusively* will add complication.  General terms to search for are
>  s> Natural keys vs. Surrogate keys.
>
> The search terms help.  I wasn't searching for the right thing and
> finding very little information.

The periodic table of the elements, state names, etc are all the kind
of data used in what I call lookup tables.  They tend to be static,
and are used to ensure that the rest of the database are using the
proper values.  In these cases it's almost always best to use the
natural key, and FK to that from another table.

OTOH, if you've got things like customer records, and there are
millions of them, it's often best to use a surrogate key because it's
usually smaller and provides better performance where it counts.  A
well designed database will often use both types of keys, because they
solve different problems in terms of performance, durability,
abstraction, etc.

Re: question on serial key

From
Jasen Betts
Date:
On 2009-05-22, Brandon Metcalf <brandon@geronimoalloys.com> wrote:
> g == gryzman@gmail.com writes:
>
>  g> you should use it, whenever you need db to keep its own key internally.
>  g> Advantage of sequence is also the fact, that you can have the sequence
>  g> value used on different columns/tables .
>
>  g> My rule of thumb is , in that case: as long as it is a short type (not
>  g> of toastable, or/and variable length), and as long as it won't change,
>  g> and is unique - I can use it. Otherwise, I use sequence to connect
>  g> rows internally for database.
>  g> First rule, is because of index access, and the way btree works.
>  g> Second is, because update of value will update other rows too - and
>  g> HOT won't help you here, so that's not efficient. And also, forcing it
>  g> to be unique is harder than.
>
>  g> Hth.
>
> That does help.  So, in my example of a table consisting of rows for
> each periodic table element, the atomic number would suffice as a
> unique key since, well, it's unique and not going to change.  Right?

Well, until some fool* wants to insert a row for deuterium into your
table, and finds that spot taken by hydrogen.

If you can guarantee that you chosen natural key is sufficient and i
s not going to give trouble in the case of marriages, isomerism, twin
birth, isotopes, centegenarians, or some other condition that makes a
mockery of your chosen key then go for it.

*assuming you want it to be fool-proof.