Re: serial column - Mailing list pgsql-general

From Brandon Aiken
Subject Re: serial column
Msg-id F8E84F0F56445B4CB39E019EF67DACBA2EF527@exchsrvr.winemantech.com
In response to Re: serial column  ("Brandon Aiken" <BAiken@winemantech.com>)
List pgsql-general

Yes, but if I tear down the house at 245 Main St, we don't renumber 247 Main St and on down the line, do we?

The problem here is that even if you get the sequencing to work, your table is dangerously unstable.  If you have 700 rows and you delete row #200, suddenly rows 201-700 are wrong.  That means you can't just lock the row you're working on.  You'll need to lock the whole table from INSERT/UPDATE/SELECT/DELETE until you've rebuilt the table because the whole thing is suddenly false.

I still believe the best method is going to be the linked list method I suggested, and it's the only one I can think of that meets relational model requirements.  Finding the first item (the one with NULL parent) and the last item (the one with NULL child) is easy, deleting any item is easy (parent becomes parent of child, delete record), and even inserting an item anywhere is easy (insert new record, new record becomes child of parent and parent of parent's child).  The problem is that it's more difficult to ask for item #4 in the order or item #261 in the order.  You need an index for your linked list, which I'm guessing is precisely the problem.  :)  External indices to linked lists are another thing an SQL database doesn't handle very well, since it's all metadata and that adds to physical overhead.
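To make the "item #N is harder" point concrete, here is a minimal sketch of looking up a position by walking the links. It assumes PostgreSQL 8.4 or later for WITH RECURSIVE, and the table and column names (item, item_link, parent_id, child_id) are hypothetical stand-ins rather than anything from the original schema. It also assumes every item belongs to one chain, so exactly one item has no parent.

```sql
-- Hypothetical demo tables; names are illustrative only.
CREATE TEMP TABLE item (
    id   serial PRIMARY KEY,
    name text NOT NULL
);

-- One row per adjacent pair: parent_id comes immediately before child_id.
CREATE TEMP TABLE item_link (
    parent_id integer NOT NULL UNIQUE REFERENCES item (id),
    child_id  integer NOT NULL UNIQUE REFERENCES item (id),
    PRIMARY KEY (parent_id, child_id)
);

INSERT INTO item (name) VALUES ('a'), ('b'), ('c'), ('d');  -- ids 1..4

-- Desired order: c, a, d, b
INSERT INTO item_link (parent_id, child_id) VALUES (3, 1), (1, 4), (4, 2);

-- Walk the chain from the head (the item that is nobody's child),
-- counting positions as we go, then pick position 3.
WITH RECURSIVE ordered (id, pos) AS (
    SELECT i.id, 1
    FROM item i
    WHERE NOT EXISTS (SELECT 1 FROM item_link l WHERE l.child_id = i.id)
  UNION ALL
    SELECT l.child_id, o.pos + 1
    FROM ordered o
    JOIN item_link l ON l.parent_id = o.id
)
SELECT o.pos, i.name
FROM ordered o
JOIN item i ON i.id = o.id
WHERE o.pos = 3;
```

The query has to visit every link ahead of the requested position, so the cost of finding item #261 grows with its position in the list. That is the overhead an external index would be meant to avoid.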

So we return to the question: what purpose does this sequential order serve?  Why are gaps bad?  What problems are gaps causing?  Why does the database need to know the exact order?  Why can't your control code be aware of it instead?

You're asking the RDBMS to do exactly what it was designed *not* to do.  Rows are supposed to be unrelated objects or entries.  A table is *not* a list or an array.  Ordering the rows relates them, and makes your data less independent and your database less normalized.

In any case, I strongly recommend against using the ordering field as the primary key, simply because you're planning to change it so often.  Make it a unique key to enforce the constraint, but primary keys should generally be very stable fields.
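As a rough sketch of that separation, the table below keeps a stable surrogate primary key and puts the changeable number in its own unique column. The names (device, device_no) are hypothetical, and the DEFERRABLE clause on a unique constraint assumes PostgreSQL 9.0 or later; without it, the renumbering UPDATE can hit a transient duplicate and fail partway through.

```sql
-- Hypothetical table: id is the stable key, device_no is the gapless number.
CREATE TABLE device (
    id        serial  PRIMARY KEY,   -- never renumbered; safe for foreign keys
    name      text    NOT NULL,
    device_no integer NOT NULL,
    -- Deferrable so uniqueness is checked at the end of the statement,
    -- not row by row while the UPDATE below is shifting values.
    CONSTRAINT device_no_key UNIQUE (device_no) DEFERRABLE INITIALLY IMMEDIATE
);

INSERT INTO device (name, device_no)
VALUES ('pump', 1), ('valve', 2), ('sensor', 3);

-- Remove device #2 and close the gap in one transaction.
BEGIN;
DELETE FROM device WHERE device_no = 2;
UPDATE device SET device_no = device_no - 1 WHERE device_no > 2;
COMMIT;
```

Other tables would reference device (id), so the renumbering touches only the one attribute that is expected to change.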

Brandon Aiken

-----Original Message-----
From: Bob Pawley [mailto:rjpawley@shaw.ca]
Sent: Mon 9/25/2006 11:59 AM
To: Brandon Aiken; pgsql-general@postgresql.org
Subject: Re: [GENERAL] serial column

Actually, I am not trying to "force keys" nor, I don't believe, am I trying
to force a hierarchical structure within the database.

The numbers I want to assign to devices are nothing more than merely another
attribute of the device - perhaps akin to a number in a street address. The
problem, from my viewpoint, is that this attribute needs to always start at
1 and be sequential without gaps.

(I am, however, partly relying on a hierarchical order within the database.
When I assign numbers to devices, the lowest number is assigned,
sequentially, to the device that has the lowest serial ID number.)

Thanks for your comments - everything helps at my stage.

Bob Pawley





----- Original Message -----
From: "Brandon Aiken" <BAiken@winemantech.com>
To: <pgsql-general@postgresql.org>
Sent: Monday, September 25, 2006 7:47 AM
Subject: Re: [GENERAL] serial column


I would tend to agree with Tom.

A table is by definition an unordered set of records.  Forcing keys to
have meaning of this type implies that there is a relationship between
each record in the set.  That's information you should be storing as
part of the record.  If order is important, design the database so that
it knows that order relationship exists.

An ordered list is just a hierarchical database wherein every record has
exactly one parent (or none if it's the root) and exactly one child (or
none if it's the end leaf), but the relational model does a rather poor
job of handling hierarchical relationships.  You might consider the
two-way linked list approach.  That is, each record knows the item before
it and the item after it, like so:

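CREATE TABLE mainTable
(
    id  serial PRIMARY KEY,
    foo text,
    bar integer,
    zen numeric
);

-- One row per adjacent pair: parentID comes immediately before childID.
-- Identifiers are left unquoted throughout so they all fold to lower case
-- and the constraints refer to the same columns the table defines.
CREATE TABLE mainTableRelationships
(
    parentID integer,
    childID  integer,
    CONSTRAINT mainTableRelationships_pkey PRIMARY KEY (parentID, childID),
    -- Each record has at most one child and at most one parent.
    CONSTRAINT parentID_key UNIQUE (parentID),
    CONSTRAINT childID_key UNIQUE (childID),
    CONSTRAINT mainTable_parentID_fkey FOREIGN KEY (parentID)
        REFERENCES mainTable (id),
    CONSTRAINT mainTable_childID_fkey FOREIGN KEY (childID)
        REFERENCES mainTable (id)
);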

Of course, there's really little difference between doing things this
way and ordering by the SERIAL field and numbering the rows appropriately
on output, except that the approach above is far more complex.
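Numbering on output can be sketched with a window function. This assumes PostgreSQL 8.4 or later, where row_number() is available, and the table name device_demo is a hypothetical example rather than part of the schema above.

```sql
-- Hypothetical demo table keyed by an ordinary serial column.
CREATE TEMP TABLE device_demo (
    id   serial PRIMARY KEY,
    name text NOT NULL
);

INSERT INTO device_demo (name)
VALUES ('pump'), ('valve'), ('sensor'), ('gauge');

-- Deleting a row leaves a gap in id (1, 3, 4) ...
DELETE FROM device_demo WHERE name = 'valve';

-- ... but the displayed number is computed at query time, so it always
-- starts at 1 and has no gaps, in order of the stored serial id.
SELECT row_number() OVER (ORDER BY id) AS device_no,
       id,
       name
FROM device_demo
ORDER BY id;
```

Nothing is rewritten when a row is deleted, so no table-wide lock is needed; the stored id stays stable and only the derived device_no shifts.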

Another option would be to create a temporary table ordered correctly,
truncate the existing table, drop the sequence (or change the default
on the primary key), copy the data back, and then re-create the sequence
(or change the default back to nextval) and set its next value to
MAX()+1.  This is rather ugly, however, since you're still forcing the
database to do relationships it doesn't know about, so you technically
violate first normal form by having a multi-valued field (it identifies
both uniqueness and order).
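For illustration only, a rebuild along those lines might look like the sketch below. It is a variant rather than the exact steps listed: instead of dropping the sequence or changing the column default, it supplies the new ids explicitly with row_number() (PostgreSQL 8.4 or later) and then resets the sequence. The table name renumber_demo is hypothetical, and the sketch assumes no other table has a foreign key pointing at it, since TRUNCATE would be refused otherwise.

```sql
-- Hypothetical demo table with a gap in its serial ids.
CREATE TABLE renumber_demo (
    id  serial PRIMARY KEY,
    foo text
);
INSERT INTO renumber_demo (foo) VALUES ('a'), ('b'), ('c'), ('d');
DELETE FROM renumber_demo WHERE foo = 'b';   -- ids are now 1, 3, 4

BEGIN;
-- Keep every other session out while the keys are being rewritten.
LOCK TABLE renumber_demo IN ACCESS EXCLUSIVE MODE;

-- Copy the rows aside, remembering the old id so the order survives.
CREATE TEMP TABLE renumber_copy ON COMMIT DROP AS
    SELECT id AS old_id, foo FROM renumber_demo;

TRUNCATE renumber_demo;

-- Copy back with fresh, gapless ids assigned in the old order.
INSERT INTO renumber_demo (id, foo)
    SELECT row_number() OVER (ORDER BY old_id), foo
    FROM renumber_copy;

-- Make the next nextval() return MAX(id) + 1.
SELECT setval(pg_get_serial_sequence('renumber_demo', 'id'),
              COALESCE((SELECT max(id) FROM renumber_demo), 0) + 1,
              false);
COMMIT;
```

The exclusive lock and full rewrite on every deletion are exactly the cost described above, and any value of id stored outside the table is invalidated by the renumbering.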

--
Brandon Aiken
CS/IT Systems Engineer

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Tom Lane
Sent: Sunday, September 24, 2006 7:31 PM
To: Bob Pawley
Cc: Ragnar; Postgresql
Subject: Re: [GENERAL] serial column

Bob Pawley <rjpawley@shaw.ca> writes:
> I am using the numbers to identify devices.
> If a device is deleted or replaced with another type of device I want the
> numbering to still be sequential.

It sounds to me like you oughtn't be storing these numbers in the
database at all.  You just want to attach them at display time --- they
are certainly utterly meaningless as keys if they can change at any
moment.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings



