Home > mailing lists

Re: [HACKERS] Tree type, how best to impliment? - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: [HACKERS] Tree type, how best to impliment?
Date	November 17, 1998 18:04:32
Msg-id	4058.911343818@sss.pgh.pa.us Whole thread Raw
In response to	Re: [HACKERS] Tree type, how best to impliment? (Terry Mackintosh <terry@terrym.com>)
List	pgsql-hackers

Tree view

Terry Mackintosh <terry@terrym.com> writes:
> On Tue, 17 Nov 1998, Tom Lane wrote:
>> Why is "category, pcatid" unique?  This seems to constrain a parent
>> to have only one child per category value --- is that what you want?

> Yes.
> It is very much like a directory structure, one would not want two
> directories of the same name both off the same point in the file system.

Precisely...

>> If so, why not use the category code as the ID suffix, and not have to
>> bother with maintaining a next-ID counter?

> category is human readable, for display, the id is not, and when deciding
> what the next child's name should be, if not for the next-ID one would
> have to go count all the other records that have the same parent.

Why do you need a next ID at all?  Think directory structure.  Using
your example, this is perfectly valid:
                   Things                  /       \               Big        Small            /       \          Cars
  Boats

The index will work just as well (probably better) with key strings
like "Things/Big/Small" as with key strings like "1.1.2".  Moreover,
you don't need a separate index to enforce uniqueness, and you don't
need to update the parent row when adding a child.

You do need invented IDs if the category items are not necessarily
unique, but it seems your problem does not need that, so why complicate
the concept?

> Yes, I just was not sure how well indexes work with text fields?

I use 'em all the time...

> I had also though about one field only (text), where the categories would
> be all chained together delimited with slashes and have a PRIMARY KEY
> index.  That would automate by design all of the above problems.  But it
> would creat new ones:-)  Like for many deep records, the table would be
> much bigger.

If your higher-level nodes have long category names, then the space
savings from using an ID instead of a category name might become
interesting.  But if you were willing to use a fixed-size char(255)
(in fact two of 'em!) in every tuple for IDs, I don't think you can
complain about the average space cost of this approach...

Another way to approach this is to give each node a unique serial
number, and use the serial number as the child's back-link:
CREATE TABLE tab (    nodeID        int4    serial primary key,    parentID    int4    not null,  -- nodeID of parent
node   category    text    not null,    unique (parentID, category));

(I may not have the "serial" syntax quite right, but you get the idea.)
This approach is very compact, but a row's position in the hierarchy
is represented only indirectly --- you have to chase the parentID links
if you need to build the full pathname given just the nodeID.  You can't
lookup a row directly by its "category/subcategory/subsubcategory" path
either; you need a query for each level in order to fetch the parent
IDs.  But you needed that in your original design too, since there's no
other way to get the numeric IDs for subcategories.

A further space saving is to use the Postgres OIDs as the row
identifiers, but that makes it difficult to dump and reload the table,
so I don't recommend it.
        regards, tom lane

pgsql-hackers by date:

From: David Hartwig
Date: 17 November 1998, 15:58:53
Subject: Re: [HACKERS] PREPARE

From: Tom Lane
Date: 17 November 1998, 18:20:59
Subject: Re: [HACKERS] building 6.4 on sunos 4.1.4

Re: [HACKERS] Tree type, how best to impliment? - Mailing list pgsql-hackers

Previous

Next