Thread: SQL trees and other nonsense...

SQL trees and other nonsense...

From
"Trilobite Trilobite"
Date:
First, If there are any developers listening... Thank you so much for your
work.  I love everything about PG.  It makes the life of a db enthusiast so
sweet (otherwise, it would be an expensive habit).  Also, I'm helping
companies move to a better db and cut their costs.  It's truly one of the
great open source projects out there.  Good - now I got that off my chest.

Now the issue at hand:

I'm playing with tree structures in SQL.  It's kinda cool and definitely a
big hack.

Anyway, there are a few things in our database that are more hierarchal then
they are relational.  The problem I'm working with is accounting, as in,
"accounts > owners equity > expense accounts > rent > shop rent" etc...  I
have no idea how many accounts the end user will set up and I have no idea
about their structure.

In this case I'm making a base table to be inherited from any other tables
that need to store information in a hierarchy.  Also, there is not just one
big branch.  There are many trees each with its own root.  Here is the
definition of the base table "trees":


CREATE TABLE trees
(
    id    SERIAL    PRIMARY KEY,
    root    INTEGER    NULL,
    low    INTEGER    NOT NULL,
    high    INTEGER    NOT NULL,
    CONSTRAINT low_less_then_high CHECK (low < high),
    CONSTRAINT low_grater_then_zero CHECK (low > 0),
    CONSTRAINT high_greater_than_one CHECK (high > 1),
    CONSTRAINT root_must_link_to_id FOREIGN KEY (root) REFERENCES trees ON
DELETE CASCADE ON UPDATE CASCADE
);


db=# SELECT * from trees;

id | root | low | high
----+------+-----+------
1  | NULL |   1 |    8
2  |    1 |   2 |    3
3  |    1 |   4 |    5
4  |    1 |   6 |    7
(4 rows)

Visualization of data:

             1
         (1  N  8)
        /    |    \
       /     |     \
      /      |      \
     2       |       4
(2  1  3)   |   (6  1  7)
             3
         (4  1  5)


Another example where I might use this is in a threaded conversation.  So I
would just create a table for conversations
and accounts and add the additional information needed:


CREATE TABLE conversations
(
    body    TEXT    NOT NULL
)INHERITS(trees);

and...

CREATE TABLE accounts
(
    name    TEXT    NOT NULL,
    opened    DATE    NOT NULL

)INHERITS(trees);


I plan on writing functions to insert nodes into the tree, remove them, drop
the whole branch, etc... because changes to one node might affect the
others.  But, I though I could save myself some function writing if I could
make a self-referencing table.  So if the "root" is deleted or changed FKs
would take care of themselves.  The problem is that it doesn't seem to work
like I thought it would.  So I'm left with things like:


CREATE OR REPLACE FUNCTION forest.prune_branch(INTEGER) RETURNS VOID AS
'
DECLARE
    node_id        ALIAS FOR $1;
    node_root    INTEGER;
    node_low    INTEGER;
    node_high    INTEGER;
    less_count    INTEGER;
BEGIN
    SELECT INTO node_root, node_low, node_high root, low, high
    FROM base.trees
    WHERE id = node_id;

    -- Make sure it is there before moving on...
    IF NOT FOUND THEN
        RAISE EXCEPTION \'Branch not found at %!\', node_id;
        RETURN;
    END IF;

    -- RAISE INFO \'Node = %, Root = %, Low = %, High = %\', node_id,
node_root, node_low, node_high;

    -- If the node is a root then do a quick and dirty delete...
    IF (node_root IS NULL) AND (node_low = 1) THEN
        DELETE FROM base.trees WHERE root = node_id;
        DELETE FROM base.trees WHERE id = node_id;
        RETURN;
    END IF;

    -- Not a root, do it the hard way...
    less_count := (node_high - node_low + 1);

    DELETE FROM base.trees
    WHERE low BETWEEN node_low AND node_high AND root = node_root;

    UPDATE base.trees SET
        low =    CASE WHEN low > node_low
            THEN low - less_count
            ELSE low
        END,
        high =  CASE WHEN high > node_low
            THEN high - less_count
            ELSE high
        END
    WHERE root = node_root OR id = node_root;
    RETURN;
END;
'
LANGUAGE 'plpgsql';

While a function like this is needed if you are deleting a node from the
middle of a branch, it shouldn't be necessary to delete a node where it is a
"root".

So I guess what I'm wondering is:

1) Anything wrong with my approach?
2) Should I forget about the FK constraints on the base table - seeing as
they don't seem to work?
3) If I did go with a collection of functions to maintain the tree, how
would I enforce their use (restrict dels, ins, and ups) without restricting
the functions?
4) What other ideas do you all have?

Thanks in advance for your input and sorry if this is a poorly stated
question but it is different enough from standard SQL that I thought "the
more info the better".

Later,
Me

_________________________________________________________________
Get reliable access on MSN 9 Dial-up. 3 months for the price of 1!
(Limited-time offer)
http://join.msn.com/?page=dept/dialup&pgmarket=en-us&ST=1/go/onm00200361ave/direct/01/


Re: SQL trees and other nonsense...

From
William White
Date:
Trilobite Trilobite wrote:

> Anyway, there are a few things in our database that are more hierarchal
> then they are relational.  The problem I'm working with is accounting,
> as in, "accounts > owners equity > expense accounts > rent > shop rent"
> etc...  I have no idea how many accounts the end user will set up and I
> have no idea about their structure.

Perhaps I'm missing something obvious, but ... my understanding of the
above is that you're saying that some accounts are owners equity
accounts, some owners equity accounts are expense accounts, some expense
accounts are rent, etc. ... and you're trying to describe this sort of
relationship in SQL.  Is this correct?

If so why not just make a "base" relvar called 'accounts', e.g.,

CREATE TABLE account
(
     id SERIAL PRIMARY KEY,
     foo CHAR(64),
     bar CHAR(64)
);

which e.g. might have id entries 1-20,

then extend via relation with a "derived" relvar called
'expense_account', e.g.,

CREATE TABLE expense_account
(
     id INT REFERENCES account(id),
     baz CHAR(64)
);

which e.g. might have entries at ids 1,3,11, and 18.

Then an account t1 is an expense account iff there exists some t2 in
expense_accounts such that t1.id = t2.id.

(Please excuse my SQL, by the way, I've been using CJ Date notation in
almost exclusively for the last month or two)

I wish I could help you with the more general self-referencing issue.
Every time that one's come up for me, I've redesigned to a strictly
relational model and avoided the problem entirely.

-- Bill

Re: SQL trees and other nonsense...

From
David Garamond
Date:
Trilobite Trilobite wrote:
> I'm playing with tree structures in SQL.  It's kinda cool and definitely
> a big hack.
>
> Anyway, there are a few things in our database that are more hierarchal
> then they are relational.  The problem I'm working with is accounting,
> as in, "accounts > owners equity > expense accounts > rent > shop rent"

Since when does Expense become a subcategory of Equity? :)

Btw, trees have been discussed so many times. Peruse the archive and
you'll find many bits of wisdom...

As for meself, currently I tend to use MP (materialized path). It's
space expensive but most of my trees are not very deep and MP makes most
queries easy.

After all, NS (nested set) is a special form of MP anyway.

--
dave