Re: How to represent a tree-structure in a relational database - Mailing list pgsql-sql
From | Tulassay Zsolt |
---|---|
Subject | Re: How to represent a tree-structure in a relational database |
Date | |
Msg-id | Pine.LNX.4.10.10012141346220.10051-100000@tek.tek.bke.hu Whole thread Raw |
In response to | Re: How to represent a tree-structure in a relational database (Josh Berkus <josh@agliodbs.com>) |
Responses |
Re: How to represent a tree-structure in a relational database
|
List | pgsql-sql |
There actually is a model of tree structures in SQL databases which is different from those mentioned earlier in that it represents the tree as nested sets (ie. nodes are subsets of parent sets (parent nodes)). There is a huge advantage in this model as it eliminates the need for recursion. For example, if you want to get a specific node's parents, grandparents and all other parents up the tree, you can do this in _one single_ SELECT query. If you just stored the the id of the parent of each node, you would need (n-1) queries if the node is at the n-th level. However, inserting or deleting nodes is more complex this way since you have to keep the data structure "balanced". But in most cases you won't insert new nodes all the time so you won't suffer from this overhead. And on the other hand, the performance gain on queries might be enormous. You can find the article dealing with this at http://www.utdt.edu/~mig/sql-trees Actually what I did was to implement _both_ models at the same time. I took the nested set structure, and besides stored the parent id of all nodes as well. The benefit of this is that by accepting a little more overhead during inserts, I was able to use the advantages of both models. If somebody is interested in it, I might share the code (a few tables and some plpgsql functions). But the article is surely worth a read. Zsolt ps: I found the link a few months ago in the pgsql-sql archive :-) On Wed, 13 Dec 2000, Josh Berkus wrote: > Stuart, > > > I don't think I'd be comfortable with having the node_level column in the > > table structure. First, because you can derive that value using a function, > > it's duplicate data. Second, if you decide to take an entire segment of your > > hierarchy and move it under another node (by changing the value of > > node_linkup/ParentCategoryID), you'll need to recalculate all of those > > node_level values. And all the node_level values underneath it. > > I can see that. I suppose it depends on the data you're storing. The > project I was working on tracked grocery inventory for a delivery > service, and thus each item had a fixed "level" in the heirarcy (Food > Class, Food Type, Manufacturer, and Item) and thus while items might get > reassigned *across* the heirarcy, they did not get re-assigned *up and > down* the heirarcy. > > Also, I can't think of a way to represent the tree in pure SQL without > having the level identifiers (and a fixed number of levels). > > > We've done a similar thing for Java. It was ridiculously easy to create a > > TreeModel wrapped around this data. Almost too easy; it made me feel dirty. > > Great. Maybe I'll buy it from you if I ever need to use Java :-) > > -Josh > > -- > ______AGLIO DATABASE SOLUTIONS___________________________ > Josh Berkus > Complete information technology josh@agliodbs.com > and data management solutions (415) 436-9166 > for law firms, small businesses fax 436-0137 > and non-profit organizations. pager 338-4078 > San Francisco >