Re: double linked list - Mailing list pgsql-sql
From | jasche@gmx.de (Juergen) |
---|---|
Subject | Re: double linked list |
Date | |
Msg-id | 2ee07259.0301290412.5eb3290a@posting.google.com Whole thread Raw |
In response to | Re: double linked list (71062.1056@compuserve.com (--CELKO--)) |
List | pgsql-sql |
heavy stuff Celko. I would lie if I would pretend I fully understand Your answer. I'll let sink it in. However, I dont store a consistent tree structure. The table at hand is more a kind of a collection of graphs where I want to find all possible paths between a given starting point and a given end point cheers Juergen 71062.1056@compuserve.com (--CELKO--) wrote in message news:<c0d87ec0.0301281554.581797ee@posting.google.com>... > >> I've got a table called 'link_t' containing a collection of seller > - > buyer relations between two parties. << > > That is not a real linked list, but let's ignore bad terminology. One > way to do this is with cursors, but they will take time and trend to > be proprietary. > > Anohter way is to build a tree, with the first seller as the root and > the final buyer as a leaf node. > > The usual example of a tree structure in SQL books is called an > adjacency list model and it looks like this: > > CREATE TABLE OrgChart > (emp CHAR(10) NOT NULL PRIMARY KEY, > boss CHAR(10) DEFAULT NULL REFERENCES OrgChart(emp), > salary DECIMAL(6,2) NOT NULL DEFAULT 100.00); > > OrgChart > emp boss salary > =========================== > 'Albert' 'NULL' 1000.00 > 'Bert' 'Albert' 900.00 > 'Chuck' 'Albert' 900.00 > 'Donna' 'Chuck' 800.00 > 'Eddie' 'Chuck' 700.00 > 'Fred' 'Chuck' 600.00 > > Another way of representing trees is to show them as nested sets. > Since SQL is a set oriented language, this is a better model than the > usual adjacency list approach you see in most text books. Let us > define a simple OrgChart table like this, ignoring the left (lft) and > right (rgt) columns for now. This problem is always given with a > column for the employee and one for his boss in the textbooks. This > table without the lft and rgt columns is called the adjacency list > model, after the graph theory technique of the same name; the pairs of > emps are adjacent to each other. > > CREATE TABLE OrgChart > (emp CHAR(10) NOT NULL PRIMARY KEY, > lft INTEGER NOT NULL UNIQUE CHECK (lft > 0), > rgt INTEGER NOT NULL UNIQUE CHECK (rgt > 1), > CONSTRAINT order_okay CHECK (lft < rgt) ); > > OrgChart > emp lft rgt > ====================== > 'Albert' 1 12 > 'Bert' 2 3 > 'Chuck' 4 11 > 'Donna' 5 6 > 'Eddie' 7 8 > 'Fred' 9 10 > > The organizational chart would look like this as a directed graph: > > Albert (1,12) > / \ > / \ > Bert (2,3) Chuck (4,11) > / | \ > / | \ > / | \ > / | \ > Donna (5,6) Eddie (7,8) Fred (9,10) > > The first table is denormalized in several ways. We are modeling both > the OrgChart and the organizational chart in one table. But for the > sake of saving space, pretend that the names are job titles and that > we have another table which describes the OrgChart that hold those > positions. > > Another problem with the adjacency list model is that the boss and > employee columns are the same kind of thing (i.e. names of OrgChart), > and therefore should be shown in only one column in a normalized > table. To prove that this is not normalized, assume that "Chuck" > changes his name to "Charles"; you have to change his name in both > columns and several places. The defining characteristic of a > normalized table is that you have one fact, one place, one time. > > The final problem is that the adjacency list model does not model > subordination. Authority flows downhill in a hierarchy, but If I fire > Chuck, I disconnect all of his subordinates from Albert. There are > situations (i.e. water pipes) where this is true, but that is not the > expected situation in this case. > > To show a tree as nested sets, replace the emps with ovals, then nest > subordinate ovals inside each other. The root will be the largest oval > and will contain every other emp. The leaf emps will be the innermost > ovals with nothing else inside them and the nesting will show the > hierarchical relationship. The rgt and lft columns (I cannot use the > reserved words LEFT and RIGHT in SQL) are what shows the nesting. > > If that mental model does not work, then imagine a little worm > crawling anti-clockwise along the tree. Every time he gets to the left > or right side of a emp, he numbers it. The worm stops when he gets all > the way around the tree and back to the top. > > This is a natural way to model a parts explosion, since a final > assembly is made of physically nested assemblies that final break down > into separate parts. > > At this point, the boss column is both redundant and denormalized, so > it can be dropped. Also, note that the tree structure can be kept in > one table and all the information about a emp can be put in a second > table and they can be joined on employee number for queries. > > To convert the graph into a nested sets model think of a little worm > crawling along the tree. The worm starts at the top, the root, makes a > complete trip around the tree. When he comes to a emp, he puts a > number in the cell on the side that he is visiting and increments his > counter. Each emp will get two numbers, one of the right side and one > for the left. Computer Science majors will recognize this as a > modified preorder tree traversal algorithm. Finally, drop the unneeded > OrgChart.boss column which used to represent the edges of a graph. > > This has some predictable results that we can use for building > queries. The root is always (left = 1, right = 2 * (SELECT COUNT(*) > FROM TreeTable)); leaf emps always have (left + 1 = right); subtrees > are defined by the BETWEEN predicate; etc. Here are two common queries > which can be used to build others: > > 1. An employee and all their Supervisors, no matter how deep the tree. > > SELECT O2.* > FROM OrgChart AS O1, OrgChart AS O2 > WHERE O1.lft BETWEEN O2.lft AND O2.rgt > AND O1.emp = :myemployee; > > 2. The employee and all subordinates. There is a nice symmetry here. > > SELECT O1.* > FROM OrgChart AS O1, OrgChart AS O2 > WHERE O1.lft BETWEEN O2.lft AND O2.rgt > AND O2.emp = :myemployee; > > 3. Add a GROUP BY and aggregate functions to these basic queries and > you have hierarchical reports. For example, the total salaries which > each > employee controls: > > SELECT O2.emp, SUM(S1.salary) > FROM OrgChart AS O1, OrgChart AS O2, > Salaries AS S1 > WHERE O1.lft BETWEEN O2.lft AND O2.rgt > AND O1.emp = S1.emp > GROUP BY O2.emp; > > 4. To find the level of each emp, so you can print the tree as an > indented listing. Technically, you should declare a cursor to go with > the ORDER BY clause. > > SELECT COUNT(O2.emp) AS indentation, O1.emp > FROM OrgChart AS O1, OrgChart AS O2 > WHERE O1.lft BETWEEN O2.lft AND O2.rgt > GROUP BY O1.lft. O1.emp > ORDER BY O1.lft; > > 5. The nested set model has an implied ordering of siblings which > theadjacency list model does not. To insert a new node, G1, under part > G. We can insert one node at a time like this: > > BEGIN ATOMIC > DECLARE rightmost_spread INTEGER; > > SET rightmost_spread > = (SELECT rgt > FROM Frammis > WHERE part = 'G'); > UPDATE Frammis > SET lft = CASE WHEN lft > rightmost_spread > THEN lft + 2 > ELSE lft END, > rgt = CASE WHEN rgt >= rightmost_spread > THEN rgt + 2 > ELSE rgt END > WHERE rgt >= rightmost_spread; > > INSERT INTO Frammis (part, lft, rgt) > VALUES ('G1', rightmost_spread, (rightmost_spread + 1)); > COMMIT WORK; > END; > > The idea is to spread the lft and rgt numbers after the youngest child > of the parent, G in this case, over by two to make room for the new > addition, G1. This procedure will add the new node to the rightmost > child position, which helps to preserve the idea of an age order among > the siblings. > > 6. To convert a nested sets model into an adjacency list model: > > SELECT B.emp AS boss, P.emp > FROM OrgChart AS P > LEFT OUTER JOIN > OrgChart AS B > ON B.lft > = (SELECT MAX(lft) > FROM OrgChart AS S > WHERE P.lft > S.lft > AND P.lft < S.rgt); > > 7. To convert an adjacency list to a nested set model, use a push down > stack. Here is version with a stack in SQL/PSM. > > -- Tree holds the adjacency model > CREATE TABLE Tree > (node CHAR(10) NOT NULL, > parent CHAR(10)); > > -- Stack starts empty, will holds the nested set model > CREATE TABLE Stack > (stack_top INTEGER NOT NULL, > node CHAR(10) NOT NULL, > lft INTEGER, > rgt INTEGER); > > BEGIN ATOMIC > DECLARE counter INTEGER; > DECLARE max_counter INTEGER; > DECLARE current_top INTEGER; > > SET counter = 2; > SET max_counter = 2 * (SELECT COUNT(*) FROM Tree); > SET current_top = 1; > > --clear the stack > DELETE FROM Stack; > > -- push the root > INSERT INTO Stack > SELECT 1, node, 1, max_counter > FROM Tree > WHERE parent IS NULL; > > -- delete rows from tree as they are used > DELETE FROM Tree WHERE parent IS NULL; > > WHILE counter <= max_counter- 1 > IF EXISTS (SELECT * > FROM Stack AS S1, Tree AS T1 > WHERE S1.node = T1.parent > AND S1.stack_top = current_top) > > BEGIN -- push when top has subordinates and set lft value > INSERT INTO Stack > SELECT (current_top + 1), MIN(T1.node), counter, NULL > FROM Stack AS S1, Tree AS T1 > WHERE S1.node = T1.parent > AND S1.stack_top = current_top; > > -- delete rows from tree as they are used > DELETE FROM Tree > WHERE node = (SELECT node > FROM Stack > WHERE stack_top = current_top + 1); > -- housekeeping of stack pointers and counter > SET counter = counter + 1; > SET current_top = current_top + 1; > END > ELSE > BEGIN -- pop the stack and set rgt value > UPDATE Stack > SET rgt = counter, > stack_top = -stack_top -- pops the stack > WHERE stack_top = current_top > SET counter = counter + 1; > SET current_top = current_top - 1; > END; > END IF; > -- the top column is not needed in the final answer > SELECT node, lft, rgt FROM Stack; > END; > > I have a book on TREES & HIERARCHIES IN SQL coming out in 2003, but in > the meantime, you can look at: > > http://searchdatabase.techtarget.com/tip/1,289483,sid13_gci537290,00.html > > http://searchdatabase.techtarget.com/tip/1,289483,sid13_gci801943,00.html