Thread: how to use recursion to find end nodes of a tree
Hello All, I have been having a really hard time trying to come up with a pl/pgsql recursive function to returns the end nodes of a tree. Here is an example table definition: CREATE TABLE parent_child ( parent_id integer NOT NULL, child_id integer NOT NULL ); INSERT INTO parent_child (parent_id, child_id) VALUES (1, 2); INSERT INTO parent_child (parent_id, child_id) VALUES (1, 3); INSERT INTO parent_child (parent_id, child_id) VALUES (1, 4); INSERT INTO parent_child (parent_id, child_id) VALUES (2, 5); INSERT INTO parent_child (parent_id, child_id) VALUES (2, 6); INSERT INTO parent_child (parent_id, child_id) VALUES (4, 7); INSERT INTO parent_child (parent_id, child_id) VALUES (4, 8); INSERT INTO parent_child (parent_id, child_id) VALUES (4, 9); INSERT INTO parent_child (parent_id, child_id) VALUES (9, 10); This produces the following tree of data: 1 ___|___ | | | 2 3 4 _|_ _|_| | | | |5 6 7 8 9 | 10 I want to create a function that returns the terminating nodes of of this tree below a certain level i.e. if I input 1 to the function I need it to return 5,6,3,7,8,10. If I input 4 to the function I would get 7,8,10. I have written recursive functions which return all nodes on a branch of a tree but I can't think of a way to return the end nodes does anyone know of a solution? Many thanks, Mike
> Hello All, > > I have been having a really hard time trying to come up with a pl/pgsql > recursive function to returns the end nodes of a tree. > Here is an example table definition: > > CREATE TABLE parent_child ( > parent_id integer NOT NULL, > child_id integer NOT NULL > ); > > INSERT INTO parent_child (parent_id, child_id) VALUES (1, 2); > INSERT INTO parent_child (parent_id, child_id) VALUES (1, 3); > INSERT INTO parent_child (parent_id, child_id) VALUES (1, 4); > INSERT INTO parent_child (parent_id, child_id) VALUES (2, 5); > INSERT INTO parent_child (parent_id, child_id) VALUES (2, 6); > INSERT INTO parent_child (parent_id, child_id) VALUES (4, 7); > INSERT INTO parent_child (parent_id, child_id) VALUES (4, 8); > INSERT INTO parent_child (parent_id, child_id) VALUES (4, 9); > INSERT INTO parent_child (parent_id, child_id) VALUES (9, 10); > > This produces the following tree of data: > > 1 > ___|___ > | | | > 2 3 4 > _|_ _|_ > | | | | | > 5 6 7 8 9 > | > 10 > > I want to create a function that returns the terminating nodes of > of this tree below a certain level i.e. if I input 1 to the function > I need it to return 5,6,3,7,8,10. If I input 4 to the function I would > get 7,8,10. I have written recursive functions which return all nodes > on a branch of a tree but I can't think of a way to return the end nodes > does anyone know of a solution? > I haven't programmed in PL/pgSQL in a while, but I'll write some pseudo code. I think the code should be similar: func(int node) { dynamic_array s; dynamic_array leaves; int top, count, leaf_id, popped, child; leaf_id = top = 0; s[top] = node; count = 1; // to a depth first search while(count != 0) { popped = s[top]; top--; count--; foreach(select pc.child_id into child from parent_child pc where pc.parent_id = popped) { select* from parect_child pc where parent_id = child; // a count of zero indicates that child node has no children if(count_of_above_query = 0) { leaves[leaf_id] = child; leaf_id++; } else { // not a leaf, so add it to the stackfor the next time through // the loop top++; s[top] = child; count++; } } } return leaves; } Regards, Yasir
On Mon, 2006-04-10 at 16:09 +0100, mike@mikeandems.com wrote: > Hello All, > > I have been having a really hard time trying to come up with a pl/pgsql > recursive function to returns the end nodes of a tree. > Here is an example table definition: > > CREATE TABLE parent_child ( > parent_id integer NOT NULL, > child_id integer NOT NULL > ); > > INSERT INTO parent_child (parent_id, child_id) VALUES (1, 2); > INSERT INTO parent_child (parent_id, child_id) VALUES (1, 3); > INSERT INTO parent_child (parent_id, child_id) VALUES (1, 4); > INSERT INTO parent_child (parent_id, child_id) VALUES (2, 5); > INSERT INTO parent_child (parent_id, child_id) VALUES (2, 6); > INSERT INTO parent_child (parent_id, child_id) VALUES (4, 7); > INSERT INTO parent_child (parent_id, child_id) VALUES (4, 8); > INSERT INTO parent_child (parent_id, child_id) VALUES (4, 9); > INSERT INTO parent_child (parent_id, child_id) VALUES (9, 10); > What you appear to have is really this, with a missing first node: CREATE TABLE parent_child (parent_id integer NOT NULL,this_node_id integer NULL, ); INSERT INTO parent_child (parent_id, this_node_id) VALUES (0, 1); INSERT INTO parent_child (parent_id, this_node_id) VALUES (1, 2); INSERT INTO parent_child (parent_id, this_node_id) VALUES (1, 3); INSERT INTO parent_child (parent_id, this_node_id) VALUES (1, 4); INSERT INTO parent_child (parent_id, this_node_id) VALUES (2, 5); INSERT INTO parent_child (parent_id, this_node_id) VALUES (2, 6); INSERT INTO parent_child (parent_id, this_node_id) VALUES (4, 7); INSERT INTO parent_child (parent_id, this_node_id) VALUES (4, 8); INSERT INTO parent_child (parent_id, this_node_id) VALUES (4, 9); INSERT INTO parent_child (parent_id, this_node_id) VALUES (9, 10); This makes it easy to search from leaf to root, but not from root to leaf. Without a list of child_ids in each node you must search the whole table for nodes that have the current node id as their parent_id in order to determine if a node is a leaf node or not. Perhaps you can include a child_id[] in each node, or a has_children boolean flag that you set and unset when inserting or deleting rows. But perhaps you can get PostgreSQL to do it for you by setting this_node_id as primary key and parent_id as foreign key referencing this same table. You could then test if it's a leaf node by attempting to change the node's this_node_id to some out-of-range value and see if it produces as error. If no error then it's a leaf node, (then you must restore this_node_id - I would try just negating it for the test, so I don't have to actually store the original value somewhere). > This produces the following tree of data: > > 1 > ___|___ > | | | > 2 3 4 > _|_ _|_ > | | | | | > 5 6 7 8 9 > | > 10 > > I want to create a function that returns the terminating nodes of > of this tree below a certain level i.e. if I input 1 to the function > I need it to return 5,6,3,7,8,10. If I input 4 to the function I would > get 7,8,10. I have written recursive functions which return all nodes > on a branch of a tree but I can't think of a way to return the end nodes > does anyone know of a solution? > > Many thanks, > > Mike > > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org >
Thankyou very much Yasir and Ross for your help and advice. I have created a pl/pgsql version of Yasirs algorithm which works perfectly, I am also looking into improving efficiency by flaging leaf records. Here is my pl/pgsql solution in case it helps anyone out: CREATE OR REPLACE FUNCTION parentchildtest(int4) RETURNS _int4 AS 'DECLARE node ALIAS FOR $1; s INTEGER[]; leaves INTEGER[]; top INTEGER; counter INTEGER; leaf_id INTEGER; popped INTEGER; child RECORD; childCount RECORD; BEGIN leaf_id := 0; top := 0; s := ''{}''; leaves := ''{}''; s[top] := node; counter := 1; -- t a depthfirst search WHILE (counter <> 0) LOOP popped := s[top];top := top - 1;counter := counter - 1; FOR child IN SELECT pc.child_id FROM parent_child AS pc WHERE pc.parent_id = poppedLOOP SELECT INTO childCount COUNT(*)AS count FROM parent_child AS pc WHERE pc.parent_id = child.child_id; --a count of zero indicates that child node has no children IF (childCount.count = 0) THEN leaves[leaf_id]= child.child_id; leaf_id := leaf_id + 1; ELSE -- not a leaf, so add it to the stack for thenext time through -- the loop top := top + 1; s[top] = child.child_id; counter := counter + 1; END IF;END LOOP; ENDLOOP; RETURN leaves; END;' LANGUAGE 'plpgsql' VOLATILE; -----Original Message----- From: Yasir Malik [mailto:ymalik@cs.stevens.edu] Sent: 10 April 2006 17:13 To: mike@mikeandems.com Cc: pgsql-sql@postgresql.org Subject: Re: [SQL] how to use recursion to find end nodes of a tree > Hello All, > > I have been having a really hard time trying to come up with a > pl/pgsql recursive function to returns the end nodes of a tree. Here > is an example table definition: > > CREATE TABLE parent_child ( > parent_id integer NOT NULL, > child_id integer NOT NULL > ); > > INSERT INTO parent_child (parent_id, child_id) VALUES (1, 2); INSERT > INTO parent_child (parent_id, child_id) VALUES (1, 3); INSERT INTO > parent_child (parent_id, child_id) VALUES (1, 4); INSERT INTO > parent_child (parent_id, child_id) VALUES (2, 5); INSERT INTO > parent_child (parent_id, child_id) VALUES (2, 6); INSERT INTO > parent_child (parent_id, child_id) VALUES (4, 7); INSERT INTO > parent_child (parent_id, child_id) VALUES (4, 8); INSERT INTO > parent_child (parent_id, child_id) VALUES (4, 9); INSERT INTO > parent_child (parent_id, child_id) VALUES (9, 10); > > This produces the following tree of data: > > 1 > ___|___ > | | | > 2 3 4 > _|_ _|_ > | | | | | > 5 6 7 8 9 > | > 10 > > I want to create a function that returns the terminating nodes of of > this tree below a certain level i.e. if I input 1 to the function I > need it to return 5,6,3,7,8,10. If I input 4 to the function I would > get 7,8,10. I have written recursive functions which return all nodes > on a branch of a tree but I can't think of a way to return the end > nodes does anyone know of a solution? > I haven't programmed in PL/pgSQL in a while, but I'll write some pseudo code. I think the code should be similar: func(int node) { dynamic_array s; dynamic_array leaves; int top, count, leaf_id, popped, child; leaf_id = top = 0; s[top] = node; count = 1; // to a depth first search while(count != 0) { popped = s[top]; top--; count--; foreach(select pc.child_id into child from parent_child pc where pc.parent_id = popped) { select* from parect_child pc where parent_id = child; // a count of zero indicates that child node has no children if(count_of_above_query = 0) { leaves[leaf_id] = child; leaf_id++; } else { // not a leaf, so add it to the stackfor the next time through // the loop top++; s[top] = child; count++; } } } return leaves; } Regards, Yasir