Thread: Processing Tables containing tree-like data
Hi, I have a table which looks like this: id info parentid 0 <God> 0 1 Adam 0 2 Cain 1 3 Abel 1 4 Seth 1 5 Enosh 4 .... I am looking for a fast and efficient way of finding ALL the descendents of any particular node, to unlimited depth. Is there a standard database trick for doing this efficiently ? Writing a recursive function would be extremely inefficient for repeated queries. Thanks, Netzach
--- psql-novice@netzach.co.il wrote: > > Hi, > > I have a table which looks like this: > > id info parentid > 0 <God> 0 > 1 Adam 0 > 2 Cain 1 > 3 Abel 1 > 4 Seth 1 > 5 Enosh 4 > .... > This model is known as the Adjacency list tree model. there are two other with different strengths and weaknesses. They are Path Enumeration tree model, and the nested set tree model. Path Enumeration and nested set are better suited for this kind of query than the Adjacently list model. The simplest and best preforming queries are probably developed using the nested set model, but this poor choice if you are frequently adding new people to your table. The Path enumeration model doesn't have an insertion problem, however, this model looks very de-normalized. I would imagine that it queries can be slow since it relies on LIKE predicates to find all of the descendants of a particular node. On a side note, I believe that the PostgreSQL developers are working on adding a standard SQL operation that will handle this for your. Perhaps we could expect to see it in the next couple of years. path-enumeration http://www.onlamp.com/pub/a/onlamp/2004/08/05/hierarchical_sql.html nested-set http://www.dbmsmag.com/9604d06.html Regards, Richard Broersma Jr.
On May 29, 2007, at 11:58 , psql-novice@netzach.co.il wrote: > I have a table which looks like this: > > id info parentid > 0 <God> 0 > 1 Adam 0 > 2 Cain 1 > 3 Abel 1 > 4 Seth 1 > 5 Enosh 4 > .... > > > > I am looking for a fast and efficient way of finding ALL the > descendents > of any particular node, to unlimited depth. > > Is there a standard database trick for doing this efficiently ? > Writing > a recursive function would be extremely inefficient for repeated > queries. What you've got there is often called an adjacency list. Check out the connect_by function in the tablefunc contrib module (contrib/ tablefunc). You might also want to google for SQL nested sets to find an alternative method of doing handling trees. Michael Glaesemann grzm seespotcode net
What is the frequency that this data is updated? If the data is static (or if you can get away with running a cron job every now and then), you can write a recursive pl/pgslq function to get level information for each node on the tree and assign a specific "incremental" node_id for each record. Due to the nature of the recursive function, a node_id is assigned to the children of a specific node instead of its siblings. You should end up with data as illustrated below. id info parent_id level node_id 1 Name1 Null 1 1 2 Name2 1 2 2 3 Name3 2 3 3 4 Name4 3 4 4 5 Name5 4 5 5 6 Name5 1 2 6 7 Name6 6 3 7 8 Name7 1 2 8 Then you can simply retrieve the children of node (N) on level (L) with a single statement. SELECT * FROM table WHERE node_id > N AND node_id < (SELECT node_id FROM table WHERE level = L AND node_id > N ORDER BY node_id LIMIT 1 OFFSET 0); Refrain from using MIN() as performance suffers unbelievably. Hope this helps, Burak On 5/29/07, psql-novice@netzach.co.il <psql-novice@netzach.co.il> wrote: > > Hi, > > I have a table which looks like this: > > id info parentid > 0 <God> 0 > 1 Adam 0 > 2 Cain 1 > 3 Abel 1 > 4 Seth 1 > 5 Enosh 4 > .... > > > > I am looking for a fast and efficient way of finding ALL the descendents > of any particular node, to unlimited depth. > > Is there a standard database trick for doing this efficiently ? Writing > a recursive function would be extremely inefficient for repeated > queries. > > Thanks, > > > Netzach > > ---------------------------(end of broadcast)--------------------------- > TIP 9: In versions below 8.0, the planner will ignore your desire to > choose an index scan if your joining column's datatypes do not > match >
Actually, you can even use this approach for frequently updated data. Here is the pseudo-code for a pl/pgsql function that adds a new node to the tree. * SELECT MAX(node_id) FROM table WHERE parent_id = node_parent * UPDATE table SET node_id=node_id+1 WHERE node_id > max_node_id * INSERT new node with node_id = max_node_id+1 and parent_id = node_parent Inserting new nodes into the chain would be more complex, but not impossible. Burak On 5/31/07, Burak Seydioglu <buraks78@gmail.com> wrote: > What is the frequency that this data is updated? > > If the data is static (or if you can get away with running a cron job > every now and then), you can write a recursive pl/pgslq function to > get level information for each node on the tree and assign a specific > "incremental" node_id for each record. Due to the nature of the > recursive function, a node_id is assigned to the children of a > specific node instead of its siblings. You should end up with data as > illustrated below. > > id info parent_id level node_id > 1 Name1 Null 1 1 > 2 Name2 1 2 2 > 3 Name3 2 3 3 > 4 Name4 3 4 4 > 5 Name5 4 5 5 > 6 Name5 1 2 6 > 7 Name6 6 3 7 > 8 Name7 1 2 8 > > Then you can simply retrieve the children of node (N) on level (L) > with a single statement. > > SELECT * FROM table WHERE node_id > N AND node_id < (SELECT node_id > FROM table WHERE level = L AND node_id > N ORDER BY node_id LIMIT 1 > OFFSET 0); > > Refrain from using MIN() as performance suffers unbelievably. > > Hope this helps, > > Burak > > > On 5/29/07, psql-novice@netzach.co.il <psql-novice@netzach.co.il> wrote: > > > > Hi, > > > > I have a table which looks like this: > > > > id info parentid > > 0 <God> 0 > > 1 Adam 0 > > 2 Cain 1 > > 3 Abel 1 > > 4 Seth 1 > > 5 Enosh 4 > > .... > > > > > > > > I am looking for a fast and efficient way of finding ALL the descendents > > of any particular node, to unlimited depth. > > > > Is there a standard database trick for doing this efficiently ? Writing > > a recursive function would be extremely inefficient for repeated > > queries. > > > > Thanks, > > > > > > Netzach > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 9: In versions below 8.0, the planner will ignore your desire to > > choose an index scan if your joining column's datatypes do not > > match > > >